* [PATCH 00/12] drm/nouveau: support for GK20A, cont'd
@ 2014-03-24  8:42 ` Alexandre Courbot
  0 siblings, 0 replies; 100+ messages in thread
From: Alexandre Courbot @ 2014-03-24  8:42 UTC (permalink / raw)
  To: Ben Skeggs
  Cc: Thierry Reding, nouveau, dri-devel, linux-tegra, linux-kernel,
	gnurou, Alexandre Courbot

Hi everyone,

Here is the second batch of patches to add GK20A support to Nouveau. This time
we are adding the actual chip support, and this series brings the driver to a
point where a slightly-tweaked Mesa successfully runs shaders and renders
triangles on GBM! Many thanks to Thierry Reding and the people on the
#nouveau IRC channel for their help without which we would not have reached
this milestone.

A few lines of hacks (not included here) are still needed to deal with cached
mappings triggering external aborts and CPU/GPU memory coherency issues, but I
hope to understand and address these issues next.

Most of the changes below have already been seen (and sometimes reviewed) in an
earlier patchset. What has been added is proper PGRAPH support (still needing
an external firmware and mostly reusing NVE4's code) as well as a better RAM
implementation.

How to represent and manage VRAM has been the hardest part to deal with, since
GK20A shares the system memory with the CPU without any kind of partition. I
have tried various approaches (including some in which no RAM object ever gets
instantiated) and finally decided to go for one using DMA-contiguous memory
allocations and relying on BAR mappings for kernel access and exposure to
user-space, as it fits better with existing code and keeps us safe from most of
the CPU/GPU memory coherency issues (at the cost of some performance).

Looking forward to your review of these few patches! :)

Cheers,
Alex.

Alexandre Courbot (12):
  drm/nouveau: fix missing newline
  drm/nouveau/timer: skip calibration on GK20A
  drm/nouveau/bar: only ioremap BAR3 if it exists
  drm/nouveau/bar/nvc0: support chips without BAR3
  drm/nouveau/fifo: add GK20A support
  drm/nouveau/ibus: add GK20A support
  drm/nouveau/fb: add GK20A support
  drm/nouveau/graph: enable when using external firmware
  drm/nouveau/graph: pad firmware code at load time
  drm/nouveau/graph: add GK20A support
  drm/nouveau: support GK20A in nouveau_accel_init()
  drm/nouveau: support for probing GK20A

 drivers/gpu/drm/nouveau/Makefile                   |   5 +
 drivers/gpu/drm/nouveau/core/engine/device/nve0.c  |  20 +++
 drivers/gpu/drm/nouveau/core/engine/fifo/nve0.h    |   1 +
 drivers/gpu/drm/nouveau/core/engine/fifo/nvea.c    |  35 +++++
 .../gpu/drm/nouveau/core/engine/graph/ctxnve4.c    |   4 +-
 drivers/gpu/drm/nouveau/core/engine/graph/nvc0.c   |  12 +-
 drivers/gpu/drm/nouveau/core/engine/graph/nvc0.h   |   9 ++
 drivers/gpu/drm/nouveau/core/engine/graph/nve4.c   |   2 +-
 drivers/gpu/drm/nouveau/core/engine/graph/nvea.c   |  75 +++++++++
 drivers/gpu/drm/nouveau/core/include/engine/fifo.h |   1 +
 .../gpu/drm/nouveau/core/include/engine/graph.h    |   1 +
 drivers/gpu/drm/nouveau/core/include/subdev/fb.h   |   1 +
 drivers/gpu/drm/nouveau/core/include/subdev/ibus.h |   1 +
 drivers/gpu/drm/nouveau/core/subdev/bar/base.c     |   7 +-
 drivers/gpu/drm/nouveau/core/subdev/bar/nvc0.c     | 101 +++++++------
 drivers/gpu/drm/nouveau/core/subdev/fb/nvea.c      |  56 +++++++
 drivers/gpu/drm/nouveau/core/subdev/fb/priv.h      |   1 +
 drivers/gpu/drm/nouveau/core/subdev/fb/ramnvea.c   | 168 +++++++++++++++++++++
 drivers/gpu/drm/nouveau/core/subdev/ibus/nvea.c    | 110 ++++++++++++++
 drivers/gpu/drm/nouveau/core/subdev/timer/nv04.c   |  19 ++-
 drivers/gpu/drm/nouveau/nouveau_drm.c              |  12 +-
 21 files changed, 578 insertions(+), 63 deletions(-)
 create mode 100644 drivers/gpu/drm/nouveau/core/engine/fifo/nvea.c
 create mode 100644 drivers/gpu/drm/nouveau/core/engine/graph/nvea.c
 create mode 100644 drivers/gpu/drm/nouveau/core/subdev/fb/nvea.c
 create mode 100644 drivers/gpu/drm/nouveau/core/subdev/fb/ramnvea.c
 create mode 100644 drivers/gpu/drm/nouveau/core/subdev/ibus/nvea.c

-- 
1.9.1

^ permalink raw reply	[flat|nested] 100+ messages in thread

* [PATCH 01/12] drm/nouveau: fix missing newline
  2014-03-24  8:42 ` Alexandre Courbot
@ 2014-03-24  8:42     ` Alexandre Courbot
  -1 siblings, 0 replies; 100+ messages in thread
From: Alexandre Courbot @ 2014-03-24  8:42 UTC (permalink / raw)
  To: Ben Skeggs
  Cc: nouveau, linux-kernel, dri-devel, linux-tegra

Add a missing newline at the end of a DRM_INFO message.

Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
---
 drivers/gpu/drm/nouveau/nouveau_drm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_drm.c b/drivers/gpu/drm/nouveau/nouveau_drm.c
index b2a674531fba..ef27949057c3 100644
--- a/drivers/gpu/drm/nouveau/nouveau_drm.c
+++ b/drivers/gpu/drm/nouveau/nouveau_drm.c
@@ -322,7 +322,7 @@ nouveau_get_hdmi_dev(struct drm_device *dev)
 	struct pci_dev *pdev = dev->pdev;
 
 	if (!pdev) {
-		DRM_INFO("not a PCI device; no HDMI");
+		DRM_INFO("not a PCI device; no HDMI\n");
 		drm->hdmi_device = NULL;
 		return;
 	}
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 100+ messages in thread

* [PATCH 02/12] drm/nouveau/timer: skip calibration on GK20A
  2014-03-24  8:42 ` Alexandre Courbot
@ 2014-03-24  8:42     ` Alexandre Courbot
  -1 siblings, 0 replies; 100+ messages in thread
From: Alexandre Courbot @ 2014-03-24  8:42 UTC (permalink / raw)
  To: Ben Skeggs
  Cc: nouveau, linux-kernel, dri-devel, linux-tegra

GK20A's timer is directly attached to the system timer and cannot be
calibrated. Skip the calibration phase on that chip since the
corresponding registers do not exist.

Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
---
 drivers/gpu/drm/nouveau/core/subdev/timer/nv04.c | 19 +++++++++++++------
 1 file changed, 13 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/core/subdev/timer/nv04.c b/drivers/gpu/drm/nouveau/core/subdev/timer/nv04.c
index c0bdd10358d7..822fe0d8a871 100644
--- a/drivers/gpu/drm/nouveau/core/subdev/timer/nv04.c
+++ b/drivers/gpu/drm/nouveau/core/subdev/timer/nv04.c
@@ -185,6 +185,10 @@ nv04_timer_init(struct nouveau_object *object)
 	if (ret)
 		return ret;
 
+	/* gk20a does not have the calibration registers */
+	if (device->chipset == 0xea)
+		goto skip_clk_init;
+
 	/* aim for 31.25MHz, which gives us nanosecond timestamps */
 	d = 1000000 / 32;
 
@@ -235,20 +239,23 @@ nv04_timer_init(struct nouveau_object *object)
 		d >>= 1;
 	}
 
-	/* restore the time before suspend */
-	lo = priv->suspend_time;
-	hi = (priv->suspend_time >> 32);
-
 	nv_debug(priv, "input frequency : %dHz\n", f);
 	nv_debug(priv, "input multiplier: %d\n", m);
 	nv_debug(priv, "numerator       : 0x%08x\n", n);
 	nv_debug(priv, "denominator     : 0x%08x\n", d);
 	nv_debug(priv, "timer frequency : %dHz\n", (f * m) * d / n);
-	nv_debug(priv, "time low        : 0x%08x\n", lo);
-	nv_debug(priv, "time high       : 0x%08x\n", hi);
 
 	nv_wr32(priv, NV04_PTIMER_NUMERATOR, n);
 	nv_wr32(priv, NV04_PTIMER_DENOMINATOR, d);
+
+skip_clk_init:
+	/* restore the time before suspend */
+	lo = priv->suspend_time;
+	hi = (priv->suspend_time >> 32);
+
+	nv_debug(priv, "time low        : 0x%08x\n", lo);
+	nv_debug(priv, "time high       : 0x%08x\n", hi);
+
 	nv_wr32(priv, NV04_PTIMER_INTR_0, 0xffffffff);
 	nv_wr32(priv, NV04_PTIMER_INTR_EN_0, 0x00000000);
 	nv_wr32(priv, NV04_PTIMER_TIME_1, hi);
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 100+ messages in thread
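
For readers who want to see the control flow of patch 02 outside the kernel, here is a minimal userspace sketch. It is not nouveau code: `struct fake_timer`, `fake_timer_init()` and the placeholder calibration values are hypothetical names made up for illustration. Only the `0xea` chipset test, the `skip_clk_init` label and the "restore the time before suspend" tail are taken from the diff above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the PTIMER state; not a nouveau structure. */
struct fake_timer {
	uint32_t numerator;
	uint32_t denominator;
	uint32_t time_lo;
	uint32_t time_hi;
	bool calibrated;       /* true once numerator/denominator were written */
	uint64_t suspend_time; /* timestamp saved before suspend */
};

#define CHIPSET_GK20A 0xea /* chipset id tested in the patch */

/* Calibrate unless the chip is GK20A; always restore the saved time. */
static void fake_timer_init(struct fake_timer *t, uint32_t chipset)
{
	assert(t != NULL);

	if (chipset == CHIPSET_GK20A)
		goto skip_clk_init;

	/* Placeholder values; the real driver computes them from the clock. */
	t->numerator = 1;
	t->denominator = 1;
	t->calibrated = true;

skip_clk_init:
	/* Common tail, reached on every chip: restore the time before suspend. */
	t->time_lo = (uint32_t)t->suspend_time;
	t->time_hi = (uint32_t)(t->suspend_time >> 32);
}
```

The point of moving the `lo`/`hi` assignments below the label is visible here: on the GK20A path they would otherwise be skipped along with the calibration, and the time registers would be written with uninitialised values.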

* [PATCH 03/12] drm/nouveau/bar: only ioremap BAR3 if it exists
  2014-03-24  8:42 ` Alexandre Courbot
@ 2014-03-24  8:42   ` Alexandre Courbot
  -1 siblings, 0 replies; 100+ messages in thread
From: Alexandre Courbot @ 2014-03-24  8:42 UTC (permalink / raw)
  To: Ben Skeggs; +Cc: gnurou, nouveau, linux-kernel, dri-devel, linux-tegra

Some chips that use system memory exclusively (e.g. GK20A) do not
expose 2 BAR regions. For them only BAR1 exists, and it should be used
for USERD mapping. Do not map BAR3 if its resource does not exist.

Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
---
 drivers/gpu/drm/nouveau/core/subdev/bar/base.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/core/subdev/bar/base.c b/drivers/gpu/drm/nouveau/core/subdev/bar/base.c
index bdf594116f3f..d713eeb75b13 100644
--- a/drivers/gpu/drm/nouveau/core/subdev/bar/base.c
+++ b/drivers/gpu/drm/nouveau/core/subdev/bar/base.c
@@ -110,6 +110,7 @@ nouveau_bar_create_(struct nouveau_object *parent,
 {
 	struct nouveau_device *device = nv_device(parent);
 	struct nouveau_bar *bar;
+	bool has_bar3 = nv_device_resource_len(device, 3) != 0;
 	int ret;
 
 	ret = nouveau_subdev_create_(parent, engine, oclass, 0, "BARCTL",
@@ -118,8 +119,10 @@ nouveau_bar_create_(struct nouveau_object *parent,
 	if (ret)
 		return ret;
 
-	bar->iomem = ioremap(nv_device_resource_start(device, 3),
-			     nv_device_resource_len(device, 3));
+	if (has_bar3)
+		bar->iomem = ioremap(nv_device_resource_start(device, 3),
+				     nv_device_resource_len(device, 3));
+
 	return 0;
 }
 
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 100+ messages in thread
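
The guard added by patch 03 can be modelled in a few lines of plain C. This is only a sketch under assumptions: `fake_device`, `fake_resource_len()` and `fake_bar_create()` are hypothetical stand-ins, and the "mapping" is just a pointer to the resource entry rather than a real `ioremap()`. What it keeps from the diff is the test itself: BAR3 is treated as present only when its resource length is non-zero.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical resource table; the real driver queries the bus instead. */
struct fake_resource {
	uint64_t start;
	uint64_t len; /* 0 means the BAR does not exist */
};

struct fake_device {
	struct fake_resource res[6];
};

struct fake_bar {
	const struct fake_resource *iomem; /* NULL when BAR3 is absent */
};

static uint64_t fake_resource_len(const struct fake_device *dev, unsigned int bar)
{
	return bar < 6 ? dev->res[bar].len : 0;
}

/* Map BAR3 only if it exists, mirroring the has_bar3 check in the patch. */
static void fake_bar_create(struct fake_bar *bar, const struct fake_device *dev)
{
	bool has_bar3;

	assert(bar != NULL && dev != NULL);
	has_bar3 = fake_resource_len(dev, 3) != 0;

	/* Cleared explicitly here; this sketch has no zeroing allocator. */
	bar->iomem = NULL;
	if (has_bar3)
		bar->iomem = &dev->res[3];
}
```

Any later user of `iomem` then has to tolerate a NULL value on chips such as GK20A, which is what the next patch in the series deals with for the NVC0 BAR code.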

* [PATCH 04/12] drm/nouveau/bar/nvc0: support chips without BAR3
  2014-03-24  8:42 ` Alexandre Courbot
@ 2014-03-24  8:42   ` Alexandre Courbot
  -1 siblings, 0 replies; 100+ messages in thread
From: Alexandre Courbot @ 2014-03-24  8:42 UTC (permalink / raw)
  To: Ben Skeggs; +Cc: gnurou, nouveau, linux-kernel, dri-devel, linux-tegra

Adapt the NVC0 BAR driver to make it able to support chips that do not
expose a BAR3. When this happens, BAR1 is used for USERD mapping and the
BAR alloc() function is disabled, so GPU objects cannot rely on BAR for
data access and fall back to PRAMIN.

Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
---
 drivers/gpu/drm/nouveau/core/subdev/bar/nvc0.c | 101 +++++++++++++------------
 1 file changed, 52 insertions(+), 49 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/core/subdev/bar/nvc0.c b/drivers/gpu/drm/nouveau/core/subdev/bar/nvc0.c
index 3f30db62e656..5da1b9447af0 100644
--- a/drivers/gpu/drm/nouveau/core/subdev/bar/nvc0.c
+++ b/drivers/gpu/drm/nouveau/core/subdev/bar/nvc0.c
@@ -79,87 +79,88 @@ nvc0_bar_unmap(struct nouveau_bar *bar, struct nouveau_vma *vma)
 }
 
 static int
-nvc0_bar_ctor(struct nouveau_object *parent, struct nouveau_object *engine,
-	      struct nouveau_oclass *oclass, void *data, u32 size,
-	      struct nouveau_object **pobject)
+nvc0_bar_init_vm(struct nvc0_bar_priv *priv, int nr, int bar)
 {
-	struct nouveau_device *device = nv_device(parent);
-	struct nvc0_bar_priv *priv;
+	struct nouveau_device *device = nv_device(&priv->base);
 	struct nouveau_gpuobj *mem;
 	struct nouveau_vm *vm;
+	resource_size_t bar_len;
 	int ret;
 
-	ret = nouveau_bar_create(parent, engine, oclass, &priv);
-	*pobject = nv_object(priv);
-	if (ret)
-		return ret;
-
-	/* BAR3 */
 	ret = nouveau_gpuobj_new(nv_object(priv), NULL, 0x1000, 0, 0,
-				&priv->bar[0].mem);
-	mem = priv->bar[0].mem;
+				&priv->bar[nr].mem);
+	mem = priv->bar[nr].mem;
 	if (ret)
 		return ret;
 
 	ret = nouveau_gpuobj_new(nv_object(priv), NULL, 0x8000, 0, 0,
-				&priv->bar[0].pgd);
+				&priv->bar[nr].pgd);
 	if (ret)
 		return ret;
 
-	ret = nouveau_vm_new(device, 0, nv_device_resource_len(device, 3), 0, &vm);
+	bar_len = nv_device_resource_len(device, bar);
+
+	ret = nouveau_vm_new(device, 0, bar_len, 0, &vm);
 	if (ret)
 		return ret;
 
 	atomic_inc(&vm->engref[NVDEV_SUBDEV_BAR]);
 
-	ret = nouveau_gpuobj_new(nv_object(priv), NULL,
-				 (nv_device_resource_len(device, 3) >> 12) * 8,
-				 0x1000, NVOBJ_FLAG_ZERO_ALLOC,
-				 &vm->pgt[0].obj[0]);
-	vm->pgt[0].refcount[0] = 1;
-	if (ret)
-		return ret;
+	/*
+	 * Bootstrap page table lookup.
+	 */
+	if (bar == 3) {
+		ret = nouveau_gpuobj_new(nv_object(priv), NULL,
+					 (bar_len >> 12) * 8, 0x1000,
+					 NVOBJ_FLAG_ZERO_ALLOC,
+					&vm->pgt[0].obj[0]);
+		vm->pgt[0].refcount[0] = 1;
+		if (ret)
+			return ret;
+	}
 
-	ret = nouveau_vm_ref(vm, &priv->bar[0].vm, priv->bar[0].pgd);
+	ret = nouveau_vm_ref(vm, &priv->bar[nr].vm, priv->bar[nr].pgd);
 	nouveau_vm_ref(NULL, &vm, NULL);
 	if (ret)
 		return ret;
 
-	nv_wo32(mem, 0x0200, lower_32_bits(priv->bar[0].pgd->addr));
-	nv_wo32(mem, 0x0204, upper_32_bits(priv->bar[0].pgd->addr));
-	nv_wo32(mem, 0x0208, lower_32_bits(nv_device_resource_len(device, 3) - 1));
-	nv_wo32(mem, 0x020c, upper_32_bits(nv_device_resource_len(device, 3) - 1));
+	nv_wo32(mem, 0x0200, lower_32_bits(priv->bar[nr].pgd->addr));
+	nv_wo32(mem, 0x0204, upper_32_bits(priv->bar[nr].pgd->addr));
+	nv_wo32(mem, 0x0208, lower_32_bits(bar_len - 1));
+	nv_wo32(mem, 0x020c, upper_32_bits(bar_len - 1));
 
-	/* BAR1 */
-	ret = nouveau_gpuobj_new(nv_object(priv), NULL, 0x1000, 0, 0,
-				&priv->bar[1].mem);
-	mem = priv->bar[1].mem;
-	if (ret)
-		return ret;
+	return 0;
+}
 
-	ret = nouveau_gpuobj_new(nv_object(priv), NULL, 0x8000, 0, 0,
-				&priv->bar[1].pgd);
-	if (ret)
-		return ret;
+static int
+nvc0_bar_ctor(struct nouveau_object *parent, struct nouveau_object *engine,
+	      struct nouveau_oclass *oclass, void *data, u32 size,
+	      struct nouveau_object **pobject)
+{
+	struct nouveau_device *device = nv_device(parent);
+	struct nvc0_bar_priv *priv;
+	bool has_bar3 = nv_device_resource_len(device, 3) != 0;
+	int ret;
 
-	ret = nouveau_vm_new(device, 0, nv_device_resource_len(device, 1), 0, &vm);
+	ret = nouveau_bar_create(parent, engine, oclass, &priv);
+	*pobject = nv_object(priv);
 	if (ret)
 		return ret;
 
-	atomic_inc(&vm->engref[NVDEV_SUBDEV_BAR]);
+	/* BAR3 */
+	if (has_bar3) {
+		ret = nvc0_bar_init_vm(priv, 0, 3);
+		if (ret)
+			return ret;
+		priv->base.alloc = nouveau_bar_alloc;
+		priv->base.kmap = nvc0_bar_kmap;
+	}
 
-	ret = nouveau_vm_ref(vm, &priv->bar[1].vm, priv->bar[1].pgd);
-	nouveau_vm_ref(NULL, &vm, NULL);
+	/* BAR1 */
+	ret = nvc0_bar_init_vm(priv, 1, 1);
 	if (ret)
 		return ret;
 
-	nv_wo32(mem, 0x0200, lower_32_bits(priv->bar[1].pgd->addr));
-	nv_wo32(mem, 0x0204, upper_32_bits(priv->bar[1].pgd->addr));
-	nv_wo32(mem, 0x0208, lower_32_bits(nv_device_resource_len(device, 1) - 1));
-	nv_wo32(mem, 0x020c, upper_32_bits(nv_device_resource_len(device, 1) - 1));
-
-	priv->base.alloc = nouveau_bar_alloc;
-	priv->base.kmap = nvc0_bar_kmap;
 	priv->base.umap = nvc0_bar_umap;
 	priv->base.unmap = nvc0_bar_unmap;
 	priv->base.flush = nv84_bar_flush;
@@ -201,7 +202,9 @@ nvc0_bar_init(struct nouveau_object *object)
 	nv_mask(priv, 0x100c80, 0x00000001, 0x00000000);
 
 	nv_wr32(priv, 0x001704, 0x80000000 | priv->bar[1].mem->addr >> 12);
-	nv_wr32(priv, 0x001714, 0xc0000000 | priv->bar[0].mem->addr >> 12);
+	if (priv->bar[0].mem)
+		nv_wr32(priv, 0x001714,
+			0xc0000000 | priv->bar[0].mem->addr >> 12);
 	return 0;
 }
 
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 100+ messages in thread
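
The shape of the patch 04 refactoring is easier to see with the VM setup stripped out. The sketch below is a userspace model, not driver code: `fake_bar_priv`, `fake_init_vm()`, `stub_op()` and `fake_can_use_bar()` are hypothetical names, and the last one only illustrates how a caller might detect the missing `alloc` hook; the actual PRAMIN fallback is not part of this diff. The slot numbering (index 0 for BAR3, index 1 for BAR1) and the rule that `alloc` and `kmap` are installed only when BAR3 exists follow the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-BAR state; the real code holds gpuobj and VM pointers. */
struct fake_bar_slot {
	bool initialised;
	int pci_bar; /* which PCI BAR backs this slot */
};

struct fake_bar_priv {
	struct fake_bar_slot bar[2]; /* [0] = BAR3, [1] = BAR1, as in the patch */
	int (*alloc)(void);          /* left NULL when there is no BAR3 */
	int (*kmap)(void);           /* left NULL when there is no BAR3 */
	int (*umap)(void);           /* always installed */
};

static int stub_op(void)
{
	return 0;
}

/* One helper for both BARs, replacing two copies of the same sequence. */
static int fake_init_vm(struct fake_bar_priv *priv, int nr, int pci_bar)
{
	assert(priv != NULL && nr >= 0 && nr < 2);
	priv->bar[nr].initialised = true;
	priv->bar[nr].pci_bar = pci_bar;
	return 0;
}

static int fake_bar_ctor(struct fake_bar_priv *priv, bool has_bar3)
{
	int ret;

	/* BAR3: optional, and the only BAR that provides alloc/kmap. */
	if (has_bar3) {
		ret = fake_init_vm(priv, 0, 3);
		if (ret)
			return ret;
		priv->alloc = stub_op;
		priv->kmap = stub_op;
	}

	/* BAR1: always present. */
	ret = fake_init_vm(priv, 1, 1);
	if (ret)
		return ret;

	priv->umap = stub_op;
	return 0;
}

/* Caller-side check (assumed): without alloc, use another access path. */
static bool fake_can_use_bar(const struct fake_bar_priv *priv)
{
	return priv->alloc != NULL;
}
```

With `has_bar3` false, slot 0 stays untouched, which is why the `nvc0_bar_init()` hunk in the diff tests `priv->bar[0].mem` before writing register `0x001714`.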

* [PATCH 05/12] drm/nouveau/fifo: add GK20A support
  2014-03-24  8:42 ` Alexandre Courbot
@ 2014-03-24  8:42   ` Alexandre Courbot
  -1 siblings, 0 replies; 100+ messages in thread
From: Alexandre Courbot @ 2014-03-24  8:42 UTC (permalink / raw)
  To: Ben Skeggs; +Cc: gnurou, nouveau, linux-kernel, dri-devel, linux-tegra

GK20A's FIFO is compatible with NVE0, but only features 128 channels and
1 runlist.

Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
---
 drivers/gpu/drm/nouveau/Makefile                   |  1 +
 drivers/gpu/drm/nouveau/core/engine/fifo/nve0.h    |  1 +
 drivers/gpu/drm/nouveau/core/engine/fifo/nvea.c    | 35 ++++++++++++++++++++++
 drivers/gpu/drm/nouveau/core/include/engine/fifo.h |  1 +
 4 files changed, 38 insertions(+)
 create mode 100644 drivers/gpu/drm/nouveau/core/engine/fifo/nvea.c

diff --git a/drivers/gpu/drm/nouveau/Makefile b/drivers/gpu/drm/nouveau/Makefile
index d310c195bdfe..a90087bbdf88 100644
--- a/drivers/gpu/drm/nouveau/Makefile
+++ b/drivers/gpu/drm/nouveau/Makefile
@@ -237,6 +237,7 @@ nouveau-y += core/engine/fifo/nv50.o
 nouveau-y += core/engine/fifo/nv84.o
 nouveau-y += core/engine/fifo/nvc0.o
 nouveau-y += core/engine/fifo/nve0.o
+nouveau-y += core/engine/fifo/nvea.o
 nouveau-y += core/engine/fifo/nv108.o
 nouveau-y += core/engine/graph/ctxnv40.o
 nouveau-y += core/engine/graph/ctxnv50.o
diff --git a/drivers/gpu/drm/nouveau/core/engine/fifo/nve0.h b/drivers/gpu/drm/nouveau/core/engine/fifo/nve0.h
index 014344ebee66..e96b32bb1bbc 100644
--- a/drivers/gpu/drm/nouveau/core/engine/fifo/nve0.h
+++ b/drivers/gpu/drm/nouveau/core/engine/fifo/nve0.h
@@ -8,6 +8,7 @@ int  nve0_fifo_ctor(struct nouveau_object *, struct nouveau_object *,
 		    struct nouveau_object **);
 void nve0_fifo_dtor(struct nouveau_object *);
 int  nve0_fifo_init(struct nouveau_object *);
+int  nve0_fifo_fini(struct nouveau_object *, bool);
 
 struct nve0_fifo_impl {
 	struct nouveau_oclass base;
diff --git a/drivers/gpu/drm/nouveau/core/engine/fifo/nvea.c b/drivers/gpu/drm/nouveau/core/engine/fifo/nvea.c
new file mode 100644
index 000000000000..5e3a9514df5d
--- /dev/null
+++ b/drivers/gpu/drm/nouveau/core/engine/fifo/nvea.c
@@ -0,0 +1,35 @@
+/*
+ * Copyright (c) 2014, NVIDIA CORPORATION. All rights reserved.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ */
+
+#include "nve0.h"
+
+struct nouveau_oclass *
+nvea_fifo_oclass = &(struct nve0_fifo_impl) {
+	.base.handle = NV_ENGINE(FIFO, 0xea),
+	.base.ofuncs = &(struct nouveau_ofuncs) {
+		.ctor = nve0_fifo_ctor,
+		.dtor = nve0_fifo_dtor,
+		.init = nve0_fifo_init,
+		.fini = nve0_fifo_fini,
+	},
+	.channels = 128,
+}.base;
diff --git a/drivers/gpu/drm/nouveau/core/include/engine/fifo.h b/drivers/gpu/drm/nouveau/core/include/engine/fifo.h
index 26b6b2bb1112..823356f45137 100644
--- a/drivers/gpu/drm/nouveau/core/include/engine/fifo.h
+++ b/drivers/gpu/drm/nouveau/core/include/engine/fifo.h
@@ -109,6 +109,7 @@ extern struct nouveau_oclass *nv50_fifo_oclass;
 extern struct nouveau_oclass *nv84_fifo_oclass;
 extern struct nouveau_oclass *nvc0_fifo_oclass;
 extern struct nouveau_oclass *nve0_fifo_oclass;
+extern struct nouveau_oclass *nvea_fifo_oclass;
 extern struct nouveau_oclass *nv108_fifo_oclass;
 
 void nv04_fifo_intr(struct nouveau_subdev *);
-- 
1.9.1
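
Note on the patch above: nvea_fifo_oclass is built from a file-scope compound literal of the larger "impl" structure, and only the address of its embedded base member is exported. The sketch below shows that pattern in isolation, under simplifying assumptions. base_class, fifo_impl, to_fifo_impl and fifo_channels are hypothetical names, and the handle is a plain number rather than the result of NV_ENGINE().

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the generic class and the impl. */
struct base_class {
	unsigned int handle;
};

struct fifo_impl {
	struct base_class base;	/* generic part, handed to the core */
	unsigned int channels;	/* chip-specific parameter */
};

/*
 * A compound literal at file scope has static storage duration, so the
 * address of its .base member is a valid constant initializer.
 */
static struct base_class *small_fifo_class = &(struct fifo_impl) {
	.base.handle = 0xea,
	.channels = 128,
}.base;

/* Recover the enclosing implementation from the generic pointer. */
static const struct fifo_impl *to_fifo_impl(const struct base_class *base)
{
	return (const struct fifo_impl *)
		((const char *)base - offsetof(struct fifo_impl, base));
}

static unsigned int fifo_channels(const struct base_class *base)
{
	return to_fifo_impl(base)->channels;
}
```

With this layout a variant that differs only in a parameter such as the channel count needs a new literal but no new functions, which is how the patch reuses the nve0 ctor/dtor/init/fini.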


* [PATCH 06/12] drm/nouveau/ibus: add GK20A support
  2014-03-24  8:42 ` Alexandre Courbot
@ 2014-03-24  8:42     ` Alexandre Courbot
  -1 siblings, 0 replies; 100+ messages in thread
From: Alexandre Courbot @ 2014-03-24  8:42 UTC (permalink / raw)
  To: Ben Skeggs
  Cc: nouveau, linux-kernel, dri-devel, linux-tegra

Add support for initializing the priv ring of GK20A. This is done by the
BIOS on desktop GPUs, but needs to be done by hand on Tegra.

Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
---
 drivers/gpu/drm/nouveau/Makefile                   |   1 +
 drivers/gpu/drm/nouveau/core/include/subdev/ibus.h |   1 +
 drivers/gpu/drm/nouveau/core/subdev/ibus/nvea.c    | 110 +++++++++++++++++++++
 3 files changed, 112 insertions(+)
 create mode 100644 drivers/gpu/drm/nouveau/core/subdev/ibus/nvea.c

diff --git a/drivers/gpu/drm/nouveau/Makefile b/drivers/gpu/drm/nouveau/Makefile
index a90087bbdf88..592141e62dda 100644
--- a/drivers/gpu/drm/nouveau/Makefile
+++ b/drivers/gpu/drm/nouveau/Makefile
@@ -132,6 +132,7 @@ nouveau-y += core/subdev/i2c/nv94.o
 nouveau-y += core/subdev/i2c/nvd0.o
 nouveau-y += core/subdev/ibus/nvc0.o
 nouveau-y += core/subdev/ibus/nve0.o
+nouveau-y += core/subdev/ibus/nvea.o
 nouveau-y += core/subdev/instmem/base.o
 nouveau-y += core/subdev/instmem/nv04.o
 nouveau-y += core/subdev/instmem/nv40.o
diff --git a/drivers/gpu/drm/nouveau/core/include/subdev/ibus.h b/drivers/gpu/drm/nouveau/core/include/subdev/ibus.h
index 88814f159d89..056a42f92705 100644
--- a/drivers/gpu/drm/nouveau/core/include/subdev/ibus.h
+++ b/drivers/gpu/drm/nouveau/core/include/subdev/ibus.h
@@ -30,5 +30,6 @@ nouveau_ibus(void *obj)
 
 extern struct nouveau_oclass nvc0_ibus_oclass;
 extern struct nouveau_oclass nve0_ibus_oclass;
+extern struct nouveau_oclass nvea_ibus_oclass;
 
 #endif
diff --git a/drivers/gpu/drm/nouveau/core/subdev/ibus/nvea.c b/drivers/gpu/drm/nouveau/core/subdev/ibus/nvea.c
new file mode 100644
index 000000000000..151851286e99
--- /dev/null
+++ b/drivers/gpu/drm/nouveau/core/subdev/ibus/nvea.c
@@ -0,0 +1,110 @@
+/*
+ * Copyright (c) 2014, NVIDIA CORPORATION. All rights reserved.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ */
+
+#include <subdev/ibus.h>
+
+struct nvea_ibus_priv {
+	struct nouveau_ibus base;
+};
+
+static void
+nvea_ibus_init_priv_ring(struct nvea_ibus_priv *priv)
+{
+	nv_mask(priv, 0x137250, 0x3f, 0);
+
+	nv_mask(priv, 0x000200, 0x20, 0);
+	udelay(20);
+	nv_mask(priv, 0x000200, 0x20, 0x20);
+
+	nv_wr32(priv, 0x12004c, 0x4);
+	nv_wr32(priv, 0x122204, 0x2);
+	nv_rd32(priv, 0x122204);
+}
+
+static void
+nvea_ibus_intr(struct nouveau_subdev *subdev)
+{
+	struct nvea_ibus_priv *priv = (void *)subdev;
+	u32 status0 = nv_rd32(priv, 0x120058);
+	s32 retry = 100;
+	u32 command;
+
+	if (status0 & 0x7) {
+		nv_debug(priv, "resetting priv ring\n");
+		nvea_ibus_init_priv_ring(priv);
+	}
+
+	/* Acknowledge interrupt */
+	nv_mask(priv, 0x12004c, 0x2, 0x2);
+
+	while (--retry >= 0) {
+		command = nv_rd32(priv, 0x12004c) & 0x3f;
+		if (command == 0)
+			break;
+	}
+
+	if (retry < 0)
+		nv_warn(priv, "timeout waiting for ringmaster ack\n");
+}
+
+static int
+nvea_ibus_init(struct nouveau_object *object)
+{
+	struct nvea_ibus_priv *priv = (void *)object;
+	int ret;
+
+	ret = _nouveau_ibus_init(object);
+	if (ret)
+		return ret;
+
+	nvea_ibus_init_priv_ring(priv);
+
+	return 0;
+}
+
+static int
+nvea_ibus_ctor(struct nouveau_object *parent, struct nouveau_object *engine,
+	       struct nouveau_oclass *oclass, void *data, u32 size,
+	       struct nouveau_object **pobject)
+{
+	struct nvea_ibus_priv *priv;
+	int ret;
+
+	ret = nouveau_ibus_create(parent, engine, oclass, &priv);
+	*pobject = nv_object(priv);
+	if (ret)
+		return ret;
+
+	nv_subdev(priv)->intr = nvea_ibus_intr;
+	return 0;
+}
+
+struct nouveau_oclass
+nvea_ibus_oclass = {
+	.handle = NV_SUBDEV(IBUS, 0xea),
+	.ofuncs = &(struct nouveau_ofuncs) {
+		.ctor = nvea_ibus_ctor,
+		.dtor = _nouveau_ibus_dtor,
+		.init = nvea_ibus_init,
+		.fini = _nouveau_ibus_fini,
+	},
+};
-- 
1.9.1
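
Note on the patch above: the priv ring code combines two idioms, a read-modify-write on a register (pulsing bit 0x20 of 0x000200 low, then high) and a bounded poll on the command bits of 0x12004c. The sketch below reproduces both against a fake register array so it can run in user space. reg_rd32, reg_wr32, reg_mask and wait_command_idle are hypothetical names, and reg_mask is written on the assumption that nv_mask() clears the masked bits, ORs in the data and returns the previous value.

```c
#include <stdint.h>

/* Hypothetical register file standing in for MMIO; index = register slot. */
static uint32_t fake_regs[4];

static uint32_t reg_rd32(unsigned int idx)
{
	return fake_regs[idx];
}

static void reg_wr32(unsigned int idx, uint32_t val)
{
	fake_regs[idx] = val;
}

/* Read-modify-write: clear the bits in mask, set data, return old value. */
static uint32_t reg_mask(unsigned int idx, uint32_t mask, uint32_t data)
{
	uint32_t old = reg_rd32(idx);

	reg_wr32(idx, (old & ~mask) | data);
	return old;
}

/*
 * Poll until the low six bits of a register read zero, reading at most
 * max_reads times. Returns 0 on success, -1 on timeout.
 */
static int wait_command_idle(unsigned int idx, int max_reads)
{
	int retry = max_reads;

	while (--retry >= 0) {
		if ((reg_rd32(idx) & 0x3f) == 0)
			return 0;
	}

	return -1;
}
```

The loop form matches the patch: when the budget is exhausted the counter ends below zero, which is the condition nvea_ibus_intr() tests before printing its timeout warning.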


* [PATCH 07/12] drm/nouveau/fb: add GK20A support
  2014-03-24  8:42 ` Alexandre Courbot
@ 2014-03-24  8:42     ` Alexandre Courbot
  -1 siblings, 0 replies; 100+ messages in thread
From: Alexandre Courbot @ 2014-03-24  8:42 UTC (permalink / raw)
  To: Ben Skeggs
  Cc: nouveau, linux-kernel, dri-devel, linux-tegra

Add a simple FB device for GK20A, as well as a RAM implementation based
on contiguous DMA memory allocations suitable for chips that use system
memory as video RAM.

Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
---
 drivers/gpu/drm/nouveau/Makefile                 |   2 +
 drivers/gpu/drm/nouveau/core/include/subdev/fb.h |   1 +
 drivers/gpu/drm/nouveau/core/subdev/fb/nvea.c    |  56 ++++++++
 drivers/gpu/drm/nouveau/core/subdev/fb/priv.h    |   1 +
 drivers/gpu/drm/nouveau/core/subdev/fb/ramnvea.c | 168 +++++++++++++++++++++++
 5 files changed, 228 insertions(+)
 create mode 100644 drivers/gpu/drm/nouveau/core/subdev/fb/nvea.c
 create mode 100644 drivers/gpu/drm/nouveau/core/subdev/fb/ramnvea.c

diff --git a/drivers/gpu/drm/nouveau/Makefile b/drivers/gpu/drm/nouveau/Makefile
index 592141e62dda..708d2e33835f 100644
--- a/drivers/gpu/drm/nouveau/Makefile
+++ b/drivers/gpu/drm/nouveau/Makefile
@@ -100,6 +100,7 @@ nouveau-y += core/subdev/fb/nvaa.o
 nouveau-y += core/subdev/fb/nvaf.o
 nouveau-y += core/subdev/fb/nvc0.o
 nouveau-y += core/subdev/fb/nve0.o
+nouveau-y += core/subdev/fb/nvea.o
 nouveau-y += core/subdev/fb/ramnv04.o
 nouveau-y += core/subdev/fb/ramnv10.o
 nouveau-y += core/subdev/fb/ramnv1a.o
@@ -114,6 +115,7 @@ nouveau-y += core/subdev/fb/ramnva3.o
 nouveau-y += core/subdev/fb/ramnvaa.o
 nouveau-y += core/subdev/fb/ramnvc0.o
 nouveau-y += core/subdev/fb/ramnve0.o
+nouveau-y += core/subdev/fb/ramnvea.o
 nouveau-y += core/subdev/fb/sddr3.o
 nouveau-y += core/subdev/fb/gddr5.o
 nouveau-y += core/subdev/gpio/base.o
diff --git a/drivers/gpu/drm/nouveau/core/include/subdev/fb.h b/drivers/gpu/drm/nouveau/core/include/subdev/fb.h
index d7ecafbae1ca..3905816755ba 100644
--- a/drivers/gpu/drm/nouveau/core/include/subdev/fb.h
+++ b/drivers/gpu/drm/nouveau/core/include/subdev/fb.h
@@ -105,6 +105,7 @@ extern struct nouveau_oclass *nvaa_fb_oclass;
 extern struct nouveau_oclass *nvaf_fb_oclass;
 extern struct nouveau_oclass *nvc0_fb_oclass;
 extern struct nouveau_oclass *nve0_fb_oclass;
+extern struct nouveau_oclass *nvea_fb_oclass;
 
 #include <subdev/bios/ramcfg.h>
 
diff --git a/drivers/gpu/drm/nouveau/core/subdev/fb/nvea.c b/drivers/gpu/drm/nouveau/core/subdev/fb/nvea.c
new file mode 100644
index 000000000000..62dbec48481e
--- /dev/null
+++ b/drivers/gpu/drm/nouveau/core/subdev/fb/nvea.c
@@ -0,0 +1,56 @@
+/*
+ * Copyright (c) 2014, NVIDIA CORPORATION. All rights reserved.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ */
+
+#include "nvc0.h"
+
+struct nvea_fb_priv {
+	struct nouveau_fb base;
+};
+
+static int
+nvea_fb_ctor(struct nouveau_object *parent, struct nouveau_object *engine,
+	     struct nouveau_oclass *oclass, void *data, u32 size,
+	     struct nouveau_object **pobject)
+{
+	struct nvea_fb_priv *priv;
+	int ret;
+
+	ret = nouveau_fb_create(parent, engine, oclass, &priv);
+	*pobject = nv_object(priv);
+	if (ret)
+		return ret;
+
+	return 0;
+}
+
+struct nouveau_oclass *
+nvea_fb_oclass = &(struct nouveau_fb_impl) {
+	.base.handle = NV_SUBDEV(FB, 0xea),
+	.base.ofuncs = &(struct nouveau_ofuncs) {
+		.ctor = nvea_fb_ctor,
+		.dtor = _nouveau_fb_dtor,
+		.init = _nouveau_fb_init,
+		.fini = _nouveau_fb_fini,
+	},
+	.memtype = nvc0_fb_memtype_valid,
+	.ram = &nvea_ram_oclass,
+}.base;
diff --git a/drivers/gpu/drm/nouveau/core/subdev/fb/priv.h b/drivers/gpu/drm/nouveau/core/subdev/fb/priv.h
index edaf95dee612..0b95a25504d3 100644
--- a/drivers/gpu/drm/nouveau/core/subdev/fb/priv.h
+++ b/drivers/gpu/drm/nouveau/core/subdev/fb/priv.h
@@ -32,6 +32,7 @@ extern struct nouveau_oclass nva3_ram_oclass;
 extern struct nouveau_oclass nvaa_ram_oclass;
 extern struct nouveau_oclass nvc0_ram_oclass;
 extern struct nouveau_oclass nve0_ram_oclass;
+extern struct nouveau_oclass nvea_ram_oclass;
 
 int nouveau_sddr3_calc(struct nouveau_ram *ram);
 int nouveau_gddr5_calc(struct nouveau_ram *ram, bool nuts);
diff --git a/drivers/gpu/drm/nouveau/core/subdev/fb/ramnvea.c b/drivers/gpu/drm/nouveau/core/subdev/fb/ramnvea.c
new file mode 100644
index 000000000000..4e3d757614b6
--- /dev/null
+++ b/drivers/gpu/drm/nouveau/core/subdev/fb/ramnvea.c
@@ -0,0 +1,168 @@
+/*
+ * Copyright (c) 2014, NVIDIA CORPORATION. All rights reserved.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ */
+
+#include "priv.h"
+
+#include <subdev/fb.h>
+
+#include <linux/mm.h>
+#include <linux/types.h>
+#include <linux/dma-contiguous.h>
+
+static void
+nvea_ram_put(struct nouveau_fb *pfb, struct nouveau_mem **pmem)
+{
+	struct device *dev = nv_device_base(nv_device(pfb));
+	struct nouveau_mem *mem = *pmem;
+	int i;
+
+	*pmem = NULL;
+
+	for (i = 0; i < mem->size; i++) {
+		struct page *page;
+
+		if (mem->pages[i] == 0)
+			break;
+
+		page = pfn_to_page(dma_to_pfn(dev, mem->pages[i]));
+		dma_release_from_contiguous(dev, page, 1);
+	}
+
+	kfree(mem->pages);
+	kfree(mem);
+}
+
+static int
+nvea_ram_get(struct nouveau_fb *pfb, u64 size, u32 align, u32 ncmin,
+	     u32 memtype, struct nouveau_mem **pmem)
+{
+	struct device *dev = nv_device_base(nv_device(pfb));
+	struct nouveau_mem *mem;
+	int type = memtype & 0xff;
+	dma_addr_t dma_addr;
+	int npages;
+	int order;
+	int i;
+
+	nv_debug(pfb, "%s: size: %llx align: %x, ncmin: %x\n", __func__, size,
+		 align, ncmin);
+
+	npages = size >> PAGE_SHIFT;
+	if (npages == 0)
+		npages = 1;
+
+	if (align == 0)
+		align = PAGE_SIZE;
+	align >>= PAGE_SHIFT;
+
+	/* round alignment to the next power of 2, if needed */
+	order = fls(align);
+	if ((align & (align - 1)) == 0)
+		order--;
+
+	ncmin >>= PAGE_SHIFT;
+	/*
+	 * allocate pages by chunks of "align" size, otherwise we may leave
+	 * holes in the contiguous memory area.
+	 */
+	if (ncmin == 0)
+		ncmin = npages;
+	else if (align > ncmin)
+		ncmin = align;
+
+	mem = kzalloc(sizeof(*mem), GFP_KERNEL);
+	if (!mem)
+		return -ENOMEM;
+
+	mem->size = npages;
+	mem->memtype = type;
+
+	mem->pages = kzalloc(sizeof(dma_addr_t) * npages, GFP_KERNEL);
+	if (!mem->pages) {
+		kfree(mem);
+		return -ENOMEM;
+	}
+
+	while (npages) {
+		struct page *pages;
+		int pos = mem->size - npages;	/* pages filled so far */
+
+		/* don't overflow in case size is not a multiple of ncmin */
+		if (ncmin > npages)
+			ncmin = npages;
+
+		pages = dma_alloc_from_contiguous(dev, ncmin, order);
+		if (!pages) {
+			nvea_ram_put(pfb, &mem);
+			return -ENOMEM;
+		}
+
+		dma_addr = pfn_to_dma(nv_device_base(nv_device(pfb)),
+				      page_to_pfn(pages));
+
+		nv_debug(pfb, "  alloc count: %x, order: %x, addr: %x\n", ncmin,
+			 order, dma_addr);
+
+		for (i = 0; i < ncmin; i++)
+			mem->pages[pos + i] = dma_addr + (PAGE_SIZE * i);
+
+		pos += ncmin;
+		npages -= ncmin;
+	}
+
+	mem->offset = (u64)mem->pages[0];
+
+	*pmem = mem;
+
+	return 0;
+}
+
+static int
+nvea_ram_ctor(struct nouveau_object *parent, struct nouveau_object *engine,
+	      struct nouveau_oclass *oclass, void *data, u32 datasize,
+	      struct nouveau_object **pobject)
+{
+	struct nouveau_ram *ram;
+	int ret;
+
+	ret = nouveau_ram_create(parent, engine, oclass, &ram);
+	*pobject = nv_object(ram);
+	if (ret)
+		return ret;
+	ram->type = NV_MEM_TYPE_STOLEN;
+	ram->size = get_num_physpages() << PAGE_SHIFT;
+
+	ram->get = nvea_ram_get;
+	ram->put = nvea_ram_put;
+
+	return 0;
+}
+
+struct nouveau_oclass
+nvea_ram_oclass = {
+	.ofuncs = &(struct nouveau_ofuncs) {
+		.ctor = nvea_ram_ctor,
+		.dtor = _nouveau_ram_dtor,
+		.init = _nouveau_ram_init,
+		.fini = _nouveau_ram_fini,
+	},
+};
-- 
1.9.1
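
Note on the patch above: nvea_ram_get() does two things that are easy to get subtly wrong, rounding the page alignment up to a power-of-two order and filling the page array chunk by chunk. The sketch below models both in user space under stated assumptions. align_to_order, fake_alloc_chunk and fill_pages are hypothetical names, and the fake allocator simply returns a new base 1 MiB after the previous one so that chunk boundaries are visible (each chunk is assumed to be smaller than 1 MiB, and chunk is assumed to be at least 1).

```c
#include <stddef.h>
#include <stdint.h>

/* Smallest order such that (1 << order) >= align_pages; align_pages >= 1. */
static int align_to_order(unsigned int align_pages)
{
	int order = 0;

	while ((1u << order) < align_pages)
		order++;

	return order;
}

/* Hypothetical allocator: every chunk starts 1 MiB after the previous one. */
static uint64_t fake_next_addr = 0x80000000ull;

static uint64_t fake_alloc_chunk(void)
{
	uint64_t addr = fake_next_addr;

	fake_next_addr += 0x100000;
	return addr;
}

/*
 * Fill pages[0..npages) in chunks of at most "chunk" pages. The write
 * position is kept outside the loop so it persists from chunk to chunk.
 */
static void fill_pages(uint64_t *pages, unsigned int npages,
		       unsigned int chunk, unsigned int page_size)
{
	unsigned int pos = 0;

	while (npages) {
		unsigned int i;
		uint64_t base;

		if (chunk > npages)
			chunk = npages;	/* last, shorter chunk */

		base = fake_alloc_chunk();
		for (i = 0; i < chunk; i++)
			pages[pos + i] = base + (uint64_t)page_size * i;

		pos += chunk;
		npages -= chunk;
	}
}
```

For alignments of 1, 3 and 4 pages, align_to_order() yields 0, 2 and 2, the same results as the fls()-based computation in the patch.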


* [PATCH 07/12] drm/nouveau/fb: add GK20A support
@ 2014-03-24  8:42     ` Alexandre Courbot
  0 siblings, 0 replies; 100+ messages in thread
From: Alexandre Courbot @ 2014-03-24  8:42 UTC (permalink / raw)
  To: Ben Skeggs
  Cc: Thierry Reding, nouveau, dri-devel, linux-tegra, linux-kernel,
	gnurou, Alexandre Courbot

Add a simple FB device for GK20A, as well as a RAM implementation based
on contiguous DMA memory allocations suitable for chips that use system
memory as video RAM.

Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
---
 drivers/gpu/drm/nouveau/Makefile                 |   2 +
 drivers/gpu/drm/nouveau/core/include/subdev/fb.h |   1 +
 drivers/gpu/drm/nouveau/core/subdev/fb/nvea.c    |  56 ++++++++
 drivers/gpu/drm/nouveau/core/subdev/fb/priv.h    |   1 +
 drivers/gpu/drm/nouveau/core/subdev/fb/ramnvea.c | 168 +++++++++++++++++++++++
 5 files changed, 228 insertions(+)
 create mode 100644 drivers/gpu/drm/nouveau/core/subdev/fb/nvea.c
 create mode 100644 drivers/gpu/drm/nouveau/core/subdev/fb/ramnvea.c

diff --git a/drivers/gpu/drm/nouveau/Makefile b/drivers/gpu/drm/nouveau/Makefile
index 592141e62dda..708d2e33835f 100644
--- a/drivers/gpu/drm/nouveau/Makefile
+++ b/drivers/gpu/drm/nouveau/Makefile
@@ -100,6 +100,7 @@ nouveau-y += core/subdev/fb/nvaa.o
 nouveau-y += core/subdev/fb/nvaf.o
 nouveau-y += core/subdev/fb/nvc0.o
 nouveau-y += core/subdev/fb/nve0.o
+nouveau-y += core/subdev/fb/nvea.o
 nouveau-y += core/subdev/fb/ramnv04.o
 nouveau-y += core/subdev/fb/ramnv10.o
 nouveau-y += core/subdev/fb/ramnv1a.o
@@ -114,6 +115,7 @@ nouveau-y += core/subdev/fb/ramnva3.o
 nouveau-y += core/subdev/fb/ramnvaa.o
 nouveau-y += core/subdev/fb/ramnvc0.o
 nouveau-y += core/subdev/fb/ramnve0.o
+nouveau-y += core/subdev/fb/ramnvea.o
 nouveau-y += core/subdev/fb/sddr3.o
 nouveau-y += core/subdev/fb/gddr5.o
 nouveau-y += core/subdev/gpio/base.o
diff --git a/drivers/gpu/drm/nouveau/core/include/subdev/fb.h b/drivers/gpu/drm/nouveau/core/include/subdev/fb.h
index d7ecafbae1ca..3905816755ba 100644
--- a/drivers/gpu/drm/nouveau/core/include/subdev/fb.h
+++ b/drivers/gpu/drm/nouveau/core/include/subdev/fb.h
@@ -105,6 +105,7 @@ extern struct nouveau_oclass *nvaa_fb_oclass;
 extern struct nouveau_oclass *nvaf_fb_oclass;
 extern struct nouveau_oclass *nvc0_fb_oclass;
 extern struct nouveau_oclass *nve0_fb_oclass;
+extern struct nouveau_oclass *nvea_fb_oclass;
 
 #include <subdev/bios/ramcfg.h>
 
diff --git a/drivers/gpu/drm/nouveau/core/subdev/fb/nvea.c b/drivers/gpu/drm/nouveau/core/subdev/fb/nvea.c
new file mode 100644
index 000000000000..62dbec48481e
--- /dev/null
+++ b/drivers/gpu/drm/nouveau/core/subdev/fb/nvea.c
@@ -0,0 +1,56 @@
+/*
+ * Copyright (c) 2014, NVIDIA CORPORATION. All rights reserved.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ */
+
+#include "nvc0.h"
+
+struct nvea_fb_priv {
+	struct nouveau_fb base;
+};
+
+static int
+nvea_fb_ctor(struct nouveau_object *parent, struct nouveau_object *engine,
+	     struct nouveau_oclass *oclass, void *data, u32 size,
+	     struct nouveau_object **pobject)
+{
+	struct nvea_fb_priv *priv;
+	int ret;
+
+	ret = nouveau_fb_create(parent, engine, oclass, &priv);
+	*pobject = nv_object(priv);
+	if (ret)
+		return ret;
+
+	return 0;
+}
+
+struct nouveau_oclass *
+nvea_fb_oclass = &(struct nouveau_fb_impl) {
+	.base.handle = NV_SUBDEV(FB, 0xea),
+	.base.ofuncs = &(struct nouveau_ofuncs) {
+		.ctor = nvea_fb_ctor,
+		.dtor = _nouveau_fb_dtor,
+		.init = _nouveau_fb_init,
+		.fini = _nouveau_fb_fini,
+	},
+	.memtype = nvc0_fb_memtype_valid,
+	.ram = &nvea_ram_oclass,
+}.base;
diff --git a/drivers/gpu/drm/nouveau/core/subdev/fb/priv.h b/drivers/gpu/drm/nouveau/core/subdev/fb/priv.h
index edaf95dee612..0b95a25504d3 100644
--- a/drivers/gpu/drm/nouveau/core/subdev/fb/priv.h
+++ b/drivers/gpu/drm/nouveau/core/subdev/fb/priv.h
@@ -32,6 +32,7 @@ extern struct nouveau_oclass nva3_ram_oclass;
 extern struct nouveau_oclass nvaa_ram_oclass;
 extern struct nouveau_oclass nvc0_ram_oclass;
 extern struct nouveau_oclass nve0_ram_oclass;
+extern struct nouveau_oclass nvea_ram_oclass;
 
 int nouveau_sddr3_calc(struct nouveau_ram *ram);
 int nouveau_gddr5_calc(struct nouveau_ram *ram, bool nuts);
diff --git a/drivers/gpu/drm/nouveau/core/subdev/fb/ramnvea.c b/drivers/gpu/drm/nouveau/core/subdev/fb/ramnvea.c
new file mode 100644
index 000000000000..4e3d757614b6
--- /dev/null
+++ b/drivers/gpu/drm/nouveau/core/subdev/fb/ramnvea.c
@@ -0,0 +1,168 @@
+/*
+ * Copyright (c) 2014, NVIDIA CORPORATION. All rights reserved.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ */
+
+#include "priv.h"
+
+#include <subdev/fb.h>
+
+#include <linux/mm.h>
+#include <linux/types.h>
+#include <linux/dma-contiguous.h>
+
+static void
+nvea_ram_put(struct nouveau_fb *pfb, struct nouveau_mem **pmem)
+{
+	struct device *dev = nv_device_base(nv_device(pfb));
+	struct nouveau_mem *mem = *pmem;
+	int i;
+
+	*pmem = NULL;
+
+	for (i = 0; i < mem->size; i++) {
+		struct page *page;
+
+		if (mem->pages[i] == 0)
+			break;
+
+		page = pfn_to_page(dma_to_pfn(dev, mem->pages[i]));
+		dma_release_from_contiguous(dev, page, 1);
+	}
+
+	kfree(mem->pages);
+	kfree(mem);
+}
+
+static int
+nvea_ram_get(struct nouveau_fb *pfb, u64 size, u32 align, u32 ncmin,
+	     u32 memtype, struct nouveau_mem **pmem)
+{
+	struct device *dev = nv_device_base(nv_device(pfb));
+	struct nouveau_mem *mem;
+	int type = memtype & 0xff;
+	dma_addr_t dma_addr;
+	int npages;
+	int order;
+	int i;
+
+	nv_debug(pfb, "%s: size: %llx align: %x, ncmin: %x\n", __func__, size,
+		 align, ncmin);
+
+	npages = size >> PAGE_SHIFT;
+	if (npages == 0)
+		npages = 1;
+
+	if (align == 0)
+		align = PAGE_SIZE;
+	align >>= PAGE_SHIFT;
+
+	/* round alignment to the next power of 2, if needed */
+	order = fls(align);
+	if ((align & (align - 1)) == 0)
+		order--;
+
+	ncmin >>= PAGE_SHIFT;
+	/*
+	 * allocate pages by chunks of "align" size, otherwise we may leave
+	 * holes in the contiguous memory area.
+	 */
+	if (ncmin == 0)
+		ncmin = npages;
+	else if (align > ncmin)
+		ncmin = align;
+
+	mem = kzalloc(sizeof(*mem), GFP_KERNEL);
+	if (!mem)
+		return -ENOMEM;
+
+	mem->size = npages;
+	mem->memtype = type;
+
+	mem->pages = kzalloc(sizeof(dma_addr_t) * npages, GFP_KERNEL);
+	if (!mem->pages) {
+		kfree(mem);
+		return -ENOMEM;
+	}
+
+	while (npages) {
+		struct page *pages;
+		int pos = mem->size - npages;
+
+		/* don't overflow in case size is not a multiple of ncmin */
+		if (ncmin > npages)
+			ncmin = npages;
+
+		pages = dma_alloc_from_contiguous(dev, ncmin, order);
+		if (!pages) {
+			nvea_ram_put(pfb, &mem);
+			return -ENOMEM;
+		}
+
+		dma_addr = pfn_to_dma(nv_device_base(nv_device(pfb)),
+				      page_to_pfn(pages));
+
+		nv_debug(pfb, "  alloc count: %x, order: %x, addr: %x\n", ncmin,
+			 order, dma_addr);
+
+		for (i = 0; i < ncmin; i++)
+			mem->pages[pos + i] = dma_addr + (PAGE_SIZE * i);
+
+		pos += ncmin;
+		npages -= ncmin;
+	}
+
+	mem->offset = (u64)mem->pages[0];
+
+	*pmem = mem;
+
+	return 0;
+}
+
+static int
+nvea_ram_ctor(struct nouveau_object *parent, struct nouveau_object *engine,
+	      struct nouveau_oclass *oclass, void *data, u32 datasize,
+	      struct nouveau_object **pobject)
+{
+	struct nouveau_ram *ram;
+	int ret;
+
+	ret = nouveau_ram_create(parent, engine, oclass, &ram);
+	*pobject = nv_object(ram);
+	if (ret)
+		return ret;
+	ram->type = NV_MEM_TYPE_STOLEN;
+	ram->size = get_num_physpages() << PAGE_SHIFT;
+
+	ram->get = nvea_ram_get;
+	ram->put = nvea_ram_put;
+
+	return 0;
+}
+
+struct nouveau_oclass
+nvea_ram_oclass = {
+	.ofuncs = &(struct nouveau_ofuncs) {
+		.ctor = nvea_ram_ctor,
+		.dtor = _nouveau_ram_dtor,
+		.init = _nouveau_ram_init,
+		.fini = _nouveau_ram_fini,
+	},
+};
-- 
1.9.1
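
Note (illustration only, not part of the patch): for readers following the
alignment handling in nvea_ram_get(), here is a minimal standalone sketch of
how the allocation order is derived from an alignment expressed in pages.
fls_u32() and align_order() are hypothetical names used for this example; the
kernel supplies its own fls(). The assert documents an assumption of the
sketch: the page-granular alignment must be non-zero, which a non-zero byte
alignment smaller than PAGE_SIZE would not satisfy once shifted down.

```c
#include <assert.h>

/*
 * Userspace stand-in for the kernel's fls(): 1-based index of the
 * highest set bit, or 0 when no bit is set.
 */
static int fls_u32(unsigned int x)
{
	int bit = 0;

	while (x) {
		bit++;
		x >>= 1;
	}
	return bit;
}

/* Smallest order such that (1 << order) >= align_pages. */
static int align_order(unsigned int align_pages)
{
	int order;

	assert(align_pages > 0);

	order = fls_u32(align_pages);
	/* an exact power of two needs no rounding up */
	if ((align_pages & (align_pages - 1)) == 0)
		order--;
	return order;
}
```

With this helper an alignment of 3 pages rounds up to order 2 (4 pages),
while an alignment of exactly 4 pages stays at order 2.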


* [PATCH 08/12] drm/nouveau/graph: enable when using external firmware
  2014-03-24  8:42 ` Alexandre Courbot
@ 2014-03-24  8:42     ` Alexandre Courbot
  -1 siblings, 0 replies; 100+ messages in thread
From: Alexandre Courbot @ 2014-03-24  8:42 UTC (permalink / raw)
  To: Ben Skeggs
  Cc: nouveau, linux-kernel, dri-devel, linux-tegra

nvc0_graph_ctor() only lets the graphics engine be enabled if its oclass
has built-in microcode linked to it. This prevents GR from being enabled
at all on chips that rely exclusively on external firmware, even though
such a use-case is valid.

Relax the condition so that the GR engine is also enabled when external
firmware has been requested through the NvGrUseFW option.

Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
---
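Note (illustration only, not part of the patch): the relaxed condition can be
read as a small truth table. The sketch below models it in standalone C;
struct gr_class_sketch and gr_should_enable() are hypothetical names, and the
use_fw flag stands for the value of the NvGrUseFW option.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the part of the oclass that matters here. */
struct gr_class_sketch {
	const unsigned int *fecs_ucode;	/* NULL when no microcode is built in */
};

/* GR is enabled if microcode is built in OR external firmware is requested. */
static bool gr_should_enable(const struct gr_class_sketch *oclass, bool use_fw)
{
	return oclass->fecs_ucode != NULL || use_fw;
}
```

Only the combination "no built-in microcode and no external firmware
requested" leaves the engine disabled.
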
 drivers/gpu/drm/nouveau/core/engine/graph/nvc0.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/core/engine/graph/nvc0.c b/drivers/gpu/drm/nouveau/core/engine/graph/nvc0.c
index 6ef8bf181b2d..f997a18f5760 100644
--- a/drivers/gpu/drm/nouveau/core/engine/graph/nvc0.c
+++ b/drivers/gpu/drm/nouveau/core/engine/graph/nvc0.c
@@ -1133,10 +1133,14 @@ nvc0_graph_ctor(struct nouveau_object *parent, struct nouveau_object *engine,
 	struct nvc0_graph_oclass *oclass = (void *)bclass;
 	struct nouveau_device *device = nv_device(parent);
 	struct nvc0_graph_priv *priv;
+	bool use_fw;
 	int ret, i;
 
+	use_fw = nouveau_boolopt(device->cfgopt, "NvGrUseFW", false);
+
 	ret = nouveau_graph_create(parent, engine, bclass,
-				   (oclass->fecs.ucode != NULL), &priv);
+				   (oclass->fecs.ucode != NULL) || use_fw,
+				   &priv);
 	*pobject = nv_object(priv);
 	if (ret)
 		return ret;
@@ -1146,7 +1150,7 @@ nvc0_graph_ctor(struct nouveau_object *parent, struct nouveau_object *engine,
 
 	priv->base.units = nvc0_graph_units;
 
-	if (nouveau_boolopt(device->cfgopt, "NvGrUseFW", false)) {
+	if (use_fw) {
 		nv_info(priv, "using external firmware\n");
 		if (nvc0_graph_ctor_fw(priv, "fuc409c", &priv->fuc409c) ||
 		    nvc0_graph_ctor_fw(priv, "fuc409d", &priv->fuc409d) ||
-- 
1.9.1

* [PATCH 09/12] drm/nouveau/graph: pad firmware code at load time
  2014-03-24  8:42 ` Alexandre Courbot
@ 2014-03-24  8:42     ` Alexandre Courbot
  -1 siblings, 0 replies; 100+ messages in thread
From: Alexandre Courbot @ 2014-03-24  8:42 UTC (permalink / raw)
  To: Ben Skeggs
  Cc: nouveau, linux-kernel, dri-devel, linux-tegra

Pad the microcode to a multiple of 0x40 words (256 bytes) when uploading
it; otherwise firmware files that are not already padded fail to run.

Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
---
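Note (illustration only, not part of the patch): the rounding used for the
padding can be checked in isolation. The sketch below reproduces the
arithmetic in standalone C; padded_words(), copy_padded() and FUC_BLOCK_WORDS
are hypothetical names, and a plain array stands in for the upload port.

```c
#include <stddef.h>
#include <stdint.h>

#define FUC_BLOCK_WORDS 0x40u	/* block size in 32-bit words */

/* Number of 32-bit words to write so the upload ends on a block boundary. */
static size_t padded_words(size_t size_bytes)
{
	size_t words = size_bytes / 4;

	return (words + (FUC_BLOCK_WORDS - 1)) & ~(size_t)(FUC_BLOCK_WORDS - 1);
}

/*
 * Copy the code words into dst, then zero-fill up to the next block
 * boundary. Returns the number of words written, or 0 if dst is too small.
 */
static size_t copy_padded(uint32_t *dst, size_t dst_words,
			  const uint32_t *code, size_t size_bytes)
{
	size_t words = size_bytes / 4;
	size_t total = padded_words(size_bytes);
	size_t i;

	if (total > dst_words)
		return 0;

	for (i = 0; i < words; i++)
		dst[i] = code[i];
	for (; i < total; i++)
		dst[i] = 0;

	return total;
}
```

A 12-byte image (3 words) is thus written as 64 words, the last 61 being
zero, while an image that is already a multiple of 64 words is unchanged.
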
 drivers/gpu/drm/nouveau/core/engine/graph/nvc0.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/nouveau/core/engine/graph/nvc0.c b/drivers/gpu/drm/nouveau/core/engine/graph/nvc0.c
index f997a18f5760..367e72daf8b1 100644
--- a/drivers/gpu/drm/nouveau/core/engine/graph/nvc0.c
+++ b/drivers/gpu/drm/nouveau/core/engine/graph/nvc0.c
@@ -768,6 +768,10 @@ nvc0_graph_init_fw(struct nvc0_graph_priv *priv, u32 fuc_base,
 			nv_wr32(priv, fuc_base + 0x0188, i >> 6);
 		nv_wr32(priv, fuc_base + 0x0184, code->data[i]);
 	}
+
+	/* code must be padded to 0x40 words */
+	for (; i < (((code->size / 4) + 0x3f) & ~0x3f); i++)
+		nv_wr32(priv, fuc_base + 0x0184, 0);
 }
 
 static void
-- 
1.9.1

* [PATCH 10/12] drm/nouveau/graph: add GK20A support
  2014-03-24  8:42 ` Alexandre Courbot
@ 2014-03-24  8:42     ` Alexandre Courbot
  -1 siblings, 0 replies; 100+ messages in thread
From: Alexandre Courbot @ 2014-03-24  8:42 UTC (permalink / raw)
  To: Ben Skeggs
  Cc: nouveau, linux-kernel, dri-devel, linux-tegra

Add a GR device for GK20A based on NVE4, with the correct class
definitions (GK20A's 3D class is 0xa297).

Most of the NVE4 code can be used on GK20A, so make the relevant bits of
NVE4 available to other chips as well.

Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
---
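Note (illustration only, not part of the patch): the class list added for
GK20A is a sentinel-terminated table. The sketch below shows a lookup over
such a table in standalone C. struct class_entry and class_supported() are
hypothetical, and using a zero handle as the terminator is an assumption of
this sketch rather than a statement about how the nouveau core walks the
list. The handles are the ones listed in nvea_graph_sclass.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified table entry: only the class handle is kept. */
struct class_entry {
	uint32_t handle;
};

/* Handles from nvea_graph_sclass; the zero entry terminates the table. */
static const struct class_entry gk20a_classes[] = {
	{ 0x902d },
	{ 0xa040 },
	{ 0xa297 },	/* GK20A's 3D class, per the commit message */
	{ 0xa0c0 },
	{ 0 }
};

/* Walk the table until the sentinel and report whether handle is listed. */
static bool class_supported(const struct class_entry *table, uint32_t handle)
{
	for (; table->handle; table++) {
		if (table->handle == handle)
			return true;
	}
	return false;
}
```

With this table a request for 0xa297 succeeds, while a handle that is not
listed, such as 0xa097, is rejected.
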
 drivers/gpu/drm/nouveau/Makefile                   |  1 +
 .../gpu/drm/nouveau/core/engine/graph/ctxnve4.c    |  4 +-
 drivers/gpu/drm/nouveau/core/engine/graph/nvc0.h   |  9 +++
 drivers/gpu/drm/nouveau/core/engine/graph/nve4.c   |  2 +-
 drivers/gpu/drm/nouveau/core/engine/graph/nvea.c   | 75 ++++++++++++++++++++++
 .../gpu/drm/nouveau/core/include/engine/graph.h    |  1 +
 6 files changed, 89 insertions(+), 3 deletions(-)
 create mode 100644 drivers/gpu/drm/nouveau/core/engine/graph/nvea.c

diff --git a/drivers/gpu/drm/nouveau/Makefile b/drivers/gpu/drm/nouveau/Makefile
index 708d2e33835f..032a4744f843 100644
--- a/drivers/gpu/drm/nouveau/Makefile
+++ b/drivers/gpu/drm/nouveau/Makefile
@@ -270,6 +270,7 @@ nouveau-y += core/engine/graph/nvc8.o
 nouveau-y += core/engine/graph/nvd7.o
 nouveau-y += core/engine/graph/nvd9.o
 nouveau-y += core/engine/graph/nve4.o
+nouveau-y += core/engine/graph/nvea.o
 nouveau-y += core/engine/graph/nvf0.o
 nouveau-y += core/engine/graph/nv108.o
 nouveau-y += core/engine/mpeg/nv31.o
diff --git a/drivers/gpu/drm/nouveau/core/engine/graph/ctxnve4.c b/drivers/gpu/drm/nouveau/core/engine/graph/ctxnve4.c
index e2de73ee5eee..3904073f860d 100644
--- a/drivers/gpu/drm/nouveau/core/engine/graph/ctxnve4.c
+++ b/drivers/gpu/drm/nouveau/core/engine/graph/ctxnve4.c
@@ -804,7 +804,7 @@ nve4_grctx_init_unk[] = {
 	{}
 };
 
-static void
+void
 nve4_grctx_generate_mods(struct nvc0_graph_priv *priv, struct nvc0_grctx *info)
 {
 	u32 magic[GPC_MAX][2];
@@ -962,7 +962,7 @@ nve4_grctx_generate_main(struct nvc0_graph_priv *priv, struct nvc0_grctx *info)
 	nv_mask(priv, 0x41be10, 0x00800000, 0x00800000);
 }
 
-static struct nvc0_graph_init *
+struct nvc0_graph_init *
 nve4_grctx_init_hub[] = {
 	nvc0_grctx_init_base,
 	nve4_grctx_init_unk40xx,
diff --git a/drivers/gpu/drm/nouveau/core/engine/graph/nvc0.h b/drivers/gpu/drm/nouveau/core/engine/graph/nvc0.h
index b0ab6de270b2..904f09b540c5 100644
--- a/drivers/gpu/drm/nouveau/core/engine/graph/nvc0.h
+++ b/drivers/gpu/drm/nouveau/core/engine/graph/nvc0.h
@@ -221,6 +221,8 @@ void nvc0_grctx_generate_r418bb8(struct nvc0_graph_priv *);
 void nve4_grctx_generate_r418bb8(struct nvc0_graph_priv *);
 void nvc0_grctx_generate_r406800(struct nvc0_graph_priv *);
 
+void nve4_grctx_generate_mods(struct nvc0_graph_priv *, struct nvc0_grctx *);
+
 extern struct nouveau_oclass *nvc0_grctx_oclass;
 extern struct nvc0_graph_init *nvc0_grctx_init_hub[];
 extern struct nvc0_graph_init nvc0_grctx_init_base[];
@@ -237,12 +239,17 @@ extern struct nvc0_graph_init nvc0_grctx_init_gpc_1[];
 extern struct nvc0_graph_init nvc0_grctx_init_tpc[];
 extern struct nvc0_graph_init nvc0_grctx_init_icmd[];
 extern struct nvc0_graph_init nvd9_grctx_init_icmd[]; //
+extern struct nvc0_graph_init nve4_grctx_init_icmd[];
+
+extern struct nvc0_graph_init *nve4_grctx_init_hub[];
+extern struct nvc0_graph_init *nve4_grctx_init_gpc[];
 
 extern struct nvc0_graph_mthd nvc0_grctx_init_mthd[];
 extern struct nvc0_graph_init nvc0_grctx_init_902d[];
 extern struct nvc0_graph_init nvc0_grctx_init_9039[];
 extern struct nvc0_graph_init nvc0_grctx_init_90c0[];
 extern struct nvc0_graph_init nvc0_grctx_init_mthd_magic[];
+extern struct nvc0_graph_init nve4_grctx_init_a097[];
 
 void nvc1_grctx_generate_mods(struct nvc0_graph_priv *, struct nvc0_grctx *);
 void nvc1_grctx_generate_unkn(struct nvc0_graph_priv *);
@@ -277,6 +284,8 @@ extern struct nvc0_graph_init nvf0_grctx_init_unk60xx[];
 
 extern struct nouveau_oclass *nv108_grctx_oclass;
 
+extern struct nvc0_graph_init *nve4_graph_init_mmio[];
+
 #define mmio_data(s,a,p) do {                                                  \
 	info->buffer[info->buffer_nr] = round_up(info->addr, (a));             \
 	info->addr = info->buffer[info->buffer_nr++] + (s);                    \
diff --git a/drivers/gpu/drm/nouveau/core/engine/graph/nve4.c b/drivers/gpu/drm/nouveau/core/engine/graph/nve4.c
index 05ec09c88517..442857c5c120 100644
--- a/drivers/gpu/drm/nouveau/core/engine/graph/nve4.c
+++ b/drivers/gpu/drm/nouveau/core/engine/graph/nve4.c
@@ -298,7 +298,7 @@ nve4_graph_init(struct nouveau_object *object)
 	return nvc0_graph_init_ctxctl(priv);
 }
 
-static struct nvc0_graph_init *
+struct nvc0_graph_init *
 nve4_graph_init_mmio[] = {
 	nve4_graph_init_regs,
 	nvc0_graph_init_unk40xx,
diff --git a/drivers/gpu/drm/nouveau/core/engine/graph/nvea.c b/drivers/gpu/drm/nouveau/core/engine/graph/nvea.c
new file mode 100644
index 000000000000..d5e6a1adcacb
--- /dev/null
+++ b/drivers/gpu/drm/nouveau/core/engine/graph/nvea.c
@@ -0,0 +1,75 @@
+/*
+ * Copyright (c) 2014, NVIDIA CORPORATION. All rights reserved.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ */
+
+#include "nvc0.h"
+
+static struct nouveau_oclass
+nvea_graph_sclass[] = {
+	{ 0x902d, &nouveau_object_ofuncs },
+	{ 0xa040, &nouveau_object_ofuncs },
+	{ 0xa297, &nouveau_object_ofuncs },
+	{ 0xa0c0, &nouveau_object_ofuncs },
+	{}
+};
+
+static struct nvc0_graph_mthd
+nvea_grctx_init_mthd[] = {
+	{ 0xa297, nve4_grctx_init_a097, },
+	{ 0x902d, nvc0_grctx_init_902d, },
+	{ 0x902d, nvc0_grctx_init_mthd_magic, },
+	{}
+};
+
+struct nouveau_oclass *
+nvea_grctx_oclass = &(struct nvc0_grctx_oclass) {
+	.base.handle = NV_ENGCTX(GR, 0xea),
+	.base.ofuncs = &(struct nouveau_ofuncs) {
+		.ctor = nvc0_graph_context_ctor,
+		.dtor = nvc0_graph_context_dtor,
+		.init = _nouveau_graph_context_init,
+		.fini = _nouveau_graph_context_fini,
+		.rd32 = _nouveau_graph_context_rd32,
+		.wr32 = _nouveau_graph_context_wr32,
+	},
+	.main = nve4_grctx_generate_main,
+	.mods = nve4_grctx_generate_mods,
+	.unkn = nve4_grctx_generate_unkn,
+	.hub  = nve4_grctx_init_hub,
+	.gpc  = nve4_grctx_init_gpc,
+	.icmd = nve4_grctx_init_icmd,
+	.mthd = nvea_grctx_init_mthd,
+}.base;
+
+
+struct nouveau_oclass *
+nvea_graph_oclass = &(struct nvc0_graph_oclass) {
+	.base.handle = NV_ENGINE(GR, 0xea),
+	.base.ofuncs = &(struct nouveau_ofuncs) {
+		.ctor = nvc0_graph_ctor,
+		.dtor = nvc0_graph_dtor,
+		.init = nve4_graph_init,
+		.fini = _nouveau_graph_fini,
+	},
+	.cclass = &nvea_grctx_oclass,
+	.sclass = nvea_graph_sclass,
+	.mmio = nve4_graph_init_mmio,
+}.base;
diff --git a/drivers/gpu/drm/nouveau/core/include/engine/graph.h b/drivers/gpu/drm/nouveau/core/include/engine/graph.h
index 97705618de97..632d7e172172 100644
--- a/drivers/gpu/drm/nouveau/core/include/engine/graph.h
+++ b/drivers/gpu/drm/nouveau/core/include/engine/graph.h
@@ -68,6 +68,7 @@ extern struct nouveau_oclass *nvc8_graph_oclass;
 extern struct nouveau_oclass *nvd7_graph_oclass;
 extern struct nouveau_oclass *nvd9_graph_oclass;
 extern struct nouveau_oclass *nve4_graph_oclass;
+extern struct nouveau_oclass *nvea_graph_oclass;
 extern struct nouveau_oclass *nvf0_graph_oclass;
 extern struct nouveau_oclass *nv108_graph_oclass;
 
-- 
1.9.1

* [PATCH 11/12] drm/nouveau: support GK20A in nouveau_accel_init()
  2014-03-24  8:42 ` Alexandre Courbot
@ 2014-03-24  8:42     ` Alexandre Courbot
  -1 siblings, 0 replies; 100+ messages in thread
From: Alexandre Courbot @ 2014-03-24  8:42 UTC (permalink / raw)
  To: Ben Skeggs
  Cc: nouveau, linux-kernel, dri-devel, linux-tegra

GK20A does not embed a dedicated COPY engine and thus cannot allocate
the copy channel that nouveau_accel_init() attempts to create. It also
lacks any display hardware, so the creation of a software channel does
not apply either.

Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
---
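Note (illustration only, not part of the patch): the two special cases can be
summarised as a small decision function. The sketch below is a simplified
standalone model that only covers Kepler-class devices (card_type >= NV_E0 in
the driver); struct accel_plan and plan_accel_kepler() are hypothetical names,
and paths taken by older chips are not modelled.

```c
#include <stdbool.h>

#define CHIPSET_GK20A 0xea	/* chipset value tested in the patch */

/* Hypothetical summary of what nouveau_accel_init() sets up. */
struct accel_plan {
	bool copy_channel;	/* separate channel on CE0/CE1 */
	bool sw_object;		/* NVSW software object */
};

/* Assumes a Kepler-class device; older chips take paths not shown here. */
static struct accel_plan plan_accel_kepler(unsigned int chipset)
{
	struct accel_plan plan = { .copy_channel = true, .sw_object = true };

	if (chipset == CHIPSET_GK20A) {
		/* no CE0/CE1, and software object creation is skipped */
		plan.copy_channel = false;
		plan.sw_object = false;
	}
	return plan;
}
```

In both cases the main channel is still created on the GR engine; only the
extra copy channel and the software object differ.
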
 drivers/gpu/drm/nouveau/nouveau_drm.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/drivers/gpu/drm/nouveau/nouveau_drm.c b/drivers/gpu/drm/nouveau/nouveau_drm.c
index ef27949057c3..f2394e84eae6 100644
--- a/drivers/gpu/drm/nouveau/nouveau_drm.c
+++ b/drivers/gpu/drm/nouveau/nouveau_drm.c
@@ -171,6 +171,11 @@ nouveau_accel_init(struct nouveau_drm *drm)
 		return;
 	}
 
+	if (device->chipset == 0xea) {
+		/* gk20a does not have CE0/CE1 */
+		arg0 = NVE0_CHANNEL_IND_ENGINE_GR;
+		arg1 = 1;
+	} else
 	if (device->card_type >= NV_E0) {
 		ret = nouveau_channel_new(drm, &drm->client, NVDRM_DEVICE,
 					  NVDRM_CHAN + 1,
@@ -207,6 +212,10 @@ nouveau_accel_init(struct nouveau_drm *drm)
 		return;
 	}
 
+	/* Need to figure out how to handle sw for gk20a */
+	if (device->chipset == 0xea)
+		goto skip_sw_init;
+
 	ret = nouveau_object_new(nv_object(drm), NVDRM_CHAN, NVDRM_NVSW,
 				 nouveau_abi16_swclass(drm), NULL, 0, &object);
 	if (ret == 0) {
@@ -233,6 +242,7 @@ nouveau_accel_init(struct nouveau_drm *drm)
 		return;
 	}
 
+skip_sw_init:
 	if (device->card_type < NV_C0) {
 		ret = nouveau_gpuobj_new(drm->device, NULL, 32, 0, 0,
 					&drm->notify);
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 100+ messages in thread

* [PATCH 11/12] drm/nouveau: support GK20A in nouveau_accel_init()
@ 2014-03-24  8:42     ` Alexandre Courbot
  0 siblings, 0 replies; 100+ messages in thread
From: Alexandre Courbot @ 2014-03-24  8:42 UTC (permalink / raw)
  To: Ben Skeggs
  Cc: Thierry Reding, nouveau, dri-devel, linux-tegra, linux-kernel,
	gnurou, Alexandre Courbot

GK20A does not embed a dedicated COPY engine and thus cannot allocate
the copy channel that nouveau_accel_init() attempts to create. It also
lacks any display hardware, so the creation of a software channel does
not apply either.

Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
---
 drivers/gpu/drm/nouveau/nouveau_drm.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/drivers/gpu/drm/nouveau/nouveau_drm.c b/drivers/gpu/drm/nouveau/nouveau_drm.c
index ef27949057c3..f2394e84eae6 100644
--- a/drivers/gpu/drm/nouveau/nouveau_drm.c
+++ b/drivers/gpu/drm/nouveau/nouveau_drm.c
@@ -171,6 +171,11 @@ nouveau_accel_init(struct nouveau_drm *drm)
 		return;
 	}
 
+	if (device->chipset == 0xea) {
+		/* gk20a does not have CE0/CE1 */
+		arg0 = NVE0_CHANNEL_IND_ENGINE_GR;
+		arg1 = 1;
+	} else
 	if (device->card_type >= NV_E0) {
 		ret = nouveau_channel_new(drm, &drm->client, NVDRM_DEVICE,
 					  NVDRM_CHAN + 1,
@@ -207,6 +212,10 @@ nouveau_accel_init(struct nouveau_drm *drm)
 		return;
 	}
 
+	/* Need to figure out how to handle sw for gk20a */
+	if (device->chipset == 0xea)
+		goto skip_sw_init;
+
 	ret = nouveau_object_new(nv_object(drm), NVDRM_CHAN, NVDRM_NVSW,
 				 nouveau_abi16_swclass(drm), NULL, 0, &object);
 	if (ret == 0) {
@@ -233,6 +242,7 @@ nouveau_accel_init(struct nouveau_drm *drm)
 		return;
 	}
 
+skip_sw_init:
 	if (device->card_type < NV_C0) {
 		ret = nouveau_gpuobj_new(drm->device, NULL, 32, 0, 0,
 					&drm->notify);
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 100+ messages in thread

* [PATCH 12/12] drm/nouveau: support for probing GK20A
  2014-03-24  8:42 ` Alexandre Courbot
@ 2014-03-24  8:42     ` Alexandre Courbot
  -1 siblings, 0 replies; 100+ messages in thread
From: Alexandre Courbot @ 2014-03-24  8:42 UTC (permalink / raw)
  To: Ben Skeggs
  Cc: nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	linux-tegra-u79uwXL29TY76Z2rM5mHXA

Set the correct subdev/engine classes when GK20A (0xea) is probed.

Signed-off-by: Alexandre Courbot <acourbot-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
---
 drivers/gpu/drm/nouveau/core/engine/device/nve0.c | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/drivers/gpu/drm/nouveau/core/engine/device/nve0.c b/drivers/gpu/drm/nouveau/core/engine/device/nve0.c
index 987edbc30a09..8509dd57de1f 100644
--- a/drivers/gpu/drm/nouveau/core/engine/device/nve0.c
+++ b/drivers/gpu/drm/nouveau/core/engine/device/nve0.c
@@ -156,6 +156,26 @@ nve0_identify(struct nouveau_device *device)
 		device->oclass[NVDEV_ENGINE_PPP    ] = &nvc0_ppp_oclass;
 		device->oclass[NVDEV_ENGINE_PERFMON] = &nve0_perfmon_oclass;
 		break;
+	case 0xea:
+		device->cname = "GK20A";
+		device->oclass[NVDEV_SUBDEV_MC     ] =  nvc3_mc_oclass;
+		device->oclass[NVDEV_SUBDEV_BUS    ] =  nvc0_bus_oclass;
+		device->oclass[NVDEV_SUBDEV_TIMER  ] = &nv04_timer_oclass;
+		device->oclass[NVDEV_SUBDEV_FB     ] =  nvea_fb_oclass;
+		device->oclass[NVDEV_SUBDEV_IBUS   ] = &nvea_ibus_oclass;
+		device->oclass[NVDEV_SUBDEV_INSTMEM] = nv50_instmem_oclass;
+		device->oclass[NVDEV_SUBDEV_VM     ] = &nvc0_vmmgr_oclass;
+		device->oclass[NVDEV_SUBDEV_BAR    ] = &nvc0_bar_oclass;
+		device->oclass[NVDEV_ENGINE_DMAOBJ ] = &nvd0_dmaeng_oclass;
+		device->oclass[NVDEV_ENGINE_FIFO   ] =  nvea_fifo_oclass;
+		/* TODO will need an implementation for this at some point... */
+#if 0
+		device->oclass[NVDEV_ENGINE_SW     ] =  nvc0_software_oclass;
+#endif
+		device->oclass[NVDEV_ENGINE_GR     ] =  nvea_graph_oclass;
+		device->oclass[NVDEV_ENGINE_COPY2  ] = &nve0_copy2_oclass;
+		device->oclass[NVDEV_ENGINE_PERFMON] = &nve0_perfmon_oclass;
+		break;
 	case 0xf0:
 		device->cname = "GK110";
 		device->oclass[NVDEV_SUBDEV_VBIOS  ] = &nouveau_bios_oclass;
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 100+ messages in thread

* [PATCH 12/12] drm/nouveau: support for probing GK20A
@ 2014-03-24  8:42     ` Alexandre Courbot
  0 siblings, 0 replies; 100+ messages in thread
From: Alexandre Courbot @ 2014-03-24  8:42 UTC (permalink / raw)
  To: Ben Skeggs
  Cc: Thierry Reding, nouveau, dri-devel, linux-tegra, linux-kernel,
	gnurou, Alexandre Courbot

Set the correct subdev/engine classes when GK20A (0xea) is probed.

Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
---
 drivers/gpu/drm/nouveau/core/engine/device/nve0.c | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/drivers/gpu/drm/nouveau/core/engine/device/nve0.c b/drivers/gpu/drm/nouveau/core/engine/device/nve0.c
index 987edbc30a09..8509dd57de1f 100644
--- a/drivers/gpu/drm/nouveau/core/engine/device/nve0.c
+++ b/drivers/gpu/drm/nouveau/core/engine/device/nve0.c
@@ -156,6 +156,26 @@ nve0_identify(struct nouveau_device *device)
 		device->oclass[NVDEV_ENGINE_PPP    ] = &nvc0_ppp_oclass;
 		device->oclass[NVDEV_ENGINE_PERFMON] = &nve0_perfmon_oclass;
 		break;
+	case 0xea:
+		device->cname = "GK20A";
+		device->oclass[NVDEV_SUBDEV_MC     ] =  nvc3_mc_oclass;
+		device->oclass[NVDEV_SUBDEV_BUS    ] =  nvc0_bus_oclass;
+		device->oclass[NVDEV_SUBDEV_TIMER  ] = &nv04_timer_oclass;
+		device->oclass[NVDEV_SUBDEV_FB     ] =  nvea_fb_oclass;
+		device->oclass[NVDEV_SUBDEV_IBUS   ] = &nvea_ibus_oclass;
+		device->oclass[NVDEV_SUBDEV_INSTMEM] = nv50_instmem_oclass;
+		device->oclass[NVDEV_SUBDEV_VM     ] = &nvc0_vmmgr_oclass;
+		device->oclass[NVDEV_SUBDEV_BAR    ] = &nvc0_bar_oclass;
+		device->oclass[NVDEV_ENGINE_DMAOBJ ] = &nvd0_dmaeng_oclass;
+		device->oclass[NVDEV_ENGINE_FIFO   ] =  nvea_fifo_oclass;
+		/* TODO will need an implementation for this at some point... */
+#if 0
+		device->oclass[NVDEV_ENGINE_SW     ] =  nvc0_software_oclass;
+#endif
+		device->oclass[NVDEV_ENGINE_GR     ] =  nvea_graph_oclass;
+		device->oclass[NVDEV_ENGINE_COPY2  ] = &nve0_copy2_oclass;
+		device->oclass[NVDEV_ENGINE_PERFMON] = &nve0_perfmon_oclass;
+		break;
 	case 0xf0:
 		device->cname = "GK110";
 		device->oclass[NVDEV_SUBDEV_VBIOS  ] = &nouveau_bios_oclass;
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 100+ messages in thread

* Re: [PATCH 00/12] drm/nouveau: support for GK20A, cont'd
  2014-03-24  8:42 ` Alexandre Courbot
                   ` (4 preceding siblings ...)
  (?)
@ 2014-03-24 13:19 ` Lucas Stach
       [not found]   ` <1395667191.5062.18.camel-WzVe3FnzCwFR6QfukMTsflXZhhPuCNm+@public.gmane.org>
  -1 siblings, 1 reply; 100+ messages in thread
From: Lucas Stach @ 2014-03-24 13:19 UTC (permalink / raw)
  To: Alexandre Courbot
  Cc: Ben Skeggs, gnurou, nouveau, linux-kernel, dri-devel, linux-tegra

Hi Alexandre,

Am Montag, den 24.03.2014, 17:42 +0900 schrieb Alexandre Courbot:
> Hi everyone,
[...]
> 
> A few lines of hacks (not included here) are still needed to deal with cached
> mappings triggering external aborts and CPU/GPU memory coherency issues, but I
> hope to understand and address these issues next.

For the coherency issue part you may want to look at my Nouveau on ARM
series. Most of it never made it upstream, as I lacked the time to work
further on this, but it solves the coherency issue from the kernel.

It does so by performing the necessary manual cache flushes/invalidates
on buffer access, so it costs some performance. To avoid this you really
want to get write-combined mappings into the kernel<->userspace interface.
Simply mapping the pushbuf as WC/US has brought a 7% performance
increase in OpenArena when I last tested this. This test was done with
only one PCIe lane, so the perf increase may be even better with a more
adequate interconnect.

Regards,
Lucas
-- 
Pengutronix e.K.                           | Lucas Stach                 |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-5076 |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [PATCH 01/12] drm/nouveau: fix missing newline
  2014-03-24  8:42     ` Alexandre Courbot
@ 2014-03-24 21:49         ` Thierry Reding
  -1 siblings, 0 replies; 100+ messages in thread
From: Thierry Reding @ 2014-03-24 21:49 UTC (permalink / raw)
  To: Alexandre Courbot
  Cc: nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Ben Skeggs,
	linux-tegra-u79uwXL29TY76Z2rM5mHXA


[-- Attachment #1.1: Type: text/plain, Size: 405 bytes --]

On Mon, Mar 24, 2014 at 05:42:23PM +0900, Alexandre Courbot wrote:
> Add a missing newline at the end of a DRM_INFO message.
> 
> Signed-off-by: Alexandre Courbot <acourbot-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
> ---
>  drivers/gpu/drm/nouveau/nouveau_drm.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Reviewed-by: Thierry Reding <treding-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>

[-- Attachment #1.2: Type: application/pgp-signature, Size: 836 bytes --]

[-- Attachment #2: Type: text/plain, Size: 181 bytes --]

_______________________________________________
Nouveau mailing list
Nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org
http://lists.freedesktop.org/mailman/listinfo/nouveau

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [PATCH 01/12] drm/nouveau: fix missing newline
@ 2014-03-24 21:49         ` Thierry Reding
  0 siblings, 0 replies; 100+ messages in thread
From: Thierry Reding @ 2014-03-24 21:49 UTC (permalink / raw)
  To: Alexandre Courbot
  Cc: Ben Skeggs, nouveau, dri-devel, linux-tegra, linux-kernel, gnurou

[-- Attachment #1: Type: text/plain, Size: 347 bytes --]

On Mon, Mar 24, 2014 at 05:42:23PM +0900, Alexandre Courbot wrote:
> Add a missing newline at the end of a DRM_INFO message.
> 
> Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
> ---
>  drivers/gpu/drm/nouveau/nouveau_drm.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Reviewed-by: Thierry Reding <treding@nvidia.com>

[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [PATCH 02/12] drm/nouveau/timer: skip calibration on GK20A
  2014-03-24  8:42     ` Alexandre Courbot
@ 2014-03-24 21:54         ` Thierry Reding
  -1 siblings, 0 replies; 100+ messages in thread
From: Thierry Reding @ 2014-03-24 21:54 UTC (permalink / raw)
  To: Alexandre Courbot
  Cc: nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Ben Skeggs,
	linux-tegra-u79uwXL29TY76Z2rM5mHXA


[-- Attachment #1.1: Type: text/plain, Size: 1368 bytes --]

On Mon, Mar 24, 2014 at 05:42:24PM +0900, Alexandre Courbot wrote:
> GK20A's timer is directly attached to the system timer and cannot be
> calibrated. Skip the calibration phase on that chip since the
> corresponding registers do not exist.
> 
> Signed-off-by: Alexandre Courbot <acourbot-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
> ---
>  drivers/gpu/drm/nouveau/core/subdev/timer/nv04.c | 19 +++++++++++++------
>  1 file changed, 13 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/gpu/drm/nouveau/core/subdev/timer/nv04.c b/drivers/gpu/drm/nouveau/core/subdev/timer/nv04.c
> index c0bdd10358d7..822fe0d8a871 100644
> --- a/drivers/gpu/drm/nouveau/core/subdev/timer/nv04.c
> +++ b/drivers/gpu/drm/nouveau/core/subdev/timer/nv04.c
> @@ -185,6 +185,10 @@ nv04_timer_init(struct nouveau_object *object)
>  	if (ret)
>  		return ret;
>  
> +	/* gk20a does not have the calibration registers */
> +	if (device->chipset == 0xea)
> +		goto skip_clk_init;

I'm concerned that this won't scale in the future. Perhaps a better
solution would be to add a "flags" or "features" field to struct
nouveau_device along with feature bits such as HAS_CALIBRATION or
similar.

That way we don't have to touch this code for every new Tegra chip,
unless there's a reason to expect things to change in newer
generations.
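
To make the suggestion concrete, here is a minimal userspace sketch of
the feature-bit approach. Everything prefixed with example_ or EXAMPLE_
is hypothetical and is not part of Nouveau; struct example_device only
stands in for struct nouveau_device, and the only fact taken from this
series is that chipset 0xea (GK20A) lacks the calibration registers and
BAR3.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical feature bits; these names do not exist in Nouveau. */
enum example_feature {
	EXAMPLE_HAS_TIMER_CALIBRATION = 1u << 0,
	EXAMPLE_HAS_BAR3              = 1u << 1,
};

/* Stand-in for struct nouveau_device, reduced to what the sketch needs. */
struct example_device {
	uint8_t  chipset;
	uint32_t features;
};

/* Decide the feature set once, at identify time. */
static void example_identify(struct example_device *dev, uint8_t chipset)
{
	assert(dev != NULL);

	dev->chipset = chipset;
	dev->features = EXAMPLE_HAS_TIMER_CALIBRATION | EXAMPLE_HAS_BAR3;

	/* 0xea is GK20A in this series: no calibration registers, no BAR3. */
	if (chipset == 0xea)
		dev->features = 0;
}

static bool example_has_feature(const struct example_device *dev,
				enum example_feature feature)
{
	return (dev->features & feature) != 0;
}

/* Returns true if the calibration step would run for this device. */
static bool example_timer_init(const struct example_device *dev)
{
	/* Equivalent of "goto skip_clk_init", keyed on a feature bit. */
	if (!example_has_feature(dev, EXAMPLE_HAS_TIMER_CALIBRATION))
		return false;

	/* ... the calibration registers would be programmed here ... */
	return true;
}
```

With this shape, the chipset comparison lives in one place (the identify
step), and init code such as the timer only asks whether a capability is
present.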

Thierry

[-- Attachment #1.2: Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [PATCH 02/12] drm/nouveau/timer: skip calibration on GK20A
@ 2014-03-24 21:54         ` Thierry Reding
  0 siblings, 0 replies; 100+ messages in thread
From: Thierry Reding @ 2014-03-24 21:54 UTC (permalink / raw)
  To: Alexandre Courbot
  Cc: Ben Skeggs, nouveau, dri-devel, linux-tegra, linux-kernel, gnurou

[-- Attachment #1: Type: text/plain, Size: 1339 bytes --]

On Mon, Mar 24, 2014 at 05:42:24PM +0900, Alexandre Courbot wrote:
> GK20A's timer is directly attached to the system timer and cannot be
> calibrated. Skip the calibration phase on that chip since the
> corresponding registers do not exist.
> 
> Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
> ---
>  drivers/gpu/drm/nouveau/core/subdev/timer/nv04.c | 19 +++++++++++++------
>  1 file changed, 13 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/gpu/drm/nouveau/core/subdev/timer/nv04.c b/drivers/gpu/drm/nouveau/core/subdev/timer/nv04.c
> index c0bdd10358d7..822fe0d8a871 100644
> --- a/drivers/gpu/drm/nouveau/core/subdev/timer/nv04.c
> +++ b/drivers/gpu/drm/nouveau/core/subdev/timer/nv04.c
> @@ -185,6 +185,10 @@ nv04_timer_init(struct nouveau_object *object)
>  	if (ret)
>  		return ret;
>  
> +	/* gk20a does not have the calibration registers */
> +	if (device->chipset == 0xea)
> +		goto skip_clk_init;

I'm concerned that this won't scale in the future. Perhaps a better
solution would be to add a "flags" or "features" field to struct
nouveau_device along with feature bits such as HAS_CALIBRATION or
similar.

That way we don't have to touch this code for every new Tegra chip,
unless there's a reason to expect things to change in newer
generations.

Thierry

[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [PATCH 04/12] drm/nouveau/bar/nvc0: support chips without BAR3
  2014-03-24  8:42   ` Alexandre Courbot
@ 2014-03-24 22:10       ` Thierry Reding
  -1 siblings, 0 replies; 100+ messages in thread
From: Thierry Reding @ 2014-03-24 22:10 UTC (permalink / raw)
  To: Alexandre Courbot
  Cc: nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Ben Skeggs,
	linux-tegra-u79uwXL29TY76Z2rM5mHXA


[-- Attachment #1.1: Type: text/plain, Size: 1910 bytes --]

On Mon, Mar 24, 2014 at 05:42:26PM +0900, Alexandre Courbot wrote:
[...]
> diff --git a/drivers/gpu/drm/nouveau/core/subdev/bar/nvc0.c b/drivers/gpu/drm/nouveau/core/subdev/bar/nvc0.c
[...]
>  static int
> -nvc0_bar_ctor(struct nouveau_object *parent, struct nouveau_object *engine,
> -	      struct nouveau_oclass *oclass, void *data, u32 size,
> -	      struct nouveau_object **pobject)
> +nvc0_bar_init_vm(struct nvc0_bar_priv *priv, int nr, int bar)
>  {
[...]
> -	/* BAR3 */
>  	ret = nouveau_gpuobj_new(nv_object(priv), NULL, 0x1000, 0, 0,
> -				&priv->bar[0].mem);
> -	mem = priv->bar[0].mem;
> +				&priv->bar[nr].mem);
> +	mem = priv->bar[nr].mem;
>  	if (ret)
>  		return ret;
>  
>  	ret = nouveau_gpuobj_new(nv_object(priv), NULL, 0x8000, 0, 0,
> -				&priv->bar[0].pgd);
> +				&priv->bar[nr].pgd);
>  	if (ret)
>  		return ret;
[...]
> +static int
> +nvc0_bar_ctor(struct nouveau_object *parent, struct nouveau_object *engine,
> +	      struct nouveau_oclass *oclass, void *data, u32 size,
> +	      struct nouveau_object **pobject)
> +{
[...]
> +	/* BAR3 */
> +	if (has_bar3) {
> +		ret = nvc0_bar_init_vm(priv, 0, 3);
[...]
> +	/* BAR1 */
> +	ret = nvc0_bar_init_vm(priv, 1, 1);
>  	if (ret)
>  		return ret;

The calls to nvc0_bar_init_vm() are somewhat confusing in my opinion. It
is hard to see from the invocation what these numbers mean, and therefore
to distinguish which parameter is which.

Perhaps a slightly more readable way would be to pass in a pointer to a
structure as the second parameter instead of the index into an array. So
it'd look somewhat like this:

	if (has_bar3) {
		ret = nvc0_bar_init_vm(priv, &priv->bar[0], 3);
		...
	}
	...
	ret = nvc0_bar_init_vm(priv, &priv->bar[1], 1);
	...

Unfortunately that would require a new type to be created for the bar[]
structures, so it'd be slightly more intrusive.
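
A minimal userspace sketch of what that could look like follows. The
type name struct example_bar_vm and all other example_ names are
hypothetical; the real fields (mem, pgd and so on) are replaced by two
integers so that the sketch stays self-contained.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical named type for one BAR's VM state. The fields here are
 * placeholders, not the members used by the driver.
 */
struct example_bar_vm {
	int bar;	/* hardware BAR number this entry backs */
	int ready;	/* set once the (pretend) VM has been set up */
};

struct example_bar_priv {
	struct example_bar_vm bar[2];
};

/* Takes the entry to fill in, plus the hardware BAR number. */
static int example_bar_init_vm(struct example_bar_priv *priv,
			       struct example_bar_vm *vm, int bar)
{
	assert(priv != NULL && vm != NULL);

	vm->bar = bar;
	vm->ready = 1;
	return 0;
}

static int example_bar_ctor(struct example_bar_priv *priv, int has_bar3)
{
	int ret;

	memset(priv, 0, sizeof(*priv));

	/* BAR3, only when the chip exposes it */
	if (has_bar3) {
		ret = example_bar_init_vm(priv, &priv->bar[0], 3);
		if (ret)
			return ret;
	}

	/* BAR1 */
	return example_bar_init_vm(priv, &priv->bar[1], 1);
}
```

At the call site the second argument now names the slot being
initialized, so the trailing integer can only be read as the BAR number.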

Thierry

[-- Attachment #1.2: Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [PATCH 04/12] drm/nouveau/bar/nvc0: support chips without BAR3
@ 2014-03-24 22:10       ` Thierry Reding
  0 siblings, 0 replies; 100+ messages in thread
From: Thierry Reding @ 2014-03-24 22:10 UTC (permalink / raw)
  To: Alexandre Courbot
  Cc: Ben Skeggs, nouveau, dri-devel, linux-tegra, linux-kernel, gnurou

[-- Attachment #1: Type: text/plain, Size: 1910 bytes --]

On Mon, Mar 24, 2014 at 05:42:26PM +0900, Alexandre Courbot wrote:
[...]
> diff --git a/drivers/gpu/drm/nouveau/core/subdev/bar/nvc0.c b/drivers/gpu/drm/nouveau/core/subdev/bar/nvc0.c
[...]
>  static int
> -nvc0_bar_ctor(struct nouveau_object *parent, struct nouveau_object *engine,
> -	      struct nouveau_oclass *oclass, void *data, u32 size,
> -	      struct nouveau_object **pobject)
> +nvc0_bar_init_vm(struct nvc0_bar_priv *priv, int nr, int bar)
>  {
[...]
> -	/* BAR3 */
>  	ret = nouveau_gpuobj_new(nv_object(priv), NULL, 0x1000, 0, 0,
> -				&priv->bar[0].mem);
> -	mem = priv->bar[0].mem;
> +				&priv->bar[nr].mem);
> +	mem = priv->bar[nr].mem;
>  	if (ret)
>  		return ret;
>  
>  	ret = nouveau_gpuobj_new(nv_object(priv), NULL, 0x8000, 0, 0,
> -				&priv->bar[0].pgd);
> +				&priv->bar[nr].pgd);
>  	if (ret)
>  		return ret;
[...]
> +static int
> +nvc0_bar_ctor(struct nouveau_object *parent, struct nouveau_object *engine,
> +	      struct nouveau_oclass *oclass, void *data, u32 size,
> +	      struct nouveau_object **pobject)
> +{
[...]
> +	/* BAR3 */
> +	if (has_bar3) {
> +		ret = nvc0_bar_init_vm(priv, 0, 3);
[...]
> +	/* BAR1 */
> +	ret = nvc0_bar_init_vm(priv, 1, 1);
>  	if (ret)
>  		return ret;

The calls to nvc0_bar_init_vm() are somewhat confusing in my opinion. It
is hard to see from the invocation what these numbers mean, and therefore
to distinguish which parameter is which.

Perhaps a slightly more readable way would be to pass in a pointer to a
structure as the second parameter instead of the index into an array. So
it'd look somewhat like this:

	if (has_bar3) {
		ret = nvc0_bar_init_vm(priv, &priv->bar[0], 3);
		...
	}
	...
	ret = nvc0_bar_init_vm(priv, &priv->bar[1], 1);
	...

Unfortunately that would require a new type to be created for the bar[]
structures, so it'd be slightly more intrusive.

Thierry

[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [PATCH 03/12] drm/nouveau/bar: only ioremap BAR3 if it exists
  2014-03-24  8:42   ` Alexandre Courbot
@ 2014-03-24 22:13       ` Thierry Reding
  -1 siblings, 0 replies; 100+ messages in thread
From: Thierry Reding @ 2014-03-24 22:13 UTC (permalink / raw)
  To: Alexandre Courbot
  Cc: nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Ben Skeggs,
	linux-tegra-u79uwXL29TY76Z2rM5mHXA


[-- Attachment #1.1: Type: text/plain, Size: 1229 bytes --]

On Mon, Mar 24, 2014 at 05:42:25PM +0900, Alexandre Courbot wrote:
> Some chips that use system memory exclusively (e.g. GK20A) do not
> expose 2 BAR regions. For them only BAR1 exists, and it should be used
> for USERD mapping. Do not map BAR3 if its resource does not exist.
> 
> Signed-off-by: Alexandre Courbot <acourbot-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
> ---
>  drivers/gpu/drm/nouveau/core/subdev/bar/base.c | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/nouveau/core/subdev/bar/base.c b/drivers/gpu/drm/nouveau/core/subdev/bar/base.c
> index bdf594116f3f..d713eeb75b13 100644
> --- a/drivers/gpu/drm/nouveau/core/subdev/bar/base.c
> +++ b/drivers/gpu/drm/nouveau/core/subdev/bar/base.c
> @@ -110,6 +110,7 @@ nouveau_bar_create_(struct nouveau_object *parent,
>  {
>  	struct nouveau_device *device = nv_device(parent);
>  	struct nouveau_bar *bar;
> +	bool has_bar3 = nv_device_resource_len(device, 3) != 0;

I don't think this variable is really necessary since you only use the
expression once anyway, but I don't feel very strongly about it, so
either way:

Reviewed-by: Thierry Reding <treding-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>

[-- Attachment #1.2: Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [PATCH 03/12] drm/nouveau/bar: only ioremap BAR3 if it exists
@ 2014-03-24 22:13       ` Thierry Reding
  0 siblings, 0 replies; 100+ messages in thread
From: Thierry Reding @ 2014-03-24 22:13 UTC (permalink / raw)
  To: Alexandre Courbot
  Cc: Ben Skeggs, nouveau, dri-devel, linux-tegra, linux-kernel, gnurou

[-- Attachment #1: Type: text/plain, Size: 1171 bytes --]

On Mon, Mar 24, 2014 at 05:42:25PM +0900, Alexandre Courbot wrote:
> Some chips that use system memory exclusively (e.g. GK20A) do not
> expose 2 BAR regions. For them only BAR1 exists, and it should be used
> for USERD mapping. Do not map BAR3 if its resource does not exist.
> 
> Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
> ---
>  drivers/gpu/drm/nouveau/core/subdev/bar/base.c | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/nouveau/core/subdev/bar/base.c b/drivers/gpu/drm/nouveau/core/subdev/bar/base.c
> index bdf594116f3f..d713eeb75b13 100644
> --- a/drivers/gpu/drm/nouveau/core/subdev/bar/base.c
> +++ b/drivers/gpu/drm/nouveau/core/subdev/bar/base.c
> @@ -110,6 +110,7 @@ nouveau_bar_create_(struct nouveau_object *parent,
>  {
>  	struct nouveau_device *device = nv_device(parent);
>  	struct nouveau_bar *bar;
> +	bool has_bar3 = nv_device_resource_len(device, 3) != 0;

I don't think this variable is really necessary since you only use the
expression once anyway, but I don't feel very strongly about it, so
either way:

Reviewed-by: Thierry Reding <treding@nvidia.com>

[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [PATCH 05/12] drm/nouveau/fifo: add GK20A support
  2014-03-24  8:42   ` Alexandre Courbot
@ 2014-03-24 22:14       ` Thierry Reding
  -1 siblings, 0 replies; 100+ messages in thread
From: Thierry Reding @ 2014-03-24 22:14 UTC (permalink / raw)
  To: Alexandre Courbot
  Cc: nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Ben Skeggs,
	linux-tegra-u79uwXL29TY76Z2rM5mHXA


[-- Attachment #1.1: Type: text/plain, Size: 736 bytes --]

On Mon, Mar 24, 2014 at 05:42:27PM +0900, Alexandre Courbot wrote:
> GK20A's FIFO is compatible with NVE0, but only features 128 channels and
> 1 runlist.
> 
> Signed-off-by: Alexandre Courbot <acourbot-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
> ---
>  drivers/gpu/drm/nouveau/Makefile                   |  1 +
>  drivers/gpu/drm/nouveau/core/engine/fifo/nve0.h    |  1 +
>  drivers/gpu/drm/nouveau/core/engine/fifo/nvea.c    | 35 ++++++++++++++++++++++
>  drivers/gpu/drm/nouveau/core/include/engine/fifo.h |  1 +
>  4 files changed, 38 insertions(+)
>  create mode 100644 drivers/gpu/drm/nouveau/core/engine/fifo/nvea.c

Looks good to me:

Reviewed-by: Thierry Reding <treding-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>

[-- Attachment #1.2: Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [PATCH 05/12] drm/nouveau/fifo: add GK20A support
@ 2014-03-24 22:14       ` Thierry Reding
  0 siblings, 0 replies; 100+ messages in thread
From: Thierry Reding @ 2014-03-24 22:14 UTC (permalink / raw)
  To: Alexandre Courbot
  Cc: Ben Skeggs, nouveau, dri-devel, linux-tegra, linux-kernel, gnurou

[-- Attachment #1: Type: text/plain, Size: 678 bytes --]

On Mon, Mar 24, 2014 at 05:42:27PM +0900, Alexandre Courbot wrote:
> GK20A's FIFO is compatible with NVE0, but only features 128 channels and
> 1 runlist.
> 
> Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
> ---
>  drivers/gpu/drm/nouveau/Makefile                   |  1 +
>  drivers/gpu/drm/nouveau/core/engine/fifo/nve0.h    |  1 +
>  drivers/gpu/drm/nouveau/core/engine/fifo/nvea.c    | 35 ++++++++++++++++++++++
>  drivers/gpu/drm/nouveau/core/include/engine/fifo.h |  1 +
>  4 files changed, 38 insertions(+)
>  create mode 100644 drivers/gpu/drm/nouveau/core/engine/fifo/nvea.c

Looks good to me:

Reviewed-by: Thierry Reding <treding@nvidia.com>

[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [PATCH 06/12] drm/nouveau/ibus: add GK20A support
  2014-03-24  8:42     ` Alexandre Courbot
@ 2014-03-24 22:34         ` Thierry Reding
  -1 siblings, 0 replies; 100+ messages in thread
From: Thierry Reding @ 2014-03-24 22:34 UTC (permalink / raw)
  To: Alexandre Courbot
  Cc: nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Ben Skeggs,
	linux-tegra-u79uwXL29TY76Z2rM5mHXA


[-- Attachment #1.1: Type: text/plain, Size: 2045 bytes --]

On Mon, Mar 24, 2014 at 05:42:28PM +0900, Alexandre Courbot wrote:
[...]
> diff --git a/drivers/gpu/drm/nouveau/core/subdev/ibus/nvea.c b/drivers/gpu/drm/nouveau/core/subdev/ibus/nvea.c
[...]
> +#include <subdev/ibus.h>
> +
> +struct nvea_ibus_priv {
> +	struct nouveau_ibus base;
> +};
> +
> +static void
> +nvea_ibus_init_priv_ring(struct nvea_ibus_priv *priv)
> +{
> +	nv_mask(priv, 0x137250, 0x3f, 0);
> +
> +	nv_mask(priv, 0x000200, 0x20, 0);
> +	udelay(20);

usleep_range()?

> +static void
> +nvea_ibus_intr(struct nouveau_subdev *subdev)
> +{
[...]
> +	/* Acknowledge interrupt */
> +	nv_mask(priv, 0x12004c, 0x2, 0x2);
> +
> +	while (--retry >= 0) {
> +		command = nv_rd32(priv, 0x12004c) & 0x3f;
> +		if (command == 0)
> +			break;
> +	}
> +
> +	if (retry < 0)
> +		nv_warn(priv, "timeout waiting for ringmaster ack\n");
> +}

Perhaps I'm being paranoid, but this loop now depends on the frequency
of the various clocks involved and therefore might break at some point
if the frequencies get sufficiently high.

So a slightly safer implementation would use a proper timeout using a
combination of msecs_to_jiffies(), time_before() and usleep_range(),
like so:

	timeout = jiffies + msecs_to_jiffies(...);

	while (time_before(jiffies, timeout)) {
		command = nv_rd32(...) & 0x3f;
		if (command == 0)
			break;

		usleep_range(...);
	}

	if (command != 0)
		nv_warn(...);

This assumes that there's some known timeout after which the ringmaster
is expected to have acked the interrupt. On that note, I wonder if the
warning is accurate here: it's my understanding that writing 0x2 to the
register does acknowledge the interrupt, so the ringmaster does in fact
"clear" it rather than "acknowledge" it, doesn't it?

Although now that I mention it, I seem to remember that this write is
actually sending a command to the ringmaster, and perhaps waiting for
the register to return to 0 is indeed waiting for an ACK of sorts. Maybe
adding a comment describing what this sequence does would be
appropriate here?
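
For illustration, here is a userspace sketch of a deadline-based poll.
It is only an analogue: clock_gettime() and nanosleep() stand in for
jiffies and usleep_range(), the example_ names are hypothetical, and the
50 microsecond pause is an arbitrary choice. Note also that
usleep_range() may only be called from a context that is allowed to
sleep; this thread does not establish whether nvea_ibus_intr() is such a
context, so a busy-wait with udelay() might be needed there instead.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical register-read callback; stands in for nv_rd32(). */
typedef uint32_t (*example_read_fn)(void *ctx);

static int64_t example_now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/*
 * Poll until (read() & mask) == 0 or timeout_ns has elapsed.
 * Returns true if the bits cleared. The result is derived from the
 * register value rather than from the clock, so a success seen right
 * at the deadline is not reported as a timeout.
 */
static bool example_wait_clear(example_read_fn read, void *ctx,
			       uint32_t mask, int64_t timeout_ns)
{
	const struct timespec pause = { .tv_sec = 0, .tv_nsec = 50000 };
	int64_t deadline;

	assert(read != NULL && timeout_ns >= 0);
	deadline = example_now_ns() + timeout_ns;

	for (;;) {
		if ((read(ctx) & mask) == 0)
			return true;
		if (example_now_ns() >= deadline)
			break;
		nanosleep(&pause, NULL);
	}

	/* One last read, in case the bits cleared while we were asleep. */
	return (read(ctx) & mask) == 0;
}

/* Fake register that reads as busy for the first *ctx polls. */
static uint32_t example_fake_reg(void *ctx)
{
	int *remaining = ctx;

	if (*remaining > 0) {
		(*remaining)--;
		return 0x2;
	}
	return 0;
}

/* Fake register that never clears. */
static uint32_t example_stuck_reg(void *ctx)
{
	(void)ctx;
	return 0x2;
}
```

The wait is bounded by elapsed time instead of by an iteration count, so
its duration does not change with CPU or bus clock frequencies.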

Thierry

[-- Attachment #1.2: Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [PATCH 06/12] drm/nouveau/ibus: add GK20A support
@ 2014-03-24 22:34         ` Thierry Reding
  0 siblings, 0 replies; 100+ messages in thread
From: Thierry Reding @ 2014-03-24 22:34 UTC (permalink / raw)
  To: Alexandre Courbot
  Cc: Ben Skeggs, nouveau, dri-devel, linux-tegra, linux-kernel, gnurou

[-- Attachment #1: Type: text/plain, Size: 2045 bytes --]

On Mon, Mar 24, 2014 at 05:42:28PM +0900, Alexandre Courbot wrote:
[...]
> diff --git a/drivers/gpu/drm/nouveau/core/subdev/ibus/nvea.c b/drivers/gpu/drm/nouveau/core/subdev/ibus/nvea.c
[...]
> +#include <subdev/ibus.h>
> +
> +struct nvea_ibus_priv {
> +	struct nouveau_ibus base;
> +};
> +
> +static void
> +nvea_ibus_init_priv_ring(struct nvea_ibus_priv *priv)
> +{
> +	nv_mask(priv, 0x137250, 0x3f, 0);
> +
> +	nv_mask(priv, 0x000200, 0x20, 0);
> +	udelay(20);

usleep_range()?

> +static void
> +nvea_ibus_intr(struct nouveau_subdev *subdev)
> +{
[...]
> +	/* Acknowledge interrupt */
> +	nv_mask(priv, 0x12004c, 0x2, 0x2);
> +
> +	while (--retry >= 0) {
> +		command = nv_rd32(priv, 0x12004c) & 0x3f;
> +		if (command == 0)
> +			break;
> +	}
> +
> +	if (retry < 0)
> +		nv_warn(priv, "timeout waiting for ringmaster ack\n");
> +}

Perhaps I'm being paranoid, but this loop now depends on the frequency
of the various clocks involved and therefore might break at some point
if the frequencies get sufficiently high.

So a slightly safer implementation would use a proper timeout using a
combination of msecs_to_jiffies(), time_before() and usleep_range(),
like so:

	timeout = jiffies + msecs_to_jiffies(...);

	while (time_before(jiffies, timeout)) {
		command = nv_rd32(...) & 0x3f;
		if (command == 0)
			break;

		usleep_range(...);
	}

	if (command != 0)
		nv_warn(...);

This assumes that there's some known timeout after which the ringmaster
is expected to have acked the interrupt. On that note, I wonder if the
warning is accurate here: it's my understanding that writing 0x2 to the
register does acknowledge the interrupt, so the ringmaster does in fact
"clear" it rather than "acknowledge" it, doesn't it?

Although now that I mention it, I seem to remember that this write is
actually sending a command to the ringmaster, and perhaps waiting for
the register to return to 0 is indeed waiting for an ACK of sorts. Maybe
adding a comment describing what this sequence does would be
appropriate here?

Thierry

[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [PATCH 08/12] drm/nouveau/graph: enable when using external firmware
  2014-03-24  8:42     ` Alexandre Courbot
@ 2014-03-24 22:58         ` Thierry Reding
  -1 siblings, 0 replies; 100+ messages in thread
From: Thierry Reding @ 2014-03-24 22:58 UTC (permalink / raw)
  To: Alexandre Courbot
  Cc: nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Ben Skeggs,
	linux-tegra-u79uwXL29TY76Z2rM5mHXA


[-- Attachment #1.1: Type: text/plain, Size: 1266 bytes --]

On Mon, Mar 24, 2014 at 05:42:30PM +0900, Alexandre Courbot wrote:
[...]
> diff --git a/drivers/gpu/drm/nouveau/core/engine/graph/nvc0.c b/drivers/gpu/drm/nouveau/core/engine/graph/nvc0.c
> index 6ef8bf181b2d..f997a18f5760 100644
> --- a/drivers/gpu/drm/nouveau/core/engine/graph/nvc0.c
> +++ b/drivers/gpu/drm/nouveau/core/engine/graph/nvc0.c
> @@ -1133,10 +1133,14 @@ nvc0_graph_ctor(struct nouveau_object *parent, struct nouveau_object *engine,
>  	struct nvc0_graph_oclass *oclass = (void *)bclass;
>  	struct nouveau_device *device = nv_device(parent);
>  	struct nvc0_graph_priv *priv;
> +	bool use_fw;

Perhaps "ext_fw" or "use_ext_fw" would be more accurate.

>  	int ret, i;
>  
> +	use_fw = nouveau_boolopt(device->cfgopt, "NvGrUseFW", false);
> +
>  	ret = nouveau_graph_create(parent, engine, bclass,
> -				   (oclass->fecs.ucode != NULL), &priv);
> +				   (oclass->fecs.ucode != NULL) || use_fw,
> +				   &priv);

Or perhaps a more intuitive way would be to name the variable "enable"
and have something like:

	if (!nouveau_boolopt(device->cfgopt, "NvGrUseFW", false))
		enable = oclass->fecs.ucode != NULL;
	else
		enable = true;

	ret = nouveau_graph_create(parent, engine, bclass, enable, &priv);
	...
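
As a sanity check, the "||" form from the patch and the branch form above should compute the same value for every input. A small stand-alone sketch, assuming hypothetical helper names and a plain pointer in place of oclass->fecs.ucode:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Form used in the patch: enable when built-in microcode exists or
 * external firmware was requested. */
static bool enable_or_form(const void *ucode, bool use_ext_fw)
{
	return (ucode != NULL) || use_ext_fw;
}

/* Branch form from the review: only consult the built-in microcode
 * when external firmware was not requested. */
static bool enable_branch_form(const void *ucode, bool use_ext_fw)
{
	bool enable;

	if (!use_ext_fw)
		enable = ucode != NULL;
	else
		enable = true;

	return enable;
}

/* Returns true if both forms agree on all four input combinations. */
static bool enable_forms_agree(void)
{
	static const int dummy_ucode = 0;
	const void *ucodes[] = { NULL, &dummy_ucode };
	size_t u;
	int fw;

	for (u = 0; u < 2; u++)
		for (fw = 0; fw <= 1; fw++)
			if (enable_or_form(ucodes[u], fw != 0) !=
			    enable_branch_form(ucodes[u], fw != 0))
				return false;
	return true;
}
```

Since the two forms are equivalent, the choice between them is purely about readability.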

Thierry

[-- Attachment #1.2: Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [PATCH 08/12] drm/nouveau/graph: enable when using external firmware
@ 2014-03-24 22:58         ` Thierry Reding
  0 siblings, 0 replies; 100+ messages in thread
From: Thierry Reding @ 2014-03-24 22:58 UTC (permalink / raw)
  To: Alexandre Courbot
  Cc: Ben Skeggs, nouveau, dri-devel, linux-tegra, linux-kernel, gnurou

[-- Attachment #1: Type: text/plain, Size: 1266 bytes --]

On Mon, Mar 24, 2014 at 05:42:30PM +0900, Alexandre Courbot wrote:
[...]
> diff --git a/drivers/gpu/drm/nouveau/core/engine/graph/nvc0.c b/drivers/gpu/drm/nouveau/core/engine/graph/nvc0.c
> index 6ef8bf181b2d..f997a18f5760 100644
> --- a/drivers/gpu/drm/nouveau/core/engine/graph/nvc0.c
> +++ b/drivers/gpu/drm/nouveau/core/engine/graph/nvc0.c
> @@ -1133,10 +1133,14 @@ nvc0_graph_ctor(struct nouveau_object *parent, struct nouveau_object *engine,
>  	struct nvc0_graph_oclass *oclass = (void *)bclass;
>  	struct nouveau_device *device = nv_device(parent);
>  	struct nvc0_graph_priv *priv;
> +	bool use_fw;

Perhaps "ext_fw" or "use_ext_fw" would be more accurate.

>  	int ret, i;
>  
> +	use_fw = nouveau_boolopt(device->cfgopt, "NvGrUseFW", false);
> +
>  	ret = nouveau_graph_create(parent, engine, bclass,
> -				   (oclass->fecs.ucode != NULL), &priv);
> +				   (oclass->fecs.ucode != NULL) || use_fw,
> +				   &priv);

Or perhaps a more intuitive way would be to name the variable "enable"
and have something like:

	if (!nouveau_boolopt(device->cfgopt, "NvGrUseFW", false))
		enable = oclass->fecs.ucode != NULL;
	else
		enable = true;

	ret = nouveau_graph_create(parent, engine, bclass, enable, &priv);
	...

Thierry

[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [PATCH 09/12] drm/nouveau/graph: pad firmware code at load time
  2014-03-24  8:42     ` Alexandre Courbot
@ 2014-03-24 23:02         ` Thierry Reding
  -1 siblings, 0 replies; 100+ messages in thread
From: Thierry Reding @ 2014-03-24 23:02 UTC (permalink / raw)
  To: Alexandre Courbot
  Cc: nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Ben Skeggs,
	linux-tegra-u79uwXL29TY76Z2rM5mHXA


[-- Attachment #1.1: Type: text/plain, Size: 559 bytes --]

On Mon, Mar 24, 2014 at 05:42:31PM +0900, Alexandre Courbot wrote:
> Pad the microcode to a multiple of 0x40, otherwise firmware will fail to
> run from non-prepadded firmware files.

Perhaps this (and the comment in the code) should mention a unit. In
this case it's 0x40 words. Also, I think a decimal number would be
easier to read here and in the comment. Furthermore, perhaps choosing a
byte as the unit would be even more intuitive.
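
To make the units concrete: 0x40 words is 64 32-bit words, i.e. 256 bytes. A stand-alone sketch, assuming hypothetical names (PAD_WORDS, PAD_BYTES, padded_words(), padded_bytes()) and a firmware size that is a multiple of 4 bytes:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAD_WORDS 0x40u					/* 64 words */
#define PAD_BYTES (PAD_WORDS * sizeof(uint32_t))	/* 256 bytes */

/* Number of 32-bit words after rounding up to a multiple of 0x40,
 * using the same expression as the patch. */
static size_t padded_words(size_t size_bytes)
{
	size_t words = size_bytes / 4;

	assert(size_bytes % 4 == 0);	/* assumption: whole words only */
	return (words + 0x3f) & ~(size_t)0x3f;
}

/* The same quantity expressed in bytes. */
static size_t padded_bytes(size_t size_bytes)
{
	return padded_words(size_bytes) * sizeof(uint32_t);
}
```

For example, a 260-byte image (65 words) is padded to 128 words, which is 512 bytes.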

But either way the code is correct, so:

Reviewed-by: Thierry Reding <treding-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>

[-- Attachment #1.2: Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [PATCH 09/12] drm/nouveau/graph: pad firmware code at load time
@ 2014-03-24 23:02         ` Thierry Reding
  0 siblings, 0 replies; 100+ messages in thread
From: Thierry Reding @ 2014-03-24 23:02 UTC (permalink / raw)
  To: Alexandre Courbot
  Cc: Ben Skeggs, nouveau, dri-devel, linux-tegra, linux-kernel, gnurou

[-- Attachment #1: Type: text/plain, Size: 530 bytes --]

On Mon, Mar 24, 2014 at 05:42:31PM +0900, Alexandre Courbot wrote:
> Pad the microcode to a multiple of 0x40, otherwise firmware will fail to
> run from non-prepadded firmware files.

Perhaps this (and the comment in the code) should mention a unit. In
this case it's 0x40 words. Also, I think a decimal number would be
easier to read here and in the comment. Furthermore, perhaps choosing a
byte as the unit would be even more intuitive.

But either way the code is correct, so:

Reviewed-by: Thierry Reding <treding@nvidia.com>

[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [PATCH 11/12] drm/nouveau: support GK20A in nouveau_accel_init()
  2014-03-24  8:42     ` Alexandre Courbot
@ 2014-03-24 23:10         ` Thierry Reding
  -1 siblings, 0 replies; 100+ messages in thread
From: Thierry Reding @ 2014-03-24 23:10 UTC (permalink / raw)
  To: Alexandre Courbot
  Cc: Ben Skeggs, nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	linux-tegra-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	gnurou-Re5JQEeQqe8AvxtiuMwx3w

[-- Attachment #1: Type: text/plain, Size: 1147 bytes --]

On Mon, Mar 24, 2014 at 05:42:33PM +0900, Alexandre Courbot wrote:
> GK20A does not embed a dedicated COPY engine and thus cannot allocate
> the copy channel that nouveau_accel_init() attempts to create. It also
> lacks any display hardware, so the creation of a software channel does
> not apply either.

Perhaps this should be two separate patches?

> diff --git a/drivers/gpu/drm/nouveau/nouveau_drm.c b/drivers/gpu/drm/nouveau/nouveau_drm.c
[...]
> +	if (device->chipset == 0xea) {
> +		/* gk20a does not have CE0/CE1 */

This would be another good candidate for a feature flag.

> +		arg0 = NVE0_CHANNEL_IND_ENGINE_GR;
> +		arg1 = 1;
> +	} else
>  	if (device->card_type >= NV_E0) {

The formatting here is somewhat weird. From a brief look I couldn't find
any indication that nouveau deviates from the standard coding style, so
this should be:

	} else if (...) {

> +	/* Need to figure out how to handle sw for gk20a */
> +	if (device->chipset == 0xea)
> +		goto skip_sw_init;

The commit message makes it sound like SW isn't needed since gk20a
"lacks any display hardware". In that case the comment here doesn't make
much sense.
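
To illustrate the feature-flag idea, here is a rough stand-alone sketch. All names (enum chip_feature, struct chip_info, chip_lookup(), chip_has()) are hypothetical and not part of nouveau; the 0xea entry follows what the commit messages in this series say about GK20A, and the 0xe4 entry is only an assumed example of a discrete chip.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-chip capability bits. */
enum chip_feature {
	CHIP_HAS_CALIBRATION = 1u << 0,	/* timer calibration registers */
	CHIP_HAS_COPY_ENGINE = 1u << 1,	/* dedicated CE0/CE1 */
	CHIP_HAS_DISPLAY     = 1u << 2,	/* display hardware */
};

struct chip_info {
	uint32_t chipset;
	uint32_t features;
};

/* Capabilities are declared once per chip at probe time. */
static const struct chip_info chip_table[] = {
	{ 0xe4, CHIP_HAS_CALIBRATION | CHIP_HAS_COPY_ENGINE | CHIP_HAS_DISPLAY },
	{ 0xea, 0 },	/* gk20a: none of the above, per this series */
};

/* Returns the table entry for a chipset, or NULL if it is unknown. */
static const struct chip_info *chip_lookup(uint32_t chipset)
{
	size_t i;

	for (i = 0; i < sizeof(chip_table) / sizeof(chip_table[0]); i++)
		if (chip_table[i].chipset == chipset)
			return &chip_table[i];
	return NULL;
}

/* Callers test a capability instead of comparing chipset IDs. */
static bool chip_has(const struct chip_info *chip, enum chip_feature f)
{
	assert(chip != NULL);
	return (chip->features & f) != 0;
}
```

With something along these lines, the check in nouveau_accel_init() would test a capability bit rather than "device->chipset == 0xea", and a new chip would only need a new table entry.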

Thierry

[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [PATCH 11/12] drm/nouveau: support GK20A in nouveau_accel_init()
@ 2014-03-24 23:10         ` Thierry Reding
  0 siblings, 0 replies; 100+ messages in thread
From: Thierry Reding @ 2014-03-24 23:10 UTC (permalink / raw)
  To: Alexandre Courbot
  Cc: Ben Skeggs, nouveau, dri-devel, linux-tegra, linux-kernel, gnurou

[-- Attachment #1: Type: text/plain, Size: 1147 bytes --]

On Mon, Mar 24, 2014 at 05:42:33PM +0900, Alexandre Courbot wrote:
> GK20A does not embed a dedicated COPY engine and thus cannot allocate
> the copy channel that nouveau_accel_init() attempts to create. It also
> lacks any display hardware, so the creation of a software channel does
> not apply either.

Perhaps this should be two separate patches?

> diff --git a/drivers/gpu/drm/nouveau/nouveau_drm.c b/drivers/gpu/drm/nouveau/nouveau_drm.c
[...]
> +	if (device->chipset == 0xea) {
> +		/* gk20a does not have CE0/CE1 */

This would be another good candidate for a feature flag.

> +		arg0 = NVE0_CHANNEL_IND_ENGINE_GR;
> +		arg1 = 1;
> +	} else
>  	if (device->card_type >= NV_E0) {

The formatting here is somewhat weird. From a brief look I couldn't find
any indication that nouveau deviates from the standard coding style, so
this should be:

	} else if (...) {

> +	/* Need to figure out how to handle sw for gk20a */
> +	if (device->chipset == 0xea)
> +		goto skip_sw_init;

The commit message makes it sound like SW isn't needed since gk20a
"lacks any display hardware". In that case the comment here doesn't make
much sense.

Thierry

[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [PATCH 12/12] drm/nouveau: support for probing GK20A
  2014-03-24  8:42     ` Alexandre Courbot
@ 2014-03-24 23:11         ` Thierry Reding
  -1 siblings, 0 replies; 100+ messages in thread
From: Thierry Reding @ 2014-03-24 23:11 UTC (permalink / raw)
  To: Alexandre Courbot
  Cc: nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Ben Skeggs,
	linux-tegra-u79uwXL29TY76Z2rM5mHXA


[-- Attachment #1.1: Type: text/plain, Size: 332 bytes --]

On Mon, Mar 24, 2014 at 05:42:34PM +0900, Alexandre Courbot wrote:
[...]
> diff --git a/drivers/gpu/drm/nouveau/core/engine/device/nve0.c b/drivers/gpu/drm/nouveau/core/engine/device/nve0.c
[...]
> +		/* TODO will need an implementation for this at some point... */

Do we? If so the commit message in 11/12 is misleading.

Thierry

[-- Attachment #1.2: Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [PATCH 12/12] drm/nouveau: support for probing GK20A
@ 2014-03-24 23:11         ` Thierry Reding
  0 siblings, 0 replies; 100+ messages in thread
From: Thierry Reding @ 2014-03-24 23:11 UTC (permalink / raw)
  To: Alexandre Courbot
  Cc: Ben Skeggs, nouveau, dri-devel, linux-tegra, linux-kernel, gnurou

[-- Attachment #1: Type: text/plain, Size: 332 bytes --]

On Mon, Mar 24, 2014 at 05:42:34PM +0900, Alexandre Courbot wrote:
[...]
> diff --git a/drivers/gpu/drm/nouveau/core/engine/device/nve0.c b/drivers/gpu/drm/nouveau/core/engine/device/nve0.c
[...]
> +		/* TODO will need an implementation for this at some point... */

Do we? If so the commit message in 11/12 is misleading.

Thierry

[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [PATCH 02/12] drm/nouveau/timer: skip calibration on GK20A
  2014-03-24 21:54         ` Thierry Reding
@ 2014-03-26  4:19           ` Ben Skeggs
  -1 siblings, 0 replies; 100+ messages in thread
From: Ben Skeggs @ 2014-03-26  4:19 UTC (permalink / raw)
  To: Thierry Reding
  Cc: nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Ben Skeggs,
	linux-tegra-u79uwXL29TY76Z2rM5mHXA

On Tue, Mar 25, 2014 at 7:54 AM, Thierry Reding
<thierry.reding-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> On Mon, Mar 24, 2014 at 05:42:24PM +0900, Alexandre Courbot wrote:
>> GK20A's timer is directly attached to the system timer and cannot be
>> calibrated. Skip the calibration phase on that chip since the
>> corresponding registers do not exist.
>>
>> Signed-off-by: Alexandre Courbot <acourbot-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
>> ---
>>  drivers/gpu/drm/nouveau/core/subdev/timer/nv04.c | 19 +++++++++++++------
>>  1 file changed, 13 insertions(+), 6 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/nouveau/core/subdev/timer/nv04.c b/drivers/gpu/drm/nouveau/core/subdev/timer/nv04.c
>> index c0bdd10358d7..822fe0d8a871 100644
>> --- a/drivers/gpu/drm/nouveau/core/subdev/timer/nv04.c
>> +++ b/drivers/gpu/drm/nouveau/core/subdev/timer/nv04.c
>> @@ -185,6 +185,10 @@ nv04_timer_init(struct nouveau_object *object)
>>       if (ret)
>>               return ret;
>>
>> +     /* gk20a does not have the calibration registers */
>> +     if (device->chipset == 0xea)
>> +             goto skip_clk_init;
>
> I'm concerned that this won't scale in the future. Perhaps a better
> solution would be to add a "flags" or "features" field to struct
> nouveau_device along with feature bits such as HAS_CALIBRATION or
> similar.
>
> That way we don't have to touch this code for every new future Tegra
> chip. Unless perhaps if there's a reason to expect things to change in
> newer generations.
I've already handled this in a slightly different way in the tree I'd
previously pointed Alex at (I think!), as I needed to do the same for
GM107.

Should just be able to use that implementation (so, just change the
probe patch) here too.

Thanks,
Ben.

>
> Thierry
>

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [PATCH 02/12] drm/nouveau/timer: skip calibration on GK20A
@ 2014-03-26  4:19           ` Ben Skeggs
  0 siblings, 0 replies; 100+ messages in thread
From: Ben Skeggs @ 2014-03-26  4:19 UTC (permalink / raw)
  To: Thierry Reding
  Cc: Alexandre Courbot, Alexandre Courbot, nouveau, linux-kernel,
	dri-devel, Ben Skeggs, linux-tegra

On Tue, Mar 25, 2014 at 7:54 AM, Thierry Reding
<thierry.reding@gmail.com> wrote:
> On Mon, Mar 24, 2014 at 05:42:24PM +0900, Alexandre Courbot wrote:
>> GK20A's timer is directly attached to the system timer and cannot be
>> calibrated. Skip the calibration phase on that chip since the
>> corresponding registers do not exist.
>>
>> Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
>> ---
>>  drivers/gpu/drm/nouveau/core/subdev/timer/nv04.c | 19 +++++++++++++------
>>  1 file changed, 13 insertions(+), 6 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/nouveau/core/subdev/timer/nv04.c b/drivers/gpu/drm/nouveau/core/subdev/timer/nv04.c
>> index c0bdd10358d7..822fe0d8a871 100644
>> --- a/drivers/gpu/drm/nouveau/core/subdev/timer/nv04.c
>> +++ b/drivers/gpu/drm/nouveau/core/subdev/timer/nv04.c
>> @@ -185,6 +185,10 @@ nv04_timer_init(struct nouveau_object *object)
>>       if (ret)
>>               return ret;
>>
>> +     /* gk20a does not have the calibration registers */
>> +     if (device->chipset == 0xea)
>> +             goto skip_clk_init;
>
> I'm concerned that this won't scale in the future. Perhaps a better
> solution would be to add a "flags" or "features" field to struct
> nouveau_device along with feature bits such as HAS_CALIBRATION or
> similar.
>
> That way we don't have to touch this code for every new future Tegra
> chip. Unless perhaps if there's a reason to expect things to change in
> newer generations.
I've already handled this in a slightly different way in the tree I'd
previously pointed Alex at (I think!), as I needed to do the same for
GM107.

Should just be able to use that implementation (so, just change the
probe patch) here too.

Thanks,
Ben.

>
> Thierry
>

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [PATCH 03/12] drm/nouveau/bar: only ioremap BAR3 if it exists
  2014-03-24 22:13       ` Thierry Reding
@ 2014-03-26  4:20         ` Ben Skeggs
  -1 siblings, 0 replies; 100+ messages in thread
From: Ben Skeggs @ 2014-03-26  4:20 UTC (permalink / raw)
  To: Thierry Reding
  Cc: nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Ben Skeggs,
	linux-tegra-u79uwXL29TY76Z2rM5mHXA

On Tue, Mar 25, 2014 at 8:13 AM, Thierry Reding
<thierry.reding-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> On Mon, Mar 24, 2014 at 05:42:25PM +0900, Alexandre Courbot wrote:
>> Some chips that use system memory exclusively (e.g. GK20A) do not
>> expose 2 BAR regions. For them only BAR1 exists, and it should be used
>> for USERD mapping. Do not map BAR3 if its resource does not exist.
>>
>> Signed-off-by: Alexandre Courbot <acourbot-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
>> ---
>>  drivers/gpu/drm/nouveau/core/subdev/bar/base.c | 7 +++++--
>>  1 file changed, 5 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/nouveau/core/subdev/bar/base.c b/drivers/gpu/drm/nouveau/core/subdev/bar/base.c
>> index bdf594116f3f..d713eeb75b13 100644
>> --- a/drivers/gpu/drm/nouveau/core/subdev/bar/base.c
>> +++ b/drivers/gpu/drm/nouveau/core/subdev/bar/base.c
>> @@ -110,6 +110,7 @@ nouveau_bar_create_(struct nouveau_object *parent,
>>  {
>>       struct nouveau_device *device = nv_device(parent);
>>       struct nouveau_bar *bar;
>> +     bool has_bar3 = nv_device_resource_len(device, 3) != 0;
>
> I don't think this variable is really necessary since you only use the
> expression once anyway, but I don't feel very strongly about it, so
> either way:
I'd also prefer it folded in, but it's not a big deal.  A couple of
other upcoming comments will require a re-spin anyway, so we may as
well :)

>
> Reviewed-by: Thierry Reding <treding-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
>

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [Nouveau] [PATCH 03/12] drm/nouveau/bar: only ioremap BAR3 if it exists
@ 2014-03-26  4:20         ` Ben Skeggs
  0 siblings, 0 replies; 100+ messages in thread
From: Ben Skeggs @ 2014-03-26  4:20 UTC (permalink / raw)
  To: Thierry Reding
  Cc: Alexandre Courbot, nouveau, linux-kernel, dri-devel, Ben Skeggs,
	linux-tegra

On Tue, Mar 25, 2014 at 8:13 AM, Thierry Reding
<thierry.reding@gmail.com> wrote:
> On Mon, Mar 24, 2014 at 05:42:25PM +0900, Alexandre Courbot wrote:
>> Some chips that use system memory exclusively (e.g. GK20A) do not
>> expose 2 BAR regions. For them only BAR1 exists, and it should be used
>> for USERD mapping. Do not map BAR3 if its resource does not exist.
>>
>> Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
>> ---
>>  drivers/gpu/drm/nouveau/core/subdev/bar/base.c | 7 +++++--
>>  1 file changed, 5 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/nouveau/core/subdev/bar/base.c b/drivers/gpu/drm/nouveau/core/subdev/bar/base.c
>> index bdf594116f3f..d713eeb75b13 100644
>> --- a/drivers/gpu/drm/nouveau/core/subdev/bar/base.c
>> +++ b/drivers/gpu/drm/nouveau/core/subdev/bar/base.c
>> @@ -110,6 +110,7 @@ nouveau_bar_create_(struct nouveau_object *parent,
>>  {
>>       struct nouveau_device *device = nv_device(parent);
>>       struct nouveau_bar *bar;
>> +     bool has_bar3 = nv_device_resource_len(device, 3) != 0;
>
> I don't think this variable is really necessary since you only use the
> expression once anyway, but I don't feel very strongly about it, so
> either way:
I'd also prefer it folded in, but it's not a big deal.  A couple of
other upcoming comments will require a re-spin anyway, so we may as
well :)

>
> Reviewed-by: Thierry Reding <treding@nvidia.com>
>

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [PATCH 08/12] drm/nouveau/graph: enable when using external firmware
  2014-03-24 22:58         ` Thierry Reding
@ 2014-03-26  4:21           ` Ben Skeggs
  -1 siblings, 0 replies; 100+ messages in thread
From: Ben Skeggs @ 2014-03-26  4:21 UTC (permalink / raw)
  To: Thierry Reding
  Cc: Alexandre Courbot, Alexandre Courbot,
	nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Ben Skeggs,
	linux-tegra-u79uwXL29TY76Z2rM5mHXA

On Tue, Mar 25, 2014 at 8:58 AM, Thierry Reding
<thierry.reding-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> On Mon, Mar 24, 2014 at 05:42:30PM +0900, Alexandre Courbot wrote:
> [...]
>> diff --git a/drivers/gpu/drm/nouveau/core/engine/graph/nvc0.c b/drivers/gpu/drm/nouveau/core/engine/graph/nvc0.c
>> index 6ef8bf181b2d..f997a18f5760 100644
>> --- a/drivers/gpu/drm/nouveau/core/engine/graph/nvc0.c
>> +++ b/drivers/gpu/drm/nouveau/core/engine/graph/nvc0.c
>> @@ -1133,10 +1133,14 @@ nvc0_graph_ctor(struct nouveau_object *parent, struct nouveau_object *engine,
>>       struct nvc0_graph_oclass *oclass = (void *)bclass;
>>       struct nouveau_device *device = nv_device(parent);
>>       struct nvc0_graph_priv *priv;
>> +     bool use_fw;
>
> Perhaps "ext_fw" or "use_ext_fw" would be more accurate.
>
>>       int ret, i;
>>
>> +     use_fw = nouveau_boolopt(device->cfgopt, "NvGrUseFW", false);
>> +
>>       ret = nouveau_graph_create(parent, engine, bclass,
>> -                                (oclass->fecs.ucode != NULL), &priv);
>> +                                (oclass->fecs.ucode != NULL) || use_fw,
>> +                                &priv);
>
> Or perhaps a more intuitive way would be to name the variable "enable"
> and have something like:
>
>         if (!nouveau_boolopt(device->cfgopt, "NvGrUseFW", false))
>                 enable = oclass->fecs.ucode != NULL;
>         else
>                 enable = true;
>
>         ret = nouveau_graph_create(parent, engine, bclass, enable, &priv);
>         ...
Agreed, this looks a lot nicer.

>
> Thierry
>

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [PATCH 08/12] drm/nouveau/graph: enable when using external firmware
@ 2014-03-26  4:21           ` Ben Skeggs
  0 siblings, 0 replies; 100+ messages in thread
From: Ben Skeggs @ 2014-03-26  4:21 UTC (permalink / raw)
  To: Thierry Reding
  Cc: Alexandre Courbot, Alexandre Courbot, nouveau, linux-kernel,
	dri-devel, Ben Skeggs, linux-tegra

On Tue, Mar 25, 2014 at 8:58 AM, Thierry Reding
<thierry.reding@gmail.com> wrote:
> On Mon, Mar 24, 2014 at 05:42:30PM +0900, Alexandre Courbot wrote:
> [...]
>> diff --git a/drivers/gpu/drm/nouveau/core/engine/graph/nvc0.c b/drivers/gpu/drm/nouveau/core/engine/graph/nvc0.c
>> index 6ef8bf181b2d..f997a18f5760 100644
>> --- a/drivers/gpu/drm/nouveau/core/engine/graph/nvc0.c
>> +++ b/drivers/gpu/drm/nouveau/core/engine/graph/nvc0.c
>> @@ -1133,10 +1133,14 @@ nvc0_graph_ctor(struct nouveau_object *parent, struct nouveau_object *engine,
>>       struct nvc0_graph_oclass *oclass = (void *)bclass;
>>       struct nouveau_device *device = nv_device(parent);
>>       struct nvc0_graph_priv *priv;
>> +     bool use_fw;
>
> Perhaps "ext_fw" or "use_ext_fw" would be more accurate.
>
>>       int ret, i;
>>
>> +     use_fw = nouveau_boolopt(device->cfgopt, "NvGrUseFW", false);
>> +
>>       ret = nouveau_graph_create(parent, engine, bclass,
>> -                                (oclass->fecs.ucode != NULL), &priv);
>> +                                (oclass->fecs.ucode != NULL) || use_fw,
>> +                                &priv);
>
> Or perhaps a more intuitive way would be to name the variable "enable"
> and have something like:
>
>         if (!nouveau_boolopt(device->cfgopt, "NvGrUseFW", false))
>                 enable = oclass->fecs.ucode != NULL;
>         else
>                 enable = true;
>
>         ret = nouveau_graph_create(parent, engine, bclass, enable, &priv);
>         ...
Agreed, this looks a lot nicer.

>
> Thierry
>

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [PATCH 09/12] drm/nouveau/graph: pad firmware code at load time
  2014-03-24  8:42     ` Alexandre Courbot
@ 2014-03-26  4:22         ` Ben Skeggs
  -1 siblings, 0 replies; 100+ messages in thread
From: Ben Skeggs @ 2014-03-26  4:22 UTC (permalink / raw)
  To: Alexandre Courbot
  Cc: linux-tegra-u79uwXL29TY76Z2rM5mHXA,
	nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Ben Skeggs,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA

On Mon, Mar 24, 2014 at 6:42 PM, Alexandre Courbot <acourbot-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org> wrote:
> Pad the microcode to a multiple of 0x40, otherwise firmware will fail to
> run from non-prepadded firmware files.
>
> Signed-off-by: Alexandre Courbot <acourbot-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
> ---
>  drivers/gpu/drm/nouveau/core/engine/graph/nvc0.c | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/drivers/gpu/drm/nouveau/core/engine/graph/nvc0.c b/drivers/gpu/drm/nouveau/core/engine/graph/nvc0.c
> index f997a18f5760..367e72daf8b1 100644
> --- a/drivers/gpu/drm/nouveau/core/engine/graph/nvc0.c
> +++ b/drivers/gpu/drm/nouveau/core/engine/graph/nvc0.c
> @@ -768,6 +768,10 @@ nvc0_graph_init_fw(struct nvc0_graph_priv *priv, u32 fuc_base,
>                         nv_wr32(priv, fuc_base + 0x0188, i >> 6);
>                 nv_wr32(priv, fuc_base + 0x0184, code->data[i]);
>         }
> +
> +       /* code must be padded to 0x40 */
> +       for (; i < (((code->size / 4) + 0x3f) & ~0x3f); i++)
"for (; i & 0x3f; i++)" would work just as well :)
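
A quick stand-alone check of that claim, as a sketch: it assumes i starts at the word count left by the preceding copy loop, and the helper names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Padding words written by the loop condition in the patch. */
static size_t pad_count_roundup(size_t words)
{
	size_t i = words, n = 0;

	for (; i < ((words + 0x3f) & ~(size_t)0x3f); i++)
		n++;
	return n;
}

/* Padding words written by the shorter condition: keep going until
 * i is a multiple of 0x40. */
static size_t pad_count_mask(size_t words)
{
	size_t i = words, n = 0;

	for (; i & 0x3f; i++)
		n++;
	return n;
}

/* Returns true if both loops agree for every start value below limit. */
static bool pad_forms_agree(size_t limit)
{
	size_t words;

	for (words = 0; words < limit; words++)
		if (pad_count_roundup(words) != pad_count_mask(words))
			return false;
	return true;
}
```

Both loops stop at the next multiple of 0x40 at or above the starting index, so they write the same number of zero words.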

> +               nv_wr32(priv, fuc_base + 0x0184, 0);
>  }
>
>  static void
> --
> 1.9.1
>

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [Nouveau] [PATCH 09/12] drm/nouveau/graph: pad firmware code at load time
@ 2014-03-26  4:22         ` Ben Skeggs
  0 siblings, 0 replies; 100+ messages in thread
From: Ben Skeggs @ 2014-03-26  4:22 UTC (permalink / raw)
  To: Alexandre Courbot
  Cc: Ben Skeggs, nouveau, linux-kernel, dri-devel, linux-tegra

On Mon, Mar 24, 2014 at 6:42 PM, Alexandre Courbot <acourbot@nvidia.com> wrote:
> Pad the microcode to a multiple of 0x40, otherwise firmware will fail to
> run from non-prepadded firmware files.
>
> Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
> ---
>  drivers/gpu/drm/nouveau/core/engine/graph/nvc0.c | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/drivers/gpu/drm/nouveau/core/engine/graph/nvc0.c b/drivers/gpu/drm/nouveau/core/engine/graph/nvc0.c
> index f997a18f5760..367e72daf8b1 100644
> --- a/drivers/gpu/drm/nouveau/core/engine/graph/nvc0.c
> +++ b/drivers/gpu/drm/nouveau/core/engine/graph/nvc0.c
> @@ -768,6 +768,10 @@ nvc0_graph_init_fw(struct nvc0_graph_priv *priv, u32 fuc_base,
>                         nv_wr32(priv, fuc_base + 0x0188, i >> 6);
>                 nv_wr32(priv, fuc_base + 0x0184, code->data[i]);
>         }
> +
> +       /* code must be padded to 0x40 */
> +       for (; i < (((code->size / 4) + 0x3f) & ~0x3f); i++)
"for (; i & 0x3f; i++)" would work just as well :)

> +               nv_wr32(priv, fuc_base + 0x0184, 0);
>  }
>
>  static void
> --
> 1.9.1
>

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [PATCH 10/12] drm/nouveau/graph: add GK20A support
  2014-03-24  8:42     ` Alexandre Courbot
@ 2014-03-26  4:24         ` Ben Skeggs
  -1 siblings, 0 replies; 100+ messages in thread
From: Ben Skeggs @ 2014-03-26  4:24 UTC (permalink / raw)
  To: Alexandre Courbot
  Cc: Ben Skeggs, Alexandre Courbot,
	nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	linux-tegra-u79uwXL29TY76Z2rM5mHXA

On Mon, Mar 24, 2014 at 6:42 PM, Alexandre Courbot <acourbot-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org> wrote:
> Add a GR device for GK20A based on NVE4, with the correct classes
> definitions (GK20A's 3D class is 0xa297).
>
> Most of the NVE4 code can be used on GK20A, so make relevant bits of
> NVE4 available to other chips as well.
This will need a bit of a rebase on top of the tree I mentioned
earlier (also queued for drm-next now), where I've further split out
and named the various chunks of state.

Does GK104 match entirely correctly, or just happen to work?  I could
probably hunt down the GK20A netlist images and check that actually :)

>
> Signed-off-by: Alexandre Courbot <acourbot-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
> ---
>  drivers/gpu/drm/nouveau/Makefile                   |  1 +
>  .../gpu/drm/nouveau/core/engine/graph/ctxnve4.c    |  4 +-
>  drivers/gpu/drm/nouveau/core/engine/graph/nvc0.h   |  9 +++
>  drivers/gpu/drm/nouveau/core/engine/graph/nve4.c   |  2 +-
>  drivers/gpu/drm/nouveau/core/engine/graph/nvea.c   | 75 ++++++++++++++++++++++
>  .../gpu/drm/nouveau/core/include/engine/graph.h    |  1 +
>  6 files changed, 89 insertions(+), 3 deletions(-)
>  create mode 100644 drivers/gpu/drm/nouveau/core/engine/graph/nvea.c
>
> diff --git a/drivers/gpu/drm/nouveau/Makefile b/drivers/gpu/drm/nouveau/Makefile
> index 708d2e33835f..032a4744f843 100644
> --- a/drivers/gpu/drm/nouveau/Makefile
> +++ b/drivers/gpu/drm/nouveau/Makefile
> @@ -270,6 +270,7 @@ nouveau-y += core/engine/graph/nvc8.o
>  nouveau-y += core/engine/graph/nvd7.o
>  nouveau-y += core/engine/graph/nvd9.o
>  nouveau-y += core/engine/graph/nve4.o
> +nouveau-y += core/engine/graph/nvea.o
>  nouveau-y += core/engine/graph/nvf0.o
>  nouveau-y += core/engine/graph/nv108.o
>  nouveau-y += core/engine/mpeg/nv31.o
> diff --git a/drivers/gpu/drm/nouveau/core/engine/graph/ctxnve4.c b/drivers/gpu/drm/nouveau/core/engine/graph/ctxnve4.c
> index e2de73ee5eee..3904073f860d 100644
> --- a/drivers/gpu/drm/nouveau/core/engine/graph/ctxnve4.c
> +++ b/drivers/gpu/drm/nouveau/core/engine/graph/ctxnve4.c
> @@ -804,7 +804,7 @@ nve4_grctx_init_unk[] = {
>         {}
>  };
>
> -static void
> +void
>  nve4_grctx_generate_mods(struct nvc0_graph_priv *priv, struct nvc0_grctx *info)
>  {
>         u32 magic[GPC_MAX][2];
> @@ -962,7 +962,7 @@ nve4_grctx_generate_main(struct nvc0_graph_priv *priv, struct nvc0_grctx *info)
>         nv_mask(priv, 0x41be10, 0x00800000, 0x00800000);
>  }
>
> -static struct nvc0_graph_init *
> +struct nvc0_graph_init *
>  nve4_grctx_init_hub[] = {
>         nvc0_grctx_init_base,
>         nve4_grctx_init_unk40xx,
> diff --git a/drivers/gpu/drm/nouveau/core/engine/graph/nvc0.h b/drivers/gpu/drm/nouveau/core/engine/graph/nvc0.h
> index b0ab6de270b2..904f09b540c5 100644
> --- a/drivers/gpu/drm/nouveau/core/engine/graph/nvc0.h
> +++ b/drivers/gpu/drm/nouveau/core/engine/graph/nvc0.h
> @@ -221,6 +221,8 @@ void nvc0_grctx_generate_r418bb8(struct nvc0_graph_priv *);
>  void nve4_grctx_generate_r418bb8(struct nvc0_graph_priv *);
>  void nvc0_grctx_generate_r406800(struct nvc0_graph_priv *);
>
> +void nve4_grctx_generate_mods(struct nvc0_graph_priv *, struct nvc0_grctx *);
> +
>  extern struct nouveau_oclass *nvc0_grctx_oclass;
>  extern struct nvc0_graph_init *nvc0_grctx_init_hub[];
>  extern struct nvc0_graph_init nvc0_grctx_init_base[];
> @@ -237,12 +239,17 @@ extern struct nvc0_graph_init nvc0_grctx_init_gpc_1[];
>  extern struct nvc0_graph_init nvc0_grctx_init_tpc[];
>  extern struct nvc0_graph_init nvc0_grctx_init_icmd[];
>  extern struct nvc0_graph_init nvd9_grctx_init_icmd[]; //
> +extern struct nvc0_graph_init nve4_grctx_init_icmd[];
> +
> +extern struct nvc0_graph_init *nve4_grctx_init_hub[];
> +extern struct nvc0_graph_init *nve4_grctx_init_gpc[];
>
>  extern struct nvc0_graph_mthd nvc0_grctx_init_mthd[];
>  extern struct nvc0_graph_init nvc0_grctx_init_902d[];
>  extern struct nvc0_graph_init nvc0_grctx_init_9039[];
>  extern struct nvc0_graph_init nvc0_grctx_init_90c0[];
>  extern struct nvc0_graph_init nvc0_grctx_init_mthd_magic[];
> +extern struct nvc0_graph_init nve4_grctx_init_a097[];
>
>  void nvc1_grctx_generate_mods(struct nvc0_graph_priv *, struct nvc0_grctx *);
>  void nvc1_grctx_generate_unkn(struct nvc0_graph_priv *);
> @@ -277,6 +284,8 @@ extern struct nvc0_graph_init nvf0_grctx_init_unk60xx[];
>
>  extern struct nouveau_oclass *nv108_grctx_oclass;
>
> +extern struct nvc0_graph_init *nve4_graph_init_mmio[];
> +
>  #define mmio_data(s,a,p) do {                                                  \
>         info->buffer[info->buffer_nr] = round_up(info->addr, (a));             \
>         info->addr = info->buffer[info->buffer_nr++] + (s);                    \
> diff --git a/drivers/gpu/drm/nouveau/core/engine/graph/nve4.c b/drivers/gpu/drm/nouveau/core/engine/graph/nve4.c
> index 05ec09c88517..442857c5c120 100644
> --- a/drivers/gpu/drm/nouveau/core/engine/graph/nve4.c
> +++ b/drivers/gpu/drm/nouveau/core/engine/graph/nve4.c
> @@ -298,7 +298,7 @@ nve4_graph_init(struct nouveau_object *object)
>         return nvc0_graph_init_ctxctl(priv);
>  }
>
> -static struct nvc0_graph_init *
> +struct nvc0_graph_init *
>  nve4_graph_init_mmio[] = {
>         nve4_graph_init_regs,
>         nvc0_graph_init_unk40xx,
> diff --git a/drivers/gpu/drm/nouveau/core/engine/graph/nvea.c b/drivers/gpu/drm/nouveau/core/engine/graph/nvea.c
> new file mode 100644
> index 000000000000..d5e6a1adcacb
> --- /dev/null
> +++ b/drivers/gpu/drm/nouveau/core/engine/graph/nvea.c
> @@ -0,0 +1,75 @@
> +/*
> + * Copyright (c) 2014, NVIDIA CORPORATION. All rights reserved.
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the "Software"),
> + * to deal in the Software without restriction, including without limitation
> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice shall be included in
> + * all copies or substantial portions of the Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
> + * DEALINGS IN THE SOFTWARE.
> + */
> +
> +#include "nvc0.h"
> +
> +static struct nouveau_oclass
> +nvea_graph_sclass[] = {
> +       { 0x902d, &nouveau_object_ofuncs },
> +       { 0xa040, &nouveau_object_ofuncs },
> +       { 0xa297, &nouveau_object_ofuncs },
> +       { 0xa0c0, &nouveau_object_ofuncs },
> +       {}
> +};
> +
> +static struct nvc0_graph_mthd
> +nvea_grctx_init_mthd[] = {
> +       { 0xa297, nve4_grctx_init_a097, },
> +       { 0x902d, nvc0_grctx_init_902d, },
> +       { 0x902d, nvc0_grctx_init_mthd_magic, },
> +       {}
> +};
> +
> +struct nouveau_oclass *
> +nvea_grctx_oclass = &(struct nvc0_grctx_oclass) {
> +       .base.handle = NV_ENGCTX(GR, 0xea),
> +       .base.ofuncs = &(struct nouveau_ofuncs) {
> +               .ctor = nvc0_graph_context_ctor,
> +               .dtor = nvc0_graph_context_dtor,
> +               .init = _nouveau_graph_context_init,
> +               .fini = _nouveau_graph_context_fini,
> +               .rd32 = _nouveau_graph_context_rd32,
> +               .wr32 = _nouveau_graph_context_wr32,
> +       },
> +       .main = nve4_grctx_generate_main,
> +       .mods = nve4_grctx_generate_mods,
> +       .unkn = nve4_grctx_generate_unkn,
> +       .hub  = nve4_grctx_init_hub,
> +       .gpc  = nve4_grctx_init_gpc,
> +       .icmd = nve4_grctx_init_icmd,
> +       .mthd = nvea_grctx_init_mthd,
> +}.base;
> +
> +
> +struct nouveau_oclass *
> +nvea_graph_oclass = &(struct nvc0_graph_oclass) {
> +       .base.handle = NV_ENGINE(GR, 0xea),
> +       .base.ofuncs = &(struct nouveau_ofuncs) {
> +               .ctor = nvc0_graph_ctor,
> +               .dtor = nvc0_graph_dtor,
> +               .init = nve4_graph_init,
> +               .fini = _nouveau_graph_fini,
> +       },
> +       .cclass = &nvea_grctx_oclass,
> +       .sclass = nvea_graph_sclass,
> +       .mmio = nve4_graph_init_mmio,
> +}.base;
> diff --git a/drivers/gpu/drm/nouveau/core/include/engine/graph.h b/drivers/gpu/drm/nouveau/core/include/engine/graph.h
> index 97705618de97..632d7e172172 100644
> --- a/drivers/gpu/drm/nouveau/core/include/engine/graph.h
> +++ b/drivers/gpu/drm/nouveau/core/include/engine/graph.h
> @@ -68,6 +68,7 @@ extern struct nouveau_oclass *nvc8_graph_oclass;
>  extern struct nouveau_oclass *nvd7_graph_oclass;
>  extern struct nouveau_oclass *nvd9_graph_oclass;
>  extern struct nouveau_oclass *nve4_graph_oclass;
> +extern struct nouveau_oclass *nvea_graph_oclass;
>  extern struct nouveau_oclass *nvf0_graph_oclass;
>  extern struct nouveau_oclass *nv108_graph_oclass;
>
> --
> 1.9.1
>
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/dri-devel

* Re: [PATCH 10/12] drm/nouveau/graph: add GK20A support
@ 2014-03-26  4:24         ` Ben Skeggs
  0 siblings, 0 replies; 100+ messages in thread
From: Ben Skeggs @ 2014-03-26  4:24 UTC (permalink / raw)
  To: Alexandre Courbot
  Cc: Ben Skeggs, Alexandre Courbot, nouveau, linux-kernel, dri-devel,
	linux-tegra

On Mon, Mar 24, 2014 at 6:42 PM, Alexandre Courbot <acourbot@nvidia.com> wrote:
> Add a GR device for GK20A based on NVE4, with the correct class
> definitions (GK20A's 3D class is 0xa297).
>
> Most of the NVE4 code can be used on GK20A, so make relevant bits of
> NVE4 available to other chips as well.
This will need a bit of a rebase on top of the tree I mentioned
earlier (also queued for drm-next now), where I've further split out
and named the various chunks of state.

Does GK104 match entirely correctly, or just happen to work?  I could
probably hunt down the GK20A netlist images and check that actually :)
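
For readers following the thread, the reuse pattern in this patch (dropping
"static" so that a new chip can point at NVE4's tables while overriding only
the 3D class ID) can be sketched in stand-alone C. This is a simplified
illustration, not nouveau code: struct gr_init, struct gr_mthd, the table
names, the register values and find_init() are all hypothetical stand-ins.
Only the class IDs 0xa297 (from the patch) and 0xa097 (taken from the name
nve4_grctx_init_a097) come from the thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct nvc0_graph_init: one register write. */
struct gr_init {
	unsigned int addr;
	unsigned int data;
};

/* Hypothetical stand-in for struct nvc0_graph_mthd: class ID -> init list. */
struct gr_mthd {
	unsigned int oclass;
	const struct gr_init *init;
};

/*
 * Shared init list. It is deliberately not "static", so that another
 * translation unit could reference it through an extern declaration in a
 * common header. The register values are made up for illustration.
 */
const struct gr_init shared_3d_init[] = {
	{ 0x0800, 0x00000000 },
	{ 0x0840, 0x00000001 },
	{ 0, 0 }
};

/* Table for the original chip: 3D class 0xa097. */
const struct gr_mthd base_mthd[] = {
	{ 0xa097, shared_3d_init },
	{ 0, NULL }
};

/* Table for the derived chip: same init list, different 3D class ID. */
const struct gr_mthd derived_mthd[] = {
	{ 0xa297, shared_3d_init },
	{ 0, NULL }
};

/* Return the init list registered for oclass, or NULL if there is none. */
const struct gr_init *
find_init(const struct gr_mthd *table, unsigned int oclass)
{
	assert(table != NULL);

	for (; table->init; table++) {
		if (table->oclass == oclass)
			return table->init;
	}
	return NULL;
}
```

In this sketch both chips resolve their own 3D class to the same init list,
which is the situation the question above is probing: sharing the data is
cheap, but it is only correct if the derived chip really expects the same
state.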

>
> Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
> ---
>  drivers/gpu/drm/nouveau/Makefile                   |  1 +
>  .../gpu/drm/nouveau/core/engine/graph/ctxnve4.c    |  4 +-
>  drivers/gpu/drm/nouveau/core/engine/graph/nvc0.h   |  9 +++
>  drivers/gpu/drm/nouveau/core/engine/graph/nve4.c   |  2 +-
>  drivers/gpu/drm/nouveau/core/engine/graph/nvea.c   | 75 ++++++++++++++++++++++
>  .../gpu/drm/nouveau/core/include/engine/graph.h    |  1 +
>  6 files changed, 89 insertions(+), 3 deletions(-)
>  create mode 100644 drivers/gpu/drm/nouveau/core/engine/graph/nvea.c
>
> diff --git a/drivers/gpu/drm/nouveau/Makefile b/drivers/gpu/drm/nouveau/Makefile
> index 708d2e33835f..032a4744f843 100644
> --- a/drivers/gpu/drm/nouveau/Makefile
> +++ b/drivers/gpu/drm/nouveau/Makefile
> @@ -270,6 +270,7 @@ nouveau-y += core/engine/graph/nvc8.o
>  nouveau-y += core/engine/graph/nvd7.o
>  nouveau-y += core/engine/graph/nvd9.o
>  nouveau-y += core/engine/graph/nve4.o
> +nouveau-y += core/engine/graph/nvea.o
>  nouveau-y += core/engine/graph/nvf0.o
>  nouveau-y += core/engine/graph/nv108.o
>  nouveau-y += core/engine/mpeg/nv31.o
> diff --git a/drivers/gpu/drm/nouveau/core/engine/graph/ctxnve4.c b/drivers/gpu/drm/nouveau/core/engine/graph/ctxnve4.c
> index e2de73ee5eee..3904073f860d 100644
> --- a/drivers/gpu/drm/nouveau/core/engine/graph/ctxnve4.c
> +++ b/drivers/gpu/drm/nouveau/core/engine/graph/ctxnve4.c
> @@ -804,7 +804,7 @@ nve4_grctx_init_unk[] = {
>         {}
>  };
>
> -static void
> +void
>  nve4_grctx_generate_mods(struct nvc0_graph_priv *priv, struct nvc0_grctx *info)
>  {
>         u32 magic[GPC_MAX][2];
> @@ -962,7 +962,7 @@ nve4_grctx_generate_main(struct nvc0_graph_priv *priv, struct nvc0_grctx *info)
>         nv_mask(priv, 0x41be10, 0x00800000, 0x00800000);
>  }
>
> -static struct nvc0_graph_init *
> +struct nvc0_graph_init *
>  nve4_grctx_init_hub[] = {
>         nvc0_grctx_init_base,
>         nve4_grctx_init_unk40xx,
> diff --git a/drivers/gpu/drm/nouveau/core/engine/graph/nvc0.h b/drivers/gpu/drm/nouveau/core/engine/graph/nvc0.h
> index b0ab6de270b2..904f09b540c5 100644
> --- a/drivers/gpu/drm/nouveau/core/engine/graph/nvc0.h
> +++ b/drivers/gpu/drm/nouveau/core/engine/graph/nvc0.h
> @@ -221,6 +221,8 @@ void nvc0_grctx_generate_r418bb8(struct nvc0_graph_priv *);
>  void nve4_grctx_generate_r418bb8(struct nvc0_graph_priv *);
>  void nvc0_grctx_generate_r406800(struct nvc0_graph_priv *);
>
> +void nve4_grctx_generate_mods(struct nvc0_graph_priv *, struct nvc0_grctx *);
> +
>  extern struct nouveau_oclass *nvc0_grctx_oclass;
>  extern struct nvc0_graph_init *nvc0_grctx_init_hub[];
>  extern struct nvc0_graph_init nvc0_grctx_init_base[];
> @@ -237,12 +239,17 @@ extern struct nvc0_graph_init nvc0_grctx_init_gpc_1[];
>  extern struct nvc0_graph_init nvc0_grctx_init_tpc[];
>  extern struct nvc0_graph_init nvc0_grctx_init_icmd[];
>  extern struct nvc0_graph_init nvd9_grctx_init_icmd[]; //
> +extern struct nvc0_graph_init nve4_grctx_init_icmd[];
> +
> +extern struct nvc0_graph_init *nve4_grctx_init_hub[];
> +extern struct nvc0_graph_init *nve4_grctx_init_gpc[];
>
>  extern struct nvc0_graph_mthd nvc0_grctx_init_mthd[];
>  extern struct nvc0_graph_init nvc0_grctx_init_902d[];
>  extern struct nvc0_graph_init nvc0_grctx_init_9039[];
>  extern struct nvc0_graph_init nvc0_grctx_init_90c0[];
>  extern struct nvc0_graph_init nvc0_grctx_init_mthd_magic[];
> +extern struct nvc0_graph_init nve4_grctx_init_a097[];
>
>  void nvc1_grctx_generate_mods(struct nvc0_graph_priv *, struct nvc0_grctx *);
>  void nvc1_grctx_generate_unkn(struct nvc0_graph_priv *);
> @@ -277,6 +284,8 @@ extern struct nvc0_graph_init nvf0_grctx_init_unk60xx[];
>
>  extern struct nouveau_oclass *nv108_grctx_oclass;
>
> +extern struct nvc0_graph_init *nve4_graph_init_mmio[];
> +
>  #define mmio_data(s,a,p) do {                                                  \
>         info->buffer[info->buffer_nr] = round_up(info->addr, (a));             \
>         info->addr = info->buffer[info->buffer_nr++] + (s);                    \
> diff --git a/drivers/gpu/drm/nouveau/core/engine/graph/nve4.c b/drivers/gpu/drm/nouveau/core/engine/graph/nve4.c
> index 05ec09c88517..442857c5c120 100644
> --- a/drivers/gpu/drm/nouveau/core/engine/graph/nve4.c
> +++ b/drivers/gpu/drm/nouveau/core/engine/graph/nve4.c
> @@ -298,7 +298,7 @@ nve4_graph_init(struct nouveau_object *object)
>         return nvc0_graph_init_ctxctl(priv);
>  }
>
> -static struct nvc0_graph_init *
> +struct nvc0_graph_init *
>  nve4_graph_init_mmio[] = {
>         nve4_graph_init_regs,
>         nvc0_graph_init_unk40xx,
> diff --git a/drivers/gpu/drm/nouveau/core/engine/graph/nvea.c b/drivers/gpu/drm/nouveau/core/engine/graph/nvea.c
> new file mode 100644
> index 000000000000..d5e6a1adcacb
> --- /dev/null
> +++ b/drivers/gpu/drm/nouveau/core/engine/graph/nvea.c
> @@ -0,0 +1,75 @@
> +/*
> + * Copyright (c) 2014, NVIDIA CORPORATION. All rights reserved.
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the "Software"),
> + * to deal in the Software without restriction, including without limitation
> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice shall be included in
> + * all copies or substantial portions of the Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
> + * DEALINGS IN THE SOFTWARE.
> + */
> +
> +#include "nvc0.h"
> +
> +static struct nouveau_oclass
> +nvea_graph_sclass[] = {
> +       { 0x902d, &nouveau_object_ofuncs },
> +       { 0xa040, &nouveau_object_ofuncs },
> +       { 0xa297, &nouveau_object_ofuncs },
> +       { 0xa0c0, &nouveau_object_ofuncs },
> +       {}
> +};
> +
> +static struct nvc0_graph_mthd
> +nvea_grctx_init_mthd[] = {
> +       { 0xa297, nve4_grctx_init_a097, },
> +       { 0x902d, nvc0_grctx_init_902d, },
> +       { 0x902d, nvc0_grctx_init_mthd_magic, },
> +       {}
> +};
> +
> +struct nouveau_oclass *
> +nvea_grctx_oclass = &(struct nvc0_grctx_oclass) {
> +       .base.handle = NV_ENGCTX(GR, 0xea),
> +       .base.ofuncs = &(struct nouveau_ofuncs) {
> +               .ctor = nvc0_graph_context_ctor,
> +               .dtor = nvc0_graph_context_dtor,
> +               .init = _nouveau_graph_context_init,
> +               .fini = _nouveau_graph_context_fini,
> +               .rd32 = _nouveau_graph_context_rd32,
> +               .wr32 = _nouveau_graph_context_wr32,
> +       },
> +       .main = nve4_grctx_generate_main,
> +       .mods = nve4_grctx_generate_mods,
> +       .unkn = nve4_grctx_generate_unkn,
> +       .hub  = nve4_grctx_init_hub,
> +       .gpc  = nve4_grctx_init_gpc,
> +       .icmd = nve4_grctx_init_icmd,
> +       .mthd = nvea_grctx_init_mthd,
> +}.base;
> +
> +
> +struct nouveau_oclass *
> +nvea_graph_oclass = &(struct nvc0_graph_oclass) {
> +       .base.handle = NV_ENGINE(GR, 0xea),
> +       .base.ofuncs = &(struct nouveau_ofuncs) {
> +               .ctor = nvc0_graph_ctor,
> +               .dtor = nvc0_graph_dtor,
> +               .init = nve4_graph_init,
> +               .fini = _nouveau_graph_fini,
> +       },
> +       .cclass = &nvea_grctx_oclass,
> +       .sclass = nvea_graph_sclass,
> +       .mmio = nve4_graph_init_mmio,
> +}.base;
> diff --git a/drivers/gpu/drm/nouveau/core/include/engine/graph.h b/drivers/gpu/drm/nouveau/core/include/engine/graph.h
> index 97705618de97..632d7e172172 100644
> --- a/drivers/gpu/drm/nouveau/core/include/engine/graph.h
> +++ b/drivers/gpu/drm/nouveau/core/include/engine/graph.h
> @@ -68,6 +68,7 @@ extern struct nouveau_oclass *nvc8_graph_oclass;
>  extern struct nouveau_oclass *nvd7_graph_oclass;
>  extern struct nouveau_oclass *nvd9_graph_oclass;
>  extern struct nouveau_oclass *nve4_graph_oclass;
> +extern struct nouveau_oclass *nvea_graph_oclass;
>  extern struct nouveau_oclass *nvf0_graph_oclass;
>  extern struct nouveau_oclass *nv108_graph_oclass;
>
> --
> 1.9.1
>
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/dri-devel

* Re: [PATCH 11/12] drm/nouveau: support GK20A in nouveau_accel_init()
  2014-03-24 23:10         ` Thierry Reding
@ 2014-03-26  4:27           ` Ben Skeggs
  -1 siblings, 0 replies; 100+ messages in thread
From: Ben Skeggs @ 2014-03-26  4:27 UTC (permalink / raw)
  To: Thierry Reding
  Cc: Alexandre Courbot, nouveau, linux-kernel, dri-devel, Ben Skeggs,
	linux-tegra

On Tue, Mar 25, 2014 at 9:10 AM, Thierry Reding
<thierry.reding@gmail.com> wrote:
> On Mon, Mar 24, 2014 at 05:42:33PM +0900, Alexandre Courbot wrote:
>> GK20A does not embed a dedicated COPY engine and thus cannot allocate
>> the copy channel that nouveau_accel_init() attempts to create. It also
>> lacks any display hardware, so the creation of a software channel does
>> not apply either.
>
> Perhaps this should be two separate patches?
>
>> diff --git a/drivers/gpu/drm/nouveau/nouveau_drm.c b/drivers/gpu/drm/nouveau/nouveau_drm.c
> [...]
>> +     if (device->chipset == 0xea) {
>> +             /* gk20a does not have CE0/CE1 */
>
> This would be another good candidate for a feature flag.
There are ways to query this in a chipset-independent way.  However,
although it is reported as an error when no copy engines are available,
the code should happily continue without the channel.  Perhaps we can
just demote the relevant error messages to a debug loglevel for now?
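
As a rough illustration of that suggestion, the sketch below treats the copy
channel as optional: its absence is logged at debug level and initialisation
carries on with the GR channel. It is a stand-alone toy, not the actual
nouveau_accel_init(); struct accel, accel_log(), create_copy_channel() and
the log levels are hypothetical names introduced only for this example.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical log levels; messages above the threshold are dropped. */
enum log_level { LOG_ERROR = 0, LOG_DEBUG = 1 };

/* Hypothetical device state, not the real nouveau structures. */
struct accel {
	bool has_copy_engine;     /* what the hardware reports */
	bool copy_channel;        /* set once the optional channel exists */
	bool gr_channel;          /* set once the mandatory channel exists */
	enum log_level verbosity; /* highest level that gets printed */
};

static void
accel_log(const struct accel *a, enum log_level level, const char *msg, int err)
{
	if (level <= a->verbosity)
		fprintf(stderr, "accel: %s (%d)\n", msg, err);
}

/* Fails with -ENODEV when the hardware has no copy engine. */
static int
create_copy_channel(const struct accel *a)
{
	return a->has_copy_engine ? 0 : -ENODEV;
}

/* Returns 0 on success; a missing copy engine is not treated as fatal. */
int
accel_init(struct accel *a)
{
	int ret;

	assert(a != NULL);

	ret = create_copy_channel(a);
	a->copy_channel = (ret == 0);
	if (ret)
		accel_log(a, LOG_DEBUG,
			  "no copy channel, continuing without it", ret);

	/* The GR channel is set up regardless of the result above. */
	a->gr_channel = true;
	return 0;
}
```

With the default verbosity of LOG_ERROR nothing is printed for a device
without a copy engine, and no chipset check is needed in the init path.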

>
>> +             arg0 = NVE0_CHANNEL_IND_ENGINE_GR;
>> +             arg1 = 1;
>> +     } else
>>       if (device->card_type >= NV_E0) {
>
> The formatting here is somewhat weird. From a brief look I couldn't find
> any indication that nouveau deviates from the standard coding style, so
> this should be:
>
>         } else if (...) {
I use the former in a few places, despite it not being entirely
"correct". It looks nicer, though :)  I don't mind either way.

>
>> +     /* Need to figure out how to handle sw for gk20a */
>> +     if (device->chipset == 0xea)
>> +             goto skip_sw_init;
>
> The commit message makes it sound like SW isn't needed since gk20a
> "lacks any display hardware". In that case the comment here doesn't make
> much sense.
>
> Thierry
>
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/dri-devel
>

* Re: [Nouveau] [PATCH 12/12] drm/nouveau: support for probing GK20A
  2014-03-24  8:42     ` Alexandre Courbot
@ 2014-03-26  4:28       ` Ben Skeggs
  -1 siblings, 0 replies; 100+ messages in thread
From: Ben Skeggs @ 2014-03-26  4:28 UTC (permalink / raw)
  To: Alexandre Courbot
  Cc: linux-tegra, nouveau, Ben Skeggs, dri-devel, linux-kernel

On Mon, Mar 24, 2014 at 6:42 PM, Alexandre Courbot <acourbot@nvidia.com> wrote:
> Set the correct subdev/engine classes when GK20A (0xea) is probed.
>
> Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
> ---
>  drivers/gpu/drm/nouveau/core/engine/device/nve0.c | 20 ++++++++++++++++++++
>  1 file changed, 20 insertions(+)
>
> diff --git a/drivers/gpu/drm/nouveau/core/engine/device/nve0.c b/drivers/gpu/drm/nouveau/core/engine/device/nve0.c
> index 987edbc30a09..8509dd57de1f 100644
> --- a/drivers/gpu/drm/nouveau/core/engine/device/nve0.c
> +++ b/drivers/gpu/drm/nouveau/core/engine/device/nve0.c
> @@ -156,6 +156,26 @@ nve0_identify(struct nouveau_device *device)
>                 device->oclass[NVDEV_ENGINE_PPP    ] = &nvc0_ppp_oclass;
>                 device->oclass[NVDEV_ENGINE_PERFMON] = &nve0_perfmon_oclass;
>                 break;
> +       case 0xea:
> +               device->cname = "GK20A";
> +               device->oclass[NVDEV_SUBDEV_MC     ] =  nvc3_mc_oclass;
> +               device->oclass[NVDEV_SUBDEV_BUS    ] =  nvc0_bus_oclass;
> +               device->oclass[NVDEV_SUBDEV_TIMER  ] = &nv04_timer_oclass;
As per note on the PTIMER patch, can just switch this to
"gk20a_timer_oclass" on the latest code.

> +               device->oclass[NVDEV_SUBDEV_FB     ] =  nvea_fb_oclass;
> +               device->oclass[NVDEV_SUBDEV_IBUS   ] = &nvea_ibus_oclass;
> +               device->oclass[NVDEV_SUBDEV_INSTMEM] = nv50_instmem_oclass;
> +               device->oclass[NVDEV_SUBDEV_VM     ] = &nvc0_vmmgr_oclass;
> +               device->oclass[NVDEV_SUBDEV_BAR    ] = &nvc0_bar_oclass;
> +               device->oclass[NVDEV_ENGINE_DMAOBJ ] = &nvd0_dmaeng_oclass;
> +               device->oclass[NVDEV_ENGINE_FIFO   ] =  nvea_fifo_oclass;
> +               /* TODO will need an implementation for this at some point... */
> +#if 0
> +               device->oclass[NVDEV_ENGINE_SW     ] =  nvc0_software_oclass;
> +#endif
> +               device->oclass[NVDEV_ENGINE_GR     ] =  nvea_graph_oclass;
> +               device->oclass[NVDEV_ENGINE_COPY2  ] = &nve0_copy2_oclass;
> +               device->oclass[NVDEV_ENGINE_PERFMON] = &nve0_perfmon_oclass;
> +               break;
>         case 0xf0:
>                 device->cname = "GK110";
>                 device->oclass[NVDEV_SUBDEV_VBIOS  ] = &nouveau_bios_oclass;
> --
> 1.9.1
>
> _______________________________________________
> Nouveau mailing list
> Nouveau@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/nouveau

* Re: [PATCH 00/12] drm/nouveau: support for GK20A, cont'd
  2014-03-24 13:19 ` [PATCH 00/12] drm/nouveau: support for GK20A, cont'd Lucas Stach
@ 2014-03-26  6:33       ` Alexandre Courbot
  0 siblings, 0 replies; 100+ messages in thread
From: Alexandre Courbot @ 2014-03-26  6:33 UTC (permalink / raw)
  To: Lucas Stach
  Cc: nouveau, Linux Kernel Mailing List, dri-devel, Ben Skeggs,
	linux-tegra

Hi Lucas,

On Mon, Mar 24, 2014 at 10:19 PM, Lucas Stach <l.stach@pengutronix.de> wrote:
> Hi Alexandre,
>
> Am Montag, den 24.03.2014, 17:42 +0900 schrieb Alexandre Courbot:
>> Hi everyone,
> [...]
>>
>> A few lines of hacks (not included here) are still needed to deal with cached
>> mappings triggering external aborts and CPU/GPU memory coherency issues, but I
>> hope to understand and address these issues next.
>
> For the coherency issue part you may want to look at my Nouveau on ARM
> series. Most of it never made it upstream, as I lacked the time to work
> further on this, but it solves the coherency issue from the kernel.

Oh, thanks for pointing this out, it will probably be most useful.
Shall I assume the patches at
https://www.mail-archive.com/nouveau@lists.freedesktop.org/msg13557.html
are up-to-date? Would you mind if I include the relevant patches of
yours in the next iteration of this series?

>
> It does so by doing the necessary manual cache flushes/invalidates on
> buffer access, so costs some performance. To avoid this you really want
> to get writecombined mappings into the kernel<->userspace interface.
> Simply mapping the pushbuf as WC/US has brought a 7% performance
> increase in OpenArena when I last tested this. This test was done with
> only one PCIe lane, so the perf increase may be even better with a more
> adequate interconnect.

Interestingly, if I allow writecombined mappings in the kernel I get
faults when attempting to read the mapped area:

[   78.074854] Unhandled fault: external abort on non-linefetch (0x1008) at 0xf003e010
...
[   78.337862] [<c03491a8>] (nouveau_bo_rd32) from [<c0346374>] (nouveau_fence_update+0x5c/0x80)
[   78.352536] [<c0346374>] (nouveau_fence_update) from [<c03463b0>] (nouveau_fence_done+0x18/0x28)
[   78.367531] [<c03463b0>] (nouveau_fence_done) from [<c02b852c>] (ttm_bo_wait+0x104/0x184)
[   78.381915] [<c02b852c>] (ttm_bo_wait) from [<c034c718>] (nouveau_gem_ioctl_cpu_prep+0x40/0xe8)
[   78.396849] [<c034c718>] (nouveau_gem_ioctl_cpu_prep) from [<c029fd5c>] (drm_ioctl+0x404/0x4b8)
[   78.411790] [<c029fd5c>] (drm_ioctl) from [<c0343960>] (nouveau_drm_ioctl+0x54/0x80)
[   78.425805] [<c0343960>] (nouveau_drm_ioctl) from [<c00ea5ec>] (do_vfs_ioctl+0x3f0/0x5bc)
[   78.440277] [<c00ea5ec>] (do_vfs_ioctl) from [<c00ea7ec>] (SyS_ioctl+0x34/0x5c)
[   78.453918] [<c00ea7ec>] (SyS_ioctl) from [<c000e5a0>] (ret_fast_syscall+0x0/0x30)

To avoid these I need to set the VRAM default_caching to
TTM_PL_FLAG_UNCACHED. It is not clear to me why this is needed. Since
the BOs are accessed through the BAR, they are correctly treated as IO
memory and mapped using ttm_bo_ioremap(), so it really seems to be
unhappy with the WC mapping itself.

Note that if I go ahead and force the use of pgprot_writecombine() in
ttm_io_prot() to get writecombined user-space mappings, pure DRM
programs that map a buffer and try to read it fail similarly, while
Mesa's glReadPixels() seems to be happy. I'm not sure what it does
differently here.

Cheers,
Alex.
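
For context on the "manual cache flushes/invalidates on buffer access"
mentioned in the quoted text, the toy model below shows why both operations
are needed when a CPU with a write-back cache shares memory with a device
that bypasses that cache. It is a user-space simulation under stated
assumptions (a single cached word, a GPU that reads and writes memory
directly); struct toy_buf and every function name are hypothetical and do
not correspond to kernel or nouveau APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One word of shared memory plus the CPU's cached copy of it. */
struct toy_buf {
	uint32_t mem;    /* what the GPU sees */
	uint32_t cached; /* what the CPU sees while the line is valid */
	bool valid;      /* the cache holds a copy */
	bool dirty;      /* the copy is newer than memory */
};

/* CPU store: lands in the cache only (write-back behaviour). */
void cpu_write(struct toy_buf *b, uint32_t v)
{
	assert(b != NULL);
	b->cached = v;
	b->valid = true;
	b->dirty = true;
}

/* CPU load: served from the cache when valid, else fetched from memory. */
uint32_t cpu_read(struct toy_buf *b)
{
	assert(b != NULL);
	if (!b->valid) {
		b->cached = b->mem;
		b->valid = true;
		b->dirty = false;
	}
	return b->cached;
}

/* Flush: write a dirty line back so that the GPU can see it. */
void cache_flush(struct toy_buf *b)
{
	assert(b != NULL);
	if (b->valid && b->dirty) {
		b->mem = b->cached;
		b->dirty = false;
	}
}

/* Invalidate: drop the cached copy so that the next load hits memory. */
void cache_invalidate(struct toy_buf *b)
{
	assert(b != NULL);
	b->valid = false;
	b->dirty = false;
}

/* GPU accesses bypass the CPU cache entirely in this model. */
uint32_t gpu_read(const struct toy_buf *b) { return b->mem; }
void gpu_write(struct toy_buf *b, uint32_t v) { b->mem = v; }
```

In this model a CPU write stays invisible to the GPU until cache_flush(),
and a GPU write stays invisible to the CPU until cache_invalidate().
Uncached or writecombined mappings avoid the problem by not keeping a cached
copy in the first place.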

* Re: [PATCH 00/12] drm/nouveau: support for GK20A, cont'd
  2014-03-26  6:33       ` Alexandre Courbot
@ 2014-03-26 10:33           ` Lucas Stach
  -1 siblings, 0 replies; 100+ messages in thread
From: Lucas Stach @ 2014-03-26 10:33 UTC (permalink / raw)
  To: Alexandre Courbot
  Cc: nouveau, Linux Kernel Mailing List, dri-devel, Ben Skeggs,
	linux-tegra

Hi Alexandre,

Am Mittwoch, den 26.03.2014, 15:33 +0900 schrieb Alexandre Courbot:
> Hi Lucas,
> 
> On Mon, Mar 24, 2014 at 10:19 PM, Lucas Stach <l.stach@pengutronix.de> wrote:
> > Hi Alexandre,
> >
> > Am Montag, den 24.03.2014, 17:42 +0900 schrieb Alexandre Courbot:
> >> Hi everyone,
> > [...]
> >>
> >> A few lines of hacks (not included here) are still needed to deal with cached
> >> mappings triggering external aborts and CPU/GPU memory coherency issues, but I
> >> hope to understand and address these issues next.
> >
> > For the coherency issue part you may want to look at my Nouveau on ARM
> > series. Most of it never made it upstream, as I lacked the time to work
> > further on this, but it solves the coherency issue from the kernel.
> 
> Oh, thanks for pointing this out, it will probably be most useful.
> Shall I assume the patches at
> https://www.mail-archive.com/nouveau@lists.freedesktop.org/msg13557.html
> are up-to-date? Would you mind if I include the relevant patches of
> yours in the next iteration of this series?
> 
> >
> > It does so by doing the necessary manual cache flushes/invalidates on
> > buffer access, so costs some performance. To avoid this you really want
> > to get writecombined mappings into the kernel<->userspace interface.
> > Simply mapping the pushbuf as WC/US has brought a 7% performance
> > increase in OpenArena when I last tested this. This test was done with
> > only one PCIe lane, so the perf increase may be even better with a more
> > adequate interconnect.
> 
> Interestingly if I allow writecombined mappings in the kernel I get
> faults when attempting to read the mapped area:
> 
This is most likely because your handling of those buffers produces
conflicting mappings (if my understanding of what you are doing is
right).

First you allocate memory from CMA without changing the pgprot flags.
This yields pages which are mapped uncached or cached (when movable
pages are purged from CMA to make space for your buffer) into the
kernel's linear space.

Later you regard this memory as iomem (it isn't!) and let TTM remap
those pages into the vmalloc area with pgprot set to writecombined.

I don't know exactly why this is causing havoc, but having two
conflicting virtual mappings of the same physical memory is documented
to at least produce undefined behavior on ARMv7.
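
As a rough illustration of the aliasing concern, the following user-space C
sketch (a toy model, not kernel code) flags two mappings that cover the same
physical range with different attributes. The names mem_attr, mapping and
has_conflicting_alias are invented for this example, and the three attributes
are a simplification of the real ARMv7 memory types.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical memory attributes; real ARMv7 page attributes are richer. */
enum mem_attr { ATTR_CACHED, ATTR_UNCACHED, ATTR_WRITECOMBINE };

/* One virtual mapping of the physical range [phys, phys + size). */
struct mapping {
	uint64_t phys;
	uint64_t size;
	enum mem_attr attr;
};

/* True if the two physical ranges share at least one byte. */
static bool ranges_overlap(const struct mapping *a, const struct mapping *b)
{
	return a->phys < b->phys + b->size && b->phys < a->phys + a->size;
}

/*
 * Return true if any two mappings alias the same physical memory with
 * different attributes, i.e. the situation described as unsafe above.
 */
static bool has_conflicting_alias(const struct mapping *maps, size_t n)
{
	assert(maps != NULL || n == 0);

	for (size_t i = 0; i < n; i++)
		for (size_t j = i + 1; j < n; j++)
			if (ranges_overlap(&maps[i], &maps[j]) &&
			    maps[i].attr != maps[j].attr)
				return true;
	return false;
}
```

In the scenario described above, the cached linear-map entry and the
writecombined vmalloc-area remap of the same pages would be reported as a
conflict by this model.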

Regards,
Lucas

> [   78.074854] Unhandled fault: external abort on non-linefetch
> (0x1008) at 0xf003e010
> ...
> [   78.337862] [<c03491a8>] (nouveau_bo_rd32) from [<c0346374>]
> (nouveau_fence_update+0x5c/0x80)
> [   78.352536] [<c0346374>] (nouveau_fence_update) from [<c03463b0>]
> (nouveau_fence_done+0x18/0x28)
> [   78.367531] [<c03463b0>] (nouveau_fence_done) from [<c02b852c>]
> (ttm_bo_wait+0x104/0x184)
> [   78.381915] [<c02b852c>] (ttm_bo_wait) from [<c034c718>]
> (nouveau_gem_ioctl_cpu_prep+0x40/0xe8)
> [   78.396849] [<c034c718>] (nouveau_gem_ioctl_cpu_prep) from
> [<c029fd5c>] (drm_ioctl+0x404/0x4b8)
> [   78.411790] [<c029fd5c>] (drm_ioctl) from [<c0343960>]
> (nouveau_drm_ioctl+0x54/0x80)
> [   78.425805] [<c0343960>] (nouveau_drm_ioctl) from [<c00ea5ec>]
> (do_vfs_ioctl+0x3f0/0x5bc)
> [   78.440277] [<c00ea5ec>] (do_vfs_ioctl) from [<c00ea7ec>]
> (SyS_ioctl+0x34/0x5c)
> [   78.453918] [<c00ea7ec>] (SyS_ioctl) from [<c000e5a0>]
> (ret_fast_syscall+0x0/0x30)
> 
> To avoid these I need to set the VRAM default_caching to
> TTM_PL_FLAG_UNCACHED. It is not clear to me why this is needed. The BO
> being accessed through the BAR, they are correctly considered as IO
> memory and mapped using ttm_bo_ioremap(), so it really seems to be
> unhappy with the WC mapping itself.
> 
> Note that if I go ahead and force the use of pgprot_writecombine() in
> ttm_io_prot() to get writecombined user-space mappings, pure DRM
> programs that map a buffer and try to read it fail similarly, while
> Mesa's glReadPixels() seems to be happy. I'm not sure what it does
> differently here.
> 
> Cheers,
> Alex.

-- 
Pengutronix e.K.                           | Lucas Stach                 |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-5076 |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [PATCH 00/12] drm/nouveau: support for GK20A, cont'd
  2014-03-26 10:33           ` Lucas Stach
@ 2014-03-27  3:50               ` Alexandre Courbot
  -1 siblings, 0 replies; 100+ messages in thread
From: Alexandre Courbot @ 2014-03-27  3:50 UTC (permalink / raw)
  To: Lucas Stach
  Cc: Alexandre Courbot, Ben Skeggs, nouveau,
	Linux Kernel Mailing List, dri-devel, linux-tegra

On Wed, Mar 26, 2014 at 7:33 PM, Lucas Stach <l.stach@pengutronix.de> wrote:
>> > It does so by doing the necessary manual cache flushes/invalidates on
>> > buffer access, so costs some performance. To avoid this you really want
>> > to get writecombined mappings into the kernel<->userspace interface.
>> > Simply mapping the pushbuf as WC/US has brought a 7% performance
>> > increase in OpenArena when I last tested this. This test was done with
>> > only one PCIe lane, so the perf increase may be even better with a more
>> > adequate interconnect.
>>
>> Interestingly if I allow writecombined mappings in the kernel I get
>> faults when attempting to read the mapped area:
>>
> This is most likely because your handling of those buffers produces
> conflicting mappings (if my understanding of what you are doing is
> right).
>
> First you allocate memory from CMA without changing the pgprot flags.
> This yields pages which are mapped uncached or cached (when movable
> pages are purged from CMA to make space for your buffer) into the
> kernel's linear space.
>
> Later you regard this memory as iomem (it isn't!) and let TTM remap
> those pages into the vmalloc area with pgprot set to writecombined.
>
> I don't know exactly why this is causing havoc, but having two
> conflicting virtual mappings of the same physical memory is documented
> to at least produce undefined behavior on ARMv7.

IIUC this is not exactly what happens with GK20A, so let me explain
how VRAM is currently accessed to make sure we are in sync.

VRAM pages are allocated by nvea_ram_get(), which allocates chunks of
contiguous memory using dma_alloc_from_contiguous(). At that time I
don't think the pages are mapped anywhere for the CPU to see (unlike
with dma_alloc_coherent(), for instance). Nouveau will then map the
memory into the GPU context's address space, but it is only when
nouveau_ttm_io_mem_reserve() is called that a BAR mapping is created,
making the memory accessible to the CPU through the BAR window (which
I consider to be I/O memory).

The area of the BAR window pointing to the VRAM is then mapped into the
kernel (using ioremap_wc() or ioremap_nocache()) or into user-space (where
ttm_io_prot() is called to get the pgprot_t to use). It is when this
mapping is writecombined that I get the faults.
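
The choice between the two kernel mapping flavours can be modelled with a
small user-space C sketch. This is only a model under stated assumptions: the
PL_FLAG_* values, map_kind and pick_bar_mapping are hypothetical names, not
TTM's actual definitions, and no real mapping is created.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical caching flags, loosely modelled on placement flags. */
#define PL_FLAG_UNCACHED (1u << 0)
#define PL_FLAG_WC       (1u << 1)

/* Flavour of CPU mapping to create for a range of the BAR window. */
enum map_kind { MAP_UNCACHED, MAP_WRITECOMBINE };

/*
 * Pick the mapping flavour for a buffer reached through the BAR window:
 * write-combined if the placement asks for it, uncached otherwise.
 */
static enum map_kind pick_bar_mapping(uint32_t placement)
{
	/* At least one caching mode must have been requested. */
	assert(placement & (PL_FLAG_UNCACHED | PL_FLAG_WC));

	return (placement & PL_FLAG_WC) ? MAP_WRITECOMBINE : MAP_UNCACHED;
}
```

With the workaround of forcing the VRAM default caching to uncached, the
write-combined branch of this model would simply never be taken.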

So as far as I can tell, at most one CPU mapping exists at any
time for VRAM, and it goes through the BAR to access the actual
physical memory. It would probably be faster and more logical to map
the RAM directly so the CPU can address it, but going through the BAR
reduces CPU/GPU synchronization issues and there are a few cases where
we would need to map through the BAR anyway (e.g. tiled memory to be
made linear for the CPU).

I don't know if that helps in understanding what the issue might be - I
just wanted to make sure we are talking about the same thing. :)

Thanks,
Alex.

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [PATCH 04/12] drm/nouveau/bar/nvc0: support chips without BAR3
  2014-03-24 22:10       ` Thierry Reding
@ 2014-04-02 13:47         ` Alexandre Courbot
  -1 siblings, 0 replies; 100+ messages in thread
From: Alexandre Courbot @ 2014-04-02 13:47 UTC (permalink / raw)
  To: Thierry Reding
  Cc: Alexandre Courbot, Ben Skeggs, nouveau, dri-devel, linux-tegra,
	Linux Kernel Mailing List

On Tue, Mar 25, 2014 at 7:10 AM, Thierry Reding
<thierry.reding@gmail.com> wrote:
> On Mon, Mar 24, 2014 at 05:42:26PM +0900, Alexandre Courbot wrote:
> [...]
>> diff --git a/drivers/gpu/drm/nouveau/core/subdev/bar/nvc0.c b/drivers/gpu/drm/nouveau/core/subdev/bar/nvc0.c
> [...]
>>  static int
>> -nvc0_bar_ctor(struct nouveau_object *parent, struct nouveau_object *engine,
>> -           struct nouveau_oclass *oclass, void *data, u32 size,
>> -           struct nouveau_object **pobject)
>> +nvc0_bar_init_vm(struct nvc0_bar_priv *priv, int nr, int bar)
>>  {
> [...]
>> -     /* BAR3 */
>>       ret = nouveau_gpuobj_new(nv_object(priv), NULL, 0x1000, 0, 0,
>> -                             &priv->bar[0].mem);
>> -     mem = priv->bar[0].mem;
>> +                             &priv->bar[nr].mem);
>> +     mem = priv->bar[nr].mem;
>>       if (ret)
>>               return ret;
>>
>>       ret = nouveau_gpuobj_new(nv_object(priv), NULL, 0x8000, 0, 0,
>> -                             &priv->bar[0].pgd);
>> +                             &priv->bar[nr].pgd);
>>       if (ret)
>>               return ret;
> [...]
>> +static int
>> +nvc0_bar_ctor(struct nouveau_object *parent, struct nouveau_object *engine,
>> +           struct nouveau_oclass *oclass, void *data, u32 size,
>> +           struct nouveau_object **pobject)
>> +{
> [...]
>> +     /* BAR3 */
>> +     if (has_bar3) {
>> +             ret = nvc0_bar_init_vm(priv, 0, 3);
> [...]
>> +     /* BAR1 */
>> +     ret = nvc0_bar_init_vm(priv, 1, 1);
>>       if (ret)
>>               return ret;
>
> The calls to nvc0_bar_init_vm() are somewhat confusing in my opinion. It
> is hard to see from the invocation what these numbers mean and therefore
> distinguish which parameter is which.
>
> Perhaps a slightly more readable way would be to pass in a pointer to a
> structure as second parameter instead of the index into an array. So
> it'd look somewhat like this:
>
>         if (has_bar3) {
>                 ret = nvc0_bar_init_vm(priv, &priv->bar[0], 3);
>                 ...
>         }
>         ...
>         ret = nvc0_bar_init_vm(priv, &priv->bar[1], 1);
>         ...
>
> Unfortunately that would require a new type to be created for the bar[]
> structures, so it'd be slightly more intrusive.

These types are local to nvc0.c anyway, so I don't think it would
hurt. And you are right that the code would become more readable as a
result: passing array indexes as arguments is not a common practice
(and should not be).
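
A minimal sketch of what that could look like, written as stand-alone C
rather than actual Nouveau code. The type names bar_vm, bar_priv and gpuobj,
and the functions bar_init_vm and bar_ctor, are placeholders chosen for this
example; only the 0x1000 and 0x8000 sizes and the BAR3/BAR1 ordering come
from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the driver's GPU object. */
struct gpuobj {
	size_t size;
};

/* Named type for one BAR's state, replacing an anonymous bar[] element. */
struct bar_vm {
	struct gpuobj mem;
	struct gpuobj pgd;
	int bar_nr;	/* hardware BAR number (3 or 1) */
};

struct bar_priv {
	struct bar_vm bar[2];
};

/* Set up one BAR; the caller says which slot and which hardware BAR. */
static int bar_init_vm(struct bar_priv *priv, struct bar_vm *vm, int bar_nr)
{
	assert(priv != NULL && vm != NULL);

	vm->mem.size = 0x1000;	/* instance memory block */
	vm->pgd.size = 0x8000;	/* page directory */
	vm->bar_nr = bar_nr;
	return 0;
}

/* Constructor: BAR3 is optional, BAR1 is always initialised. */
static int bar_ctor(struct bar_priv *priv, bool has_bar3)
{
	int ret;

	if (has_bar3) {
		ret = bar_init_vm(priv, &priv->bar[0], 3);
		if (ret)
			return ret;
	}

	return bar_init_vm(priv, &priv->bar[1], 1);
}
```

At the call sites it is now visible which slot is being filled and which
hardware BAR it corresponds to, without having to look up the meaning of two
bare integers.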

Alex.

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [PATCH 06/12] drm/nouveau/ibus: add GK20A support
  2014-03-24 22:34         ` Thierry Reding
@ 2014-04-02 13:52           ` Alexandre Courbot
  -1 siblings, 0 replies; 100+ messages in thread
From: Alexandre Courbot @ 2014-04-02 13:52 UTC (permalink / raw)
  To: Thierry Reding
  Cc: nouveau, Linux Kernel Mailing List, dri-devel, Ben Skeggs,
	linux-tegra

On Tue, Mar 25, 2014 at 7:34 AM, Thierry Reding
<thierry.reding@gmail.com> wrote:
> On Mon, Mar 24, 2014 at 05:42:28PM +0900, Alexandre Courbot wrote:
> [...]
>> diff --git a/drivers/gpu/drm/nouveau/core/subdev/ibus/nvea.c b/drivers/gpu/drm/nouveau/core/subdev/ibus/nvea.c
> [...]
>> +#include <subdev/ibus.h>
>> +
>> +struct nvea_ibus_priv {
>> +     struct nouveau_ibus base;
>> +};
>> +
>> +static void
>> +nvea_ibus_init_priv_ring(struct nvea_ibus_priv *priv)
>> +{
>> +     nv_mask(priv, 0x137250, 0x3f, 0);
>> +
>> +     nv_mask(priv, 0x000200, 0x20, 0);
>> +     udelay(20);
>
> usleep_range()?

Sure.

>
>> +static void
>> +nvea_ibus_intr(struct nouveau_subdev *subdev)
>> +{
> [...]
>> +     /* Acknowledge interrupt */
>> +     nv_mask(priv, 0x12004c, 0x2, 0x2);
>> +
>> +     while (--retry >= 0) {
>> +             command = nv_rd32(priv, 0x12004c) & 0x3f;
>> +             if (command == 0)
>> +                     break;
>> +     }
>> +
>> +     if (retry < 0)
>> +             nv_warn(priv, "timeout waiting for ringmaster ack\n");
>> +}
>
> Perhaps I'm being paranoid, but this loop now depends on the frequency
> of the various clocks involved and therefore might break at some point
> if the frequencies get sufficiently high.
>
> So a slightly safer implementation would use a proper timeout using a
> combination of msecs_to_jiffies(), time_before() and usleep_range(),
> like so:
>
>         timeout = jiffies + msecs_to_jiffies(...);
>
>         while (time_before(jiffies, timeout)) {
>                 command = nv_rd32(...) & 0x3f;
>                 if (command == 0)
>                         break;
>
>                 usleep_range(...);
>         }
>
>         if (time_after(jiffies, timeout))
>                 nv_warn(...);

Right, now that I look at this code again I don't even understand why
I left it this way. Maybe I let some early test code slip into the
final patch, sorry about that.
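
For reference, the deadline-based polling pattern suggested above can be
sketched in user-space C, with clock_gettime() and nanosleep() standing in
for jiffies and usleep_range(). Everything here is an assumption for
illustration: poll_until_clear, read_reg_fn and the fake_reg simulated
register are invented names, and the timeout values are arbitrary since the
real ack latency is not known.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical register read callback; stands in for nv_rd32(). */
typedef uint32_t (*read_reg_fn)(void *ctx);

/* Current monotonic time in nanoseconds. */
static int64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/*
 * Poll until (register & mask) == 0 or timeout_ms has elapsed, sleeping
 * sleep_us between reads. Returns true on success, false on timeout.
 */
static bool poll_until_clear(read_reg_fn read_reg, void *ctx, uint32_t mask,
			     unsigned int timeout_ms, unsigned int sleep_us)
{
	struct timespec nap;
	int64_t deadline;

	assert(read_reg != NULL);
	assert(sleep_us < 1000000u);	/* tv_nsec must stay below 1e9 */

	nap.tv_sec = 0;
	nap.tv_nsec = (long)sleep_us * 1000L;
	deadline = now_ns() + (int64_t)timeout_ms * 1000000LL;

	for (;;) {
		/* Sample the clock before the read, not after it. */
		bool expired = now_ns() >= deadline;

		if ((read_reg(ctx) & mask) == 0)
			return true;
		if (expired)
			return false;
		nanosleep(&nap, NULL);
	}
}

/* Simulated register for demonstration: reads as busy for a few polls. */
struct fake_reg {
	uint32_t busy_value;
	int busy_reads;
};

static uint32_t fake_read(void *ctx)
{
	struct fake_reg *reg = ctx;

	if (reg->busy_reads > 0) {
		reg->busy_reads--;
		return reg->busy_value;
	}
	return 0;
}
```

The deadline is sampled before each register read so that at least one read
always happens after the deadline has passed, and the result comes from that
read rather than from a second look at the clock.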

> This assumes that there's some known timeout after which the ringmaster
> is expected to have acked the interrupt. On that note, I wonder if the
> warning is accurate here: it's my understanding that writing 0x2 to the
> register does acknowledge the interrupt, so the ringmaster does in fact
> "clear" it rather than "acknowledge" it, doesn't it?
>
> Although now that I mention it I seem to remember that this write is
> actually sending a command to the ring master and perhaps waiting for
> the register to return to 0 is indeed waiting for an ACK of sorts. Maybe
> adding a comment or so describing what this sequence does would be
> appropriate here?

Can we do that from an IP point of view? AFAIK this sequence has never
been publicly documented.

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [PATCH 08/12] drm/nouveau/graph: enable when using external firmware
  2014-03-26  4:21           ` Ben Skeggs
@ 2014-04-02 13:53             ` Alexandre Courbot
  -1 siblings, 0 replies; 100+ messages in thread
From: Alexandre Courbot @ 2014-04-02 13:53 UTC (permalink / raw)
  To: Ben Skeggs; +Cc: nouveau, linux-kernel, dri-devel, Ben Skeggs, linux-tegra

On Wed, Mar 26, 2014 at 1:21 PM, Ben Skeggs <skeggsb@gmail.com> wrote:
> On Tue, Mar 25, 2014 at 8:58 AM, Thierry Reding
> <thierry.reding@gmail.com> wrote:
>> On Mon, Mar 24, 2014 at 05:42:30PM +0900, Alexandre Courbot wrote:
>> [...]
>>> diff --git a/drivers/gpu/drm/nouveau/core/engine/graph/nvc0.c b/drivers/gpu/drm/nouveau/core/engine/graph/nvc0.c
>>> index 6ef8bf181b2d..f997a18f5760 100644
>>> --- a/drivers/gpu/drm/nouveau/core/engine/graph/nvc0.c
>>> +++ b/drivers/gpu/drm/nouveau/core/engine/graph/nvc0.c
>>> @@ -1133,10 +1133,14 @@ nvc0_graph_ctor(struct nouveau_object *parent, struct nouveau_object *engine,
>>>       struct nvc0_graph_oclass *oclass = (void *)bclass;
>>>       struct nouveau_device *device = nv_device(parent);
>>>       struct nvc0_graph_priv *priv;
>>> +     bool use_fw;
>>
>> Perhaps "ext_fw" or "use_ext_fw" would be more accurate.
>>
>>>       int ret, i;
>>>
>>> +     use_fw = nouveau_boolopt(device->cfgopt, "NvGrUseFW", false);
>>> +
>>>       ret = nouveau_graph_create(parent, engine, bclass,
>>> -                                (oclass->fecs.ucode != NULL), &priv);
>>> +                                (oclass->fecs.ucode != NULL) || use_fw,
>>> +                                &priv);
>>
>> Or perhaps a more intuitive way would be to name the variable "enable"
>> and have something like:
>>
>>         if (!nouveau_boolopt(device->cfgopt, "NvGrUseFW", false))
>>                 enable = oclass->fecs.ucode != NULL;
>>         else
>>                 enable = true;
>>
>>         ret = nouveau_graph_create(parent, engine, bclass, enable, &priv);
>>         ...
> Agreed, this looks a lot nicer.

Will fix that, thanks!
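
For reference, the two ways of computing the "enable" argument discussed above can be compared side by side. This is a minimal standalone sketch, not code from the patch: `enable_patch()` and `enable_review()` are hypothetical helpers, `ucode` stands in for `oclass->fecs.ucode`, and `use_ext_fw` stands in for the result of the "NvGrUseFW" option lookup.

```c
#include <stdbool.h>
#include <stddef.h>

/* Form used in the patch: a single boolean expression. */
static bool enable_patch(const void *ucode, bool use_ext_fw)
{
	return (ucode != NULL) || use_ext_fw;
}

/* Form suggested in review: explicit branches on the option. */
static bool enable_review(const void *ucode, bool use_ext_fw)
{
	bool enable;

	if (!use_ext_fw)
		enable = ucode != NULL;	/* built-in microcode decides */
	else
		enable = true;		/* external firmware always enables */

	return enable;
}
```

Both helpers return the same value for every input combination, so the choice between them is purely about readability.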

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [Nouveau] [PATCH 09/12] drm/nouveau/graph: pad firmware code at load time
  2014-03-26  4:22         ` [Nouveau] " Ben Skeggs
@ 2014-04-02 13:54             ` Alexandre Courbot
  -1 siblings, 0 replies; 100+ messages in thread
From: Alexandre Courbot @ 2014-04-02 13:54 UTC (permalink / raw)
  To: Ben Skeggs
  Cc: Alexandre Courbot, linux-tegra-u79uwXL29TY76Z2rM5mHXA,
	nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Ben Skeggs,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA

On Wed, Mar 26, 2014 at 1:22 PM, Ben Skeggs <skeggsb-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> On Mon, Mar 24, 2014 at 6:42 PM, Alexandre Courbot <acourbot-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org> wrote:
>> Pad the microcode to a multiple of 0x40, otherwise firmware will fail to
>> run from non-prepadded firmware files.
>>
>> Signed-off-by: Alexandre Courbot <acourbot-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
>> ---
>>  drivers/gpu/drm/nouveau/core/engine/graph/nvc0.c | 4 ++++
>>  1 file changed, 4 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/nouveau/core/engine/graph/nvc0.c b/drivers/gpu/drm/nouveau/core/engine/graph/nvc0.c
>> index f997a18f5760..367e72daf8b1 100644
>> --- a/drivers/gpu/drm/nouveau/core/engine/graph/nvc0.c
>> +++ b/drivers/gpu/drm/nouveau/core/engine/graph/nvc0.c
>> @@ -768,6 +768,10 @@ nvc0_graph_init_fw(struct nvc0_graph_priv *priv, u32 fuc_base,
>>                         nv_wr32(priv, fuc_base + 0x0188, i >> 6);
>>                 nv_wr32(priv, fuc_base + 0x0184, code->data[i]);
>>         }
>> +
>> +       /* code must be padded to 0x40 */
>> +       for (; i < (((code->size / 4) + 0x3f) & ~0x3f); i++)
> "for (; i & 0x3f; i++)" would work just as well :)

Indeed. >_<
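
As a side note, the equivalence of the two padding loops can be checked outside the kernel. The following is a minimal userspace sketch, not code from the patch: `pad_end_patch()` and `pad_end_review()` are hypothetical helper names, and `words` stands in for `code->size / 4`, i.e. the value of `i` once the copy loop has finished.

```c
#include <stdint.h>

/* Index at which the padding loop from the original patch stops. */
static uint32_t pad_end_patch(uint32_t words)
{
	uint32_t i = words;	/* value of i after the copy loop */

	for (; i < ((words + 0x3f) & ~0x3fu); i++)
		;		/* the driver would write one padding word here */

	return i;
}

/* Index at which the shorter loop suggested in review stops. */
static uint32_t pad_end_review(uint32_t words)
{
	uint32_t i = words;

	for (; i & 0x3f; i++)
		;		/* same padding write, simpler condition */

	return i;
}
```

Both helpers return the index at which padding stops. For example, 65 words are padded up to 128 (0x80) by either loop, and a count that is already a multiple of 0x40 is left as is. The shorter condition is sufficient only because `i` already sits at the end of the copied data when the padding loop starts.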

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [PATCH 10/12] drm/nouveau/graph: add GK20A support
  2014-03-26  4:24         ` Ben Skeggs
@ 2014-04-02 14:03           ` Alexandre Courbot
  -1 siblings, 0 replies; 100+ messages in thread
From: Alexandre Courbot @ 2014-04-02 14:03 UTC (permalink / raw)
  To: Ben Skeggs; +Cc: nouveau, linux-kernel, dri-devel, Ben Skeggs, linux-tegra

On Wed, Mar 26, 2014 at 1:24 PM, Ben Skeggs <skeggsb@gmail.com> wrote:
> On Mon, Mar 24, 2014 at 6:42 PM, Alexandre Courbot <acourbot@nvidia.com> wrote:
>> Add a GR device for GK20A based on NVE4, with the correct classes
>> definitions (GK20A's 3D class is 0xa297).
>>
>> Most of the NVE4 code can be used on GK20A, so make relevant bits of
>> NVE4 available to other chips as well.
> This will need a bit of a rebase on top of the tree I mentioned
> earlier (also queued for drm-next now), where I've further split out
> and named the various chunks of state.

Will do that.

>
> Does GK104 match entirely correctly, or just happen to work?  I could
> probably hunt down the GK20A netlist images and check that actually :)

Do you mean, the init sequence? I haven't checked in detail (we are
certainly doing things differently in the non-DRM driver), but the
registers seem to match and the GPU is able to render after that. I
admit I have not looked much further for now.

The only register that does not exist on GK20A is 0x260, but when
accessing it Nouveau will be able to continue unharmed after a memory
fault.

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [PATCH 11/12] drm/nouveau: support GK20A in nouveau_accel_init()
  2014-03-26  4:27           ` Ben Skeggs
  (?)
@ 2014-04-02 14:14           ` Alexandre Courbot
  2014-04-02 14:23             ` [Nouveau] " Ilia Mirkin
  -1 siblings, 1 reply; 100+ messages in thread
From: Alexandre Courbot @ 2014-04-02 14:14 UTC (permalink / raw)
  To: Ben Skeggs
  Cc: Thierry Reding, Alexandre Courbot, nouveau, linux-kernel,
	dri-devel, Ben Skeggs, linux-tegra

On Wed, Mar 26, 2014 at 1:27 PM, Ben Skeggs <skeggsb@gmail.com> wrote:
> On Tue, Mar 25, 2014 at 9:10 AM, Thierry Reding
> <thierry.reding@gmail.com> wrote:
>> On Mon, Mar 24, 2014 at 05:42:33PM +0900, Alexandre Courbot wrote:
>>> GK20A does not embed a dedicated COPY engine and thus cannot allocate
>>> the copy channel that nouveau_accel_init() attempts to create. It also
>>> lacks any display hardware, so the creation of a software channel does
>>> not apply either.
>>
>> Perhaps this should be two separate patches?
>>
>>> diff --git a/drivers/gpu/drm/nouveau/nouveau_drm.c b/drivers/gpu/drm/nouveau/nouveau_drm.c
>> [...]
>>> +     if (device->chipset == 0xea) {
>>> +             /* gk20a does not have CE0/CE1 */
>>
>> This would be another good candidate for a feature flag.
> There are ways to query this in a chipset-independent way.  However,
> despite reporting it as an error if no copy engines are available, the
> code should continue on without the channel happily.  Perhaps we can
> just punt the relevant error messages to a debug loglevel for now?

Sure, that would be more future-proof as well.

>
>>
>>> +             arg0 = NVE0_CHANNEL_IND_ENGINE_GR;
>>> +             arg1 = 1;
>>> +     } else
>>>       if (device->card_type >= NV_E0) {
>>
>> The formatting here is somewhat weird. From a brief look I couldn't find
>> any indication that nouveau deviates from the standard coding style, so
>> this should be:
>>
>>         } else if (...) {
> I use the former in a few places, despite it not entirely being
> "correct".. It looks nicer though :)  I don't mind either way though.

Yeah, I just followed the style of the file here. Whether it needs to
change or not is not my call. :P

>>> +     /* Need to figure out how to handle sw for gk20a */
>>> +     if (device->chipset == 0xea)
>>> +             goto skip_sw_init;
>>
>> The commit message makes it sound like SW isn't needed since gk20a
>> "lacks any display hardware". In that case the comment here doesn't make
>> much sense.

Correct. As far as I have looked (that is, not very far), SW methods
are used for display-related functions, but there might be other
use-cases too. Maybe someone who knows better can confirm?

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [Nouveau] [PATCH 06/12] drm/nouveau/ibus: add GK20A support
  2014-04-02 13:52           ` Alexandre Courbot
@ 2014-04-02 14:18               ` Ilia Mirkin
  -1 siblings, 0 replies; 100+ messages in thread
From: Ilia Mirkin @ 2014-04-02 14:18 UTC (permalink / raw)
  To: Alexandre Courbot
  Cc: Thierry Reding, nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	Linux Kernel Mailing List,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Ben Skeggs,
	linux-tegra-u79uwXL29TY76Z2rM5mHXA

On Wed, Apr 2, 2014 at 9:52 AM, Alexandre Courbot <gnurou-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> On Tue, Mar 25, 2014 at 7:34 AM, Thierry Reding
> <thierry.reding-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
>> On Mon, Mar 24, 2014 at 05:42:28PM +0900, Alexandre Courbot wrote:
>> [...]
>>> diff --git a/drivers/gpu/drm/nouveau/core/subdev/ibus/nvea.c b/drivers/gpu/drm/nouveau/core/subdev/ibus/nvea.c
>> [...]
>>> +#include <subdev/ibus.h>
>>> +
>>> +struct nvea_ibus_priv {
>>> +     struct nouveau_ibus base;
>>> +};
>>> +
>>> +static void
>>> +nvea_ibus_init_priv_ring(struct nvea_ibus_priv *priv)
>>> +{
>>> +     nv_mask(priv, 0x137250, 0x3f, 0);
>>> +
>>> +     nv_mask(priv, 0x000200, 0x20, 0);
>>> +     udelay(20);
>>
>> usleep_range()?
>
> Sure.
>
>>
>>> +static void
>>> +nvea_ibus_intr(struct nouveau_subdev *subdev)
>>> +{
>> [...]
>>> +     /* Acknowledge interrupt */
>>> +     nv_mask(priv, 0x12004c, 0x2, 0x2);
>>> +
>>> +     while (--retry >= 0) {
>>> +             command = nv_rd32(priv, 0x12004c) & 0x3f;
>>> +             if (command == 0)
>>> +                     break;
>>> +     }
>>> +
>>> +     if (retry < 0)
>>> +             nv_warn(priv, "timeout waiting for ringmaster ack\n");
>>> +}
>>
>> Perhaps I'm being paranoid, but this loop now depends on the frequency
>> of the various clocks involved and therefore might break at some point
>> if the frequencies get sufficiently high.
>>
>> So a slightly safer implementation would use a proper timeout using a
>> combination of msecs_to_jiffies(), time_before() and usleep_range(),
>> like so:
>>
>>         timeout = jiffies + msecs_to_jiffies(...);
>>
>>         while (time_before(jiffies, timeout)) {
>>                 command = nv_rd32(...) & 0x3f;
>>                 if (command == 0)
>>                         break;
>>
>>                 usleep_range(...);
>>         }
>>
>>         if (time_after(jiffies, timeout))
>>                 nv_warn(...);
>
> Right, now that I look at this code again I don't even understand why
> I left it this way. Maybe I let some early test code slip into the
> final patch, sorry about that.

I just remembered about this, but there's also the nv_wait() macro,
which you could use, e.g.

if (!nv_wait(subdev, 0x12004c, 0x3f, 0x00))
  nv_warn(...)

It has built-in timeout logic/etc (although no sleeps in the middle).
It does use the timer subdev, so if that's not operational at this
point, you can't use it.

>
>> This assumes that there's some known timeout after which the ringmaster
>> is expected to have acked the interrupt. On that note, I wonder if the
>> warning is accurate here: it's my understanding that writing 0x2 to the
>> register does acknowledge the interrupt, so the ringmaster does in fact
>> "clear" it rather than "acknowledge" it, doesn't it?
>>
>> Although now that I mention it I seem to remember that this write is
>> actually sending a command to the ring master and perhaps waiting for
>> the register to return to 0 is indeed waiting for an ACK of sorts. Maybe
>> adding a comment or so describing what this sequence does would be
>> appropriate here?
>
> Can we from an IP point of view? AFAIK this sequence has never been
> publicly documented.
> _______________________________________________
> Nouveau mailing list
> Nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org
> http://lists.freedesktop.org/mailman/listinfo/nouveau

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [Nouveau] [PATCH 12/12] drm/nouveau: support for probing GK20A
  2014-03-26  4:28       ` Ben Skeggs
@ 2014-04-02 14:19         ` Alexandre Courbot
  -1 siblings, 0 replies; 100+ messages in thread
From: Alexandre Courbot @ 2014-04-02 14:19 UTC (permalink / raw)
  To: Ben Skeggs; +Cc: nouveau, linux-kernel, dri-devel, Ben Skeggs, linux-tegra

On Wed, Mar 26, 2014 at 1:28 PM, Ben Skeggs <skeggsb@gmail.com> wrote:
> On Mon, Mar 24, 2014 at 6:42 PM, Alexandre Courbot <acourbot@nvidia.com> wrote:
>> Set the correct subdev/engine classes when GK20A (0xea) is probed.
>>
>> Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
>> ---
>>  drivers/gpu/drm/nouveau/core/engine/device/nve0.c | 20 ++++++++++++++++++++
>>  1 file changed, 20 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/nouveau/core/engine/device/nve0.c b/drivers/gpu/drm/nouveau/core/engine/device/nve0.c
>> index 987edbc30a09..8509dd57de1f 100644
>> --- a/drivers/gpu/drm/nouveau/core/engine/device/nve0.c
>> +++ b/drivers/gpu/drm/nouveau/core/engine/device/nve0.c
>> @@ -156,6 +156,26 @@ nve0_identify(struct nouveau_device *device)
>>                 device->oclass[NVDEV_ENGINE_PPP    ] = &nvc0_ppp_oclass;
>>                 device->oclass[NVDEV_ENGINE_PERFMON] = &nve0_perfmon_oclass;
>>                 break;
>> +       case 0xea:
>> +               device->cname = "GK20A";
>> +               device->oclass[NVDEV_SUBDEV_MC     ] =  nvc3_mc_oclass;
>> +               device->oclass[NVDEV_SUBDEV_BUS    ] =  nvc0_bus_oclass;
>> +               device->oclass[NVDEV_SUBDEV_TIMER  ] = &nv04_timer_oclass;
> As per note on the PTIMER patch, can just switch this to
> "gk20a_timer_oclass" on the latest code.

Just saw your patch and will try to use it for the next version (if I
can figure out how to reapply it to a Linux tree).

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [PATCH 06/12] drm/nouveau/ibus: add GK20A support
  2014-04-02 14:18               ` Ilia Mirkin
@ 2014-04-02 14:22                   ` Alexandre Courbot
  -1 siblings, 0 replies; 100+ messages in thread
From: Alexandre Courbot @ 2014-04-02 14:22 UTC (permalink / raw)
  To: Ilia Mirkin
  Cc: nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	Linux Kernel Mailing List,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Ben Skeggs,
	linux-tegra-u79uwXL29TY76Z2rM5mHXA

On Wed, Apr 2, 2014 at 11:18 PM, Ilia Mirkin <imirkin-FrUbXkNCsVf2fBVCVOL8/A@public.gmane.org> wrote:
> On Wed, Apr 2, 2014 at 9:52 AM, Alexandre Courbot <gnurou-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
>> On Tue, Mar 25, 2014 at 7:34 AM, Thierry Reding
>> <thierry.reding-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
>>> On Mon, Mar 24, 2014 at 05:42:28PM +0900, Alexandre Courbot wrote:
>>> [...]
>>>> diff --git a/drivers/gpu/drm/nouveau/core/subdev/ibus/nvea.c b/drivers/gpu/drm/nouveau/core/subdev/ibus/nvea.c
>>> [...]
>>>> +#include <subdev/ibus.h>
>>>> +
>>>> +struct nvea_ibus_priv {
>>>> +     struct nouveau_ibus base;
>>>> +};
>>>> +
>>>> +static void
>>>> +nvea_ibus_init_priv_ring(struct nvea_ibus_priv *priv)
>>>> +{
>>>> +     nv_mask(priv, 0x137250, 0x3f, 0);
>>>> +
>>>> +     nv_mask(priv, 0x000200, 0x20, 0);
>>>> +     udelay(20);
>>>
>>> usleep_range()?
>>
>> Sure.
>>
>>>
>>>> +static void
>>>> +nvea_ibus_intr(struct nouveau_subdev *subdev)
>>>> +{
>>> [...]
>>>> +     /* Acknowledge interrupt */
>>>> +     nv_mask(priv, 0x12004c, 0x2, 0x2);
>>>> +
>>>> +     while (--retry >= 0) {
>>>> +             command = nv_rd32(priv, 0x12004c) & 0x3f;
>>>> +             if (command == 0)
>>>> +                     break;
>>>> +     }
>>>> +
>>>> +     if (retry < 0)
>>>> +             nv_warn(priv, "timeout waiting for ringmaster ack\n");
>>>> +}
>>>
>>> Perhaps I'm being paranoid, but this loop now depends on the frequency
>>> of the various clocks involved and therefore might break at some point
>>> if the frequencies get sufficiently high.
>>>
>>> So a slightly safer implementation would use a proper timeout using a
>>> combination of msecs_to_jiffies(), time_before() and usleep_range(),
>>> like so:
>>>
>>>         timeout = jiffies + msecs_to_jiffies(...);
>>>
>>>         while (time_before(jiffies, timeout)) {
>>>                 command = nv_rd32(...) & 0x3f;
>>>                 if (command == 0)
>>>                         break;
>>>
>>>                 usleep_range(...);
>>>         }
>>>
>>>         if (time_after(jiffies, timeout))
>>>                 nv_warn(...);
>>
>> Right, now that I look at this code again I don't even understand why
>> I left it this way. Maybe I let some early test code slip into the
>> final patch, sorry about that.
>
> I just remembered about this, but there's also the nv_wait() macro,
> which you could use, e.g.
>
> if (!nv_wait(subdev, 0x12004c, 0x3f, 0x00))
>   nv_warn(...)
>
> It has built-in timeout logic/etc (although no sleeps in the middle).
> It does use the timer subdev, so if that's not operational at this
> point, you can't use it.

IBUS comes after TIMER in the nv_subdev_type enum, so I guess that
should work. Will try this solution, thanks!
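
To illustrate the general idea behind the time-bounded polling discussed above, here is a minimal userspace sketch. It is not the implementation of nv_wait() and does not use any kernel API: `poll_reg()`, `now_ns()`, `read_reg_fn` and the two fake readers are hypothetical names introduced only for this example, and the register read is replaced by a callback. The point is that the loop is bounded by elapsed time rather than by an iteration count, so its duration does not depend on how fast the CPU or bus runs.

```c
#define _POSIX_C_SOURCE 199309L
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical register accessor; stands in for a real register read. */
typedef uint32_t (*read_reg_fn)(void *ctx);

/* Monotonic clock in nanoseconds. */
static uint64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/*
 * Busy-poll until (reg & mask) == value or timeout_ns has elapsed.
 * Returns true on match, false on timeout.
 */
static bool poll_reg(read_reg_fn rd, void *ctx, uint32_t mask,
		     uint32_t value, uint64_t timeout_ns)
{
	const uint64_t deadline = now_ns() + timeout_ns;

	do {
		if ((rd(ctx) & mask) == value)
			return true;
	} while (now_ns() < deadline);

	/* Final read, so a match right at the deadline is not lost. */
	return (rd(ctx) & mask) == value;
}

/* Fake register that counts down to 0, like a command being consumed. */
static uint32_t fake_countdown(void *ctx)
{
	uint32_t *reg = ctx;

	if (*reg)
		(*reg)--;
	return *reg;
}

/* Fake register that never changes, to exercise the timeout path. */
static uint32_t fake_stuck(void *ctx)
{
	return *(uint32_t *)ctx;
}
```

The result is derived from the register value rather than from re-reading the clock after the loop, which avoids reporting a timeout when the condition became true just as the deadline passed. A caller would pass a mask of 0x3f and a value of 0 to mirror the check in the patch, and warn only when the function returns false.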

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [Nouveau] [PATCH 11/12] drm/nouveau: support GK20A in nouveau_accel_init()
  2014-04-02 14:14           ` Alexandre Courbot
@ 2014-04-02 14:23             ` Ilia Mirkin
  2014-04-02 23:14               ` Ben Skeggs
  0 siblings, 1 reply; 100+ messages in thread
From: Ilia Mirkin @ 2014-04-02 14:23 UTC (permalink / raw)
  To: Alexandre Courbot
  Cc: Ben Skeggs, nouveau, linux-kernel, dri-devel, Ben Skeggs, linux-tegra

On Wed, Apr 2, 2014 at 10:14 AM, Alexandre Courbot <gnurou@gmail.com> wrote:
>>>> +     /* Need to figure out how to handle sw for gk20a */
>>>> +     if (device->chipset == 0xea)
>>>> +             goto skip_sw_init;
>>>
>>> The commit message makes it sound like SW isn't needed since gk20a
>>> "lacks any display hardware". In that case the comment here doesn't make
>>> much sense.
>
> Correct. As far as I have looked (that is, not very far), SW methods
> are used for display-related functions, but there might be other
> use-cases too. Maybe someone who knows better can confirm?

http://cgit.freedesktop.org/~darktama/nouveau/tree/nvkm/engine/software/nvc0.c

Take a look at nvc0_software_mthd_mp_control -- that's used in the
mesa driver to... well, control those things :) They're related to
PGRAPH, which should be available on the GK20A. [Not sure what they do
though. One of them is about turning off error reporting.]

  -ilia

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [PATCH 10/12] drm/nouveau/graph: add GK20A support
  2014-04-02 14:03           ` Alexandre Courbot
@ 2014-04-02 23:11             ` Ben Skeggs
  -1 siblings, 0 replies; 100+ messages in thread
From: Ben Skeggs @ 2014-04-02 23:11 UTC (permalink / raw)
  To: Alexandre Courbot
  Cc: nouveau, linux-kernel, dri-devel, Ben Skeggs, linux-tegra

On Thu, Apr 3, 2014 at 12:03 AM, Alexandre Courbot <gnurou@gmail.com> wrote:
> On Wed, Mar 26, 2014 at 1:24 PM, Ben Skeggs <skeggsb@gmail.com> wrote:
>> On Mon, Mar 24, 2014 at 6:42 PM, Alexandre Courbot <acourbot@nvidia.com> wrote:
>>> Add a GR device for GK20A based on NVE4, with the correct classes
>>> definitions (GK20A's 3D class is 0xa297).
>>>
>>> Most of the NVE4 code can be used on GK20A, so make relevant bits of
>>> NVE4 available to other chips as well.
>> This will need a bit of a rebase on top of the tree I mentioned
>> earlier (also queued for drm-next now), where I've further split out
>> and named the various chunks of state.
>
> Will do that.
>
>>
>> Does GK104 match entirely correctly, or just happen to work?  I could
>> probably hunt down the GK20A netlist images and check that actually :)
>
> Do you mean, the init sequence? I haven't checked in detail (we are
> certainly doing things differently in the non-DRM driver), but the
> registers seem to match and the GPU is able to render after that. I
> admit I have not looked much further for now.
I meant the arrays of register data; there have generally been some
differences with most major chipset bumps.  Where can I find the
netlist firmware packages that go with the Android driver?  I can pull
all the required info out of there to check :)

>
> The only register that does not exist on GK20A is 0x260, but when
> accessing it Nouveau will be able to continue unharmed after a memory
> fault.
I have it in my queue to fix that as well; the register doesn't exist
on a couple of the other chips we support either.
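
A minimal sketch of one way to avoid touching a register that a chip lacks,
assuming a simple per-chipset exclusion list. This is not the fix queued
above, and every `sketch_*` name is hypothetical. Only GK20A (0xea) is listed
because the thread does not name the other affected chips; the register file
is a plain struct standing in for MMIO.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Chipsets stated in this thread to lack register 0x260. The "couple of
 * other chips" are not named, so only GK20A (0xea) appears here. */
static const uint32_t sketch_no_reg_260[] = { 0xea };

/* Returns false when the chipset is on the exclusion list. */
bool sketch_has_reg_260(uint32_t chipset)
{
	size_t n = sizeof(sketch_no_reg_260) / sizeof(sketch_no_reg_260[0]);

	for (size_t i = 0; i < n; i++)
		if (sketch_no_reg_260[i] == chipset)
			return false;
	return true;
}

/* Fake register file standing in for real MMIO (assumption). */
struct sketch_regs {
	uint32_t chipset;
	uint32_t reg_260;
	unsigned int writes;
};

/* Writes 0x260 only when the chip has it; returns whether it wrote. */
bool sketch_wr_260(struct sketch_regs *r, uint32_t val)
{
	assert(r != NULL);
	if (!sketch_has_reg_260(r->chipset))
		return false;	/* skip the access instead of faulting */
	r->reg_260 = val;
	r->writes++;
	return true;
}
```

With this shape, a caller on GK20A gets `false` back and no access is issued,
while any chipset not on the list is written as before.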

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [Nouveau] [PATCH 11/12] drm/nouveau: support GK20A in nouveau_accel_init()
  2014-04-02 14:23             ` [Nouveau] " Ilia Mirkin
@ 2014-04-02 23:14               ` Ben Skeggs
  0 siblings, 0 replies; 100+ messages in thread
From: Ben Skeggs @ 2014-04-02 23:14 UTC (permalink / raw)
  To: Ilia Mirkin
  Cc: Alexandre Courbot, nouveau, linux-kernel, dri-devel, Ben Skeggs,
	linux-tegra

On Thu, Apr 3, 2014 at 12:23 AM, Ilia Mirkin <imirkin@alum.mit.edu> wrote:
> On Wed, Apr 2, 2014 at 10:14 AM, Alexandre Courbot <gnurou@gmail.com> wrote:
>>>>> +     /* Need to figure out how to handle sw for gk20a */
>>>>> +     if (device->chipset == 0xea)
>>>>> +             goto skip_sw_init;
>>>>
>>>> The commit message makes it sound like SW isn't needed since gk20a
>>>> "lacks any display hardware". In that case the comment here doesn't make
>>>> much sense.
>>
>> Correct. As far as I have looked (that is, not very far), SW methods
>> are used for display-related functions, but there might be other
>> use-cases too. Maybe someone who knows better can confirm?
>
> http://cgit.freedesktop.org/~darktama/nouveau/tree/nvkm/engine/software/nvc0.c
>
> Take a look at nvc0_software_mthd_mp_control -- that's used in the
> mesa driver to... well, control those things :) They're related to
> PGRAPH, which should be available on the GK20A. [Not sure what they do
> though. One of them is about turning off error reporting.]
Hm, I don't know why I didn't notice that was there until now. I
wouldn't be entirely surprised if NVIDIA had reserved method IDs in
the 3D object class that were supposed to be used for this, rather
than going through a SW object class.

>
>   -ilia

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [PATCH 02/12] drm/nouveau/timer: skip calibration on GK20A
@ 2014-04-11  2:46               ` Alexandre Courbot
  0 siblings, 0 replies; 100+ messages in thread
From: Alexandre Courbot @ 2014-04-11  2:46 UTC (permalink / raw)
  To: Ben Skeggs
  Cc: Thierry Reding, Alexandre Courbot, nouveau, linux-kernel,
	dri-devel, Ben Skeggs, linux-tegra

On Wed, Mar 26, 2014 at 1:19 PM, Ben Skeggs <skeggsb@gmail.com> wrote:
> On Tue, Mar 25, 2014 at 7:54 AM, Thierry Reding
> <thierry.reding@gmail.com> wrote:
>> On Mon, Mar 24, 2014 at 05:42:24PM +0900, Alexandre Courbot wrote:
>>> GK20A's timer is directly attached to the system timer and cannot be
>>> calibrated. Skip the calibration phase on that chip since the
>>> corresponding registers do not exist.
>>>
>>> Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
>>> ---
>>>  drivers/gpu/drm/nouveau/core/subdev/timer/nv04.c | 19 +++++++++++++------
>>>  1 file changed, 13 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/nouveau/core/subdev/timer/nv04.c b/drivers/gpu/drm/nouveau/core/subdev/timer/nv04.c
>>> index c0bdd10358d7..822fe0d8a871 100644
>>> --- a/drivers/gpu/drm/nouveau/core/subdev/timer/nv04.c
>>> +++ b/drivers/gpu/drm/nouveau/core/subdev/timer/nv04.c
>>> @@ -185,6 +185,10 @@ nv04_timer_init(struct nouveau_object *object)
>>>       if (ret)
>>>               return ret;
>>>
>>> +     /* gk20a does not have the calibration registers */
>>> +     if (device->chipset == 0xea)
>>> +             goto skip_clk_init;
>>
>> I'm concerned that this won't scale in the future. Perhaps a better
>> solution would be to add a "flags" or "features" field to struct
>> nouveau_device along with feature bits such as HAS_CALIBRATION or
>> similar.
>>
>> That way we don't have to touch this code for every new future Tegra
>> chip. Unless perhaps if there's a reason to expect things to change in
>> newer generations.
> I've already handled this in a slightly different way in the tree I'd
> previously pointed Alex at (I think!), as I needed to do the same for
> GM107.
>
> Should just be able to use that implementation (so, just change the
> probe patch) here too.

I will skip this patch and use your implementation then. Btw,
shouldn't the source file for the GK20A implementation be named nvea.c
instead of gk20a.c?
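
For reference, the feature-bit idea quoted above could look roughly like the
sketch below. It is an illustration only: `struct sketch_device` is a
hypothetical stand-in for struct nouveau_device, the `sketch_*` helpers do not
exist in Nouveau, and only the HAS_CALIBRATION name and the 0xea chipset ID
come from the thread. It is not the implementation in Ben's tree.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical feature bits; only HAS_CALIBRATION is named in the thread. */
enum sketch_feature {
	SKETCH_HAS_CALIBRATION = 1u << 0,
};

/* Stand-in for struct nouveau_device, reduced to what the sketch needs. */
struct sketch_device {
	uint32_t chipset;	/* e.g. 0xea for GK20A */
	uint32_t features;	/* bitmask of enum sketch_feature */
};

/* Decide features once at probe time so that later code tests a bit
 * instead of comparing chipset IDs. */
void sketch_device_probe(struct sketch_device *dev, uint32_t chipset)
{
	assert(dev != NULL);
	dev->chipset = chipset;
	dev->features = SKETCH_HAS_CALIBRATION;
	if (chipset == 0xea)	/* GK20A: no calibration registers */
		dev->features &= ~(uint32_t)SKETCH_HAS_CALIBRATION;
}

/* What a timer init path would check in place of "chipset == 0xea". */
bool sketch_timer_needs_calibration(const struct sketch_device *dev)
{
	assert(dev != NULL);
	return (dev->features & SKETCH_HAS_CALIBRATION) != 0;
}
```

A future chip without calibration registers would then only need a change in
the probe function, not in the timer code.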

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [PATCH 02/12] drm/nouveau/timer: skip calibration on GK20A
@ 2014-04-11  7:31                   ` Ben Skeggs
  0 siblings, 0 replies; 100+ messages in thread
From: Ben Skeggs @ 2014-04-11  7:31 UTC (permalink / raw)
  To: Alexandre Courbot
  Cc: Thierry Reding, Alexandre Courbot, nouveau, linux-kernel,
	dri-devel, Ben Skeggs, linux-tegra

On Fri, Apr 11, 2014 at 12:46 PM, Alexandre Courbot <gnurou@gmail.com> wrote:
> On Wed, Mar 26, 2014 at 1:19 PM, Ben Skeggs <skeggsb@gmail.com> wrote:
>> On Tue, Mar 25, 2014 at 7:54 AM, Thierry Reding
>> <thierry.reding@gmail.com> wrote:
>>> On Mon, Mar 24, 2014 at 05:42:24PM +0900, Alexandre Courbot wrote:
>>>> GK20A's timer is directly attached to the system timer and cannot be
>>>> calibrated. Skip the calibration phase on that chip since the
>>>> corresponding registers do not exist.
>>>>
>>>> Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
>>>> ---
>>>>  drivers/gpu/drm/nouveau/core/subdev/timer/nv04.c | 19 +++++++++++++------
>>>>  1 file changed, 13 insertions(+), 6 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/nouveau/core/subdev/timer/nv04.c b/drivers/gpu/drm/nouveau/core/subdev/timer/nv04.c
>>>> index c0bdd10358d7..822fe0d8a871 100644
>>>> --- a/drivers/gpu/drm/nouveau/core/subdev/timer/nv04.c
>>>> +++ b/drivers/gpu/drm/nouveau/core/subdev/timer/nv04.c
>>>> @@ -185,6 +185,10 @@ nv04_timer_init(struct nouveau_object *object)
>>>>       if (ret)
>>>>               return ret;
>>>>
>>>> +     /* gk20a does not have the calibration registers */
>>>> +     if (device->chipset == 0xea)
>>>> +             goto skip_clk_init;
>>>
>>> I'm concerned that this won't scale in the future. Perhaps a better
>>> solution would be to add a "flags" or "features" field to struct
>>> nouveau_device along with feature bits such as HAS_CALIBRATION or
>>> similar.
>>>
>>> That way we don't have to touch this code for every new future Tegra
>>> chip. Unless perhaps if there's a reason to expect things to change in
>>> newer generations.
>> I've already handled this in a slightly different way in the tree I'd
>> previously pointed Alex at (I think!), as I needed to do the same for
>> GM107.
>>
>> Should just be able to use that implementation (so, just change the
>> probe patch) here too.
>
> I will skip this patch and use your implementation then. Btw,
> shouldn't the source file for the GK20A implementation be named nvea.c
> instead of gk20a.c?
For the Maxwell stuff I've been using "gm107" now too.  Since we're
working with you guys these days it seems better to use the same names
for things ;)
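
To make the two naming schemes concrete, here is a small sketch that maps a
chipset ID to both the "nvXX" file stem and the codename stem. The table holds
only the two chips whose IDs and names can be read off this thread (NVE4/GK104
and NVEA/GK20A); the struct and all `sketch_*` functions are hypothetical and
not part of Nouveau.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

struct sketch_chip_name {
	uint32_t chipset;		/* numeric ID compared in the driver */
	const char *nv_stem;		/* older "nvXX" file name stem */
	const char *codename_stem;	/* codename-style file name stem */
};

static const struct sketch_chip_name sketch_names[] = {
	{ 0xe4, "nve4", "gk104" },
	{ 0xea, "nvea", "gk20a" },
};

#define SKETCH_COUNT (sizeof(sketch_names) / sizeof(sketch_names[0]))

/* Returns the table entry for a chipset ID, or NULL if unknown. */
const struct sketch_chip_name *sketch_lookup(uint32_t chipset)
{
	for (size_t i = 0; i < SKETCH_COUNT; i++)
		if (sketch_names[i].chipset == chipset)
			return &sketch_names[i];
	return NULL;
}

/* Accepts either stem and returns the chipset ID, or 0 if unknown. */
uint32_t sketch_chipset_from_stem(const char *stem)
{
	assert(stem != NULL);
	for (size_t i = 0; i < SKETCH_COUNT; i++)
		if (strcmp(stem, sketch_names[i].nv_stem) == 0 ||
		    strcmp(stem, sketch_names[i].codename_stem) == 0)
			return sketch_names[i].chipset;
	return 0;
}

/* Writes "<stem>.c" into buf; returns 0 on success, -1 if the chipset is
 * unknown or buf is too small. */
int sketch_source_file(uint32_t chipset, int use_codename,
		       char *buf, size_t len)
{
	const struct sketch_chip_name *n = sketch_lookup(chipset);
	int written;

	assert(buf != NULL);
	if (n == NULL)
		return -1;
	written = snprintf(buf, len, "%s.c",
			   use_codename ? n->codename_stem : n->nv_stem);
	return (written < 0 || (size_t)written >= len) ? -1 : 0;
}
```

Under the codename scheme chipset 0xea resolves to "gk20a.c"; under the older
scheme the same ID resolves to "nvea.c".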

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [PATCH 02/12] drm/nouveau/timer: skip calibration on GK20A
@ 2014-04-11  7:34                       ` Alexandre Courbot
  0 siblings, 0 replies; 100+ messages in thread
From: Alexandre Courbot @ 2014-04-11  7:34 UTC (permalink / raw)
  To: Ben Skeggs, Alexandre Courbot
  Cc: Thierry Reding, nouveau, linux-kernel, dri-devel, Ben Skeggs,
	linux-tegra

On 04/11/2014 04:31 PM, Ben Skeggs wrote:
> On Fri, Apr 11, 2014 at 12:46 PM, Alexandre Courbot <gnurou@gmail.com> wrote:
>> On Wed, Mar 26, 2014 at 1:19 PM, Ben Skeggs <skeggsb@gmail.com> wrote:
>>> On Tue, Mar 25, 2014 at 7:54 AM, Thierry Reding
>>> <thierry.reding@gmail.com> wrote:
>>>> On Mon, Mar 24, 2014 at 05:42:24PM +0900, Alexandre Courbot wrote:
>>>>> GK20A's timer is directly attached to the system timer and cannot be
>>>>> calibrated. Skip the calibration phase on that chip since the
>>>>> corresponding registers do not exist.
>>>>>
>>>>> Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
>>>>> ---
>>>>>   drivers/gpu/drm/nouveau/core/subdev/timer/nv04.c | 19 +++++++++++++------
>>>>>   1 file changed, 13 insertions(+), 6 deletions(-)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/nouveau/core/subdev/timer/nv04.c b/drivers/gpu/drm/nouveau/core/subdev/timer/nv04.c
>>>>> index c0bdd10358d7..822fe0d8a871 100644
>>>>> --- a/drivers/gpu/drm/nouveau/core/subdev/timer/nv04.c
>>>>> +++ b/drivers/gpu/drm/nouveau/core/subdev/timer/nv04.c
>>>>> @@ -185,6 +185,10 @@ nv04_timer_init(struct nouveau_object *object)
>>>>>        if (ret)
>>>>>                return ret;
>>>>>
>>>>> +     /* gk20a does not have the calibration registers */
>>>>> +     if (device->chipset == 0xea)
>>>>> +             goto skip_clk_init;
>>>>
>>>> I'm concerned that this won't scale in the future. Perhaps a better
>>>> solution would be to add a "flags" or "features" field to struct
>>>> nouveau_device along with feature bits such as HAS_CALIBRATION or
>>>> similar.
>>>>
>>>> That way we don't have to touch this code for every new future Tegra
>>>> chip. Unless perhaps if there's a reason to expect things to change in
>>>> newer generations.
>>> I've already handled this in a slightly different way in the tree I'd
>>> previously pointed Alex at (I think!), as I needed to do the same for
>>> GM107.
>>>
>>> Should just be able to use that implementation (so, just change the
>>> probe patch) here too.
>>
>> I will skip this patch and use your implementation then. Btw,
>> shouldn't the source file for the GK20A implementation be named nvea.c
>> instead of gk20a.c?
> For the Maxwell stuff I've been using "gm107" now too.  Since we're
> working with you guys these days it seems better to use the same names
> for things ;)

So would you like us to use the same naming scheme as well? So far all 
my patches use "nvea.c" whenever I need to add code.

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [PATCH 02/12] drm/nouveau/timer: skip calibration on GK20A
  2014-04-11  7:34                       ` Alexandre Courbot
@ 2014-04-14  8:35                         ` Ben Skeggs
  -1 siblings, 0 replies; 100+ messages in thread
From: Ben Skeggs @ 2014-04-14  8:35 UTC (permalink / raw)
  To: Alexandre Courbot
  Cc: Alexandre Courbot, nouveau, linux-kernel, dri-devel, Ben Skeggs,
	linux-tegra

On Fri, Apr 11, 2014 at 5:34 PM, Alexandre Courbot <acourbot@nvidia.com> wrote:
> On 04/11/2014 04:31 PM, Ben Skeggs wrote:
>>
>> On Fri, Apr 11, 2014 at 12:46 PM, Alexandre Courbot <gnurou@gmail.com>
>> wrote:
>>>
>>> On Wed, Mar 26, 2014 at 1:19 PM, Ben Skeggs <skeggsb@gmail.com> wrote:
>>>>
>>>> On Tue, Mar 25, 2014 at 7:54 AM, Thierry Reding
>>>> <thierry.reding@gmail.com> wrote:
>>>>>
>>>>> On Mon, Mar 24, 2014 at 05:42:24PM +0900, Alexandre Courbot wrote:
>>>>>>
>>>>>> GK20A's timer is directly attached to the system timer and cannot be
>>>>>> calibrated. Skip the calibration phase on that chip since the
>>>>>> corresponding registers do not exist.
>>>>>>
>>>>>> Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
>>>>>> ---
>>>>>>   drivers/gpu/drm/nouveau/core/subdev/timer/nv04.c | 19
>>>>>> +++++++++++++------
>>>>>>   1 file changed, 13 insertions(+), 6 deletions(-)
>>>>>>
>>>>>> diff --git a/drivers/gpu/drm/nouveau/core/subdev/timer/nv04.c
>>>>>> b/drivers/gpu/drm/nouveau/core/subdev/timer/nv04.c
>>>>>> index c0bdd10358d7..822fe0d8a871 100644
>>>>>> --- a/drivers/gpu/drm/nouveau/core/subdev/timer/nv04.c
>>>>>> +++ b/drivers/gpu/drm/nouveau/core/subdev/timer/nv04.c
>>>>>> @@ -185,6 +185,10 @@ nv04_timer_init(struct nouveau_object *object)
>>>>>>        if (ret)
>>>>>>                return ret;
>>>>>>
>>>>>> +     /* gk20a does not have the calibration registers */
>>>>>> +     if (device->chipset == 0xea)
>>>>>> +             goto skip_clk_init;
>>>>>
>>>>>
>>>>> I'm concerned that this won't scale in the future. Perhaps a better
>>>>> solution would be to add a "flags" or "features" field to struct
>>>>> nouveau_device along with feature bits such as HAS_CALIBRATION or
>>>>> similar.
>>>>>
>>>>> That way we don't have to touch this code for every new future Tegra
>>>>> chip. Unless perhaps if there's a reason to expect things to change in
>>>>> newer generations.
>>>>
>>>> I've already handled this in a slightly different way in the tree I'd
>>>> previously pointed Alex at (I think!), as I needed to do the same for
>>>> GM107.
>>>>
>>>> Should just be able to use that implementation (so, just change the
>>>> probe patch) here too.
>>>
>>>
>>> I will skip this patch and use your implementation then. Btw,
>>> shouldn't the source file for the GK20A implementation be named nvea.c
>>> instead of gk20a.c?
>>
>> For the Maxwell stuff I've been using "gm107" now too.  Since we're
>> working with you guys these days it seems better to use the same names
>> for things ;)
>
>
> So would you like us to use the same naming scheme as well? So far all my
> patches use "nvea.c" whenever I need to add code.
If it's not too much of a problem at this point, then that'd be good.
Right before I send -next for the next merge window I'll likely do a
mass rename anyway, so if we can get your patches merged before then
(which would be really good!), it doesn't matter much.

Ben.

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [PATCH 02/12] drm/nouveau/timer: skip calibration on GK20A
  2014-04-14  8:35                         ` Ben Skeggs
@ 2014-04-15  6:10                             ` Alexandre Courbot
  -1 siblings, 0 replies; 100+ messages in thread
From: Alexandre Courbot @ 2014-04-15  6:10 UTC (permalink / raw)
  To: Ben Skeggs
  Cc: nouveau, linux-kernel, dri-devel, Ben Skeggs, linux-tegra

On Mon, Apr 14, 2014 at 5:35 PM, Ben Skeggs <skeggsb@gmail.com> wrote:
> On Fri, Apr 11, 2014 at 5:34 PM, Alexandre Courbot <acourbot@nvidia.com> wrote:
>> On 04/11/2014 04:31 PM, Ben Skeggs wrote:
>>>
>>> On Fri, Apr 11, 2014 at 12:46 PM, Alexandre Courbot <gnurou@gmail.com>
>>> wrote:
>>>>
>>>> On Wed, Mar 26, 2014 at 1:19 PM, Ben Skeggs <skeggsb@gmail.com> wrote:
>>>>>
>>>>> On Tue, Mar 25, 2014 at 7:54 AM, Thierry Reding
>>>>> <thierry.reding@gmail.com> wrote:
>>>>>>
>>>>>> On Mon, Mar 24, 2014 at 05:42:24PM +0900, Alexandre Courbot wrote:
>>>>>>>
>>>>>>> GK20A's timer is directly attached to the system timer and cannot be
>>>>>>> calibrated. Skip the calibration phase on that chip since the
>>>>>>> corresponding registers do not exist.
>>>>>>>
>>>>>>> Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
>>>>>>> ---
>>>>>>>   drivers/gpu/drm/nouveau/core/subdev/timer/nv04.c | 19
>>>>>>> +++++++++++++------
>>>>>>>   1 file changed, 13 insertions(+), 6 deletions(-)
>>>>>>>
>>>>>>> diff --git a/drivers/gpu/drm/nouveau/core/subdev/timer/nv04.c
>>>>>>> b/drivers/gpu/drm/nouveau/core/subdev/timer/nv04.c
>>>>>>> index c0bdd10358d7..822fe0d8a871 100644
>>>>>>> --- a/drivers/gpu/drm/nouveau/core/subdev/timer/nv04.c
>>>>>>> +++ b/drivers/gpu/drm/nouveau/core/subdev/timer/nv04.c
>>>>>>> @@ -185,6 +185,10 @@ nv04_timer_init(struct nouveau_object *object)
>>>>>>>        if (ret)
>>>>>>>                return ret;
>>>>>>>
>>>>>>> +     /* gk20a does not have the calibration registers */
>>>>>>> +     if (device->chipset == 0xea)
>>>>>>> +             goto skip_clk_init;
>>>>>>
>>>>>>
>>>>>> I'm concerned that this won't scale in the future. Perhaps a better
>>>>>> solution would be to add a "flags" or "features" field to struct
>>>>>> nouveau_device along with feature bits such as HAS_CALIBRATION or
>>>>>> similar.
>>>>>>
>>>>>> That way we don't have to touch this code for every new future Tegra
>>>>>> chip. Unless perhaps if there's a reason to expect things to change in
>>>>>> newer generations.
>>>>>
>>>>> I've already handled this in a slightly different way in the tree I'd
>>>>> previously pointed Alex at (I think!), as I needed to do the same for
>>>>> GM107.
>>>>>
>>>>> Should just be able to use that implementation (so, just change the
>>>>> probe patch) here too.
>>>>
>>>>
>>>> I will skip this patch and use your implementation then. Btw,
>>>> shouldn't the source file for the GK20A implementation be named nvea.c
>>>> instead of gk20a.c?
>>>
>>> For the Maxwell stuff I've been using "gm107" now too.  Since we're
>>> working with you guys these days it seems better to use the same names
>>> for things ;)
>>
>>
>> So would you like us to use the same naming scheme as well? So far all my
>> patches use "nvea.c" whenever I need to add code.
> If it's not too much of a problem at this point, then that'd be good.
> Right before I send -next for the next merge window I'll likely do a
> mass rename anyway, so if we can get your patches merged before then
> (which would be really good!), it doesn't matter much.

No problem, I will update the naming to follow what you did with the
timer driver and gm107.

Hopefully I will soon manage to carve out some time to rebase these
patches and send v2.

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [PATCH 11/12] drm/nouveau: support GK20A in nouveau_accel_init()
  2014-03-26  4:27           ` Ben Skeggs
@ 2014-04-16  5:57               ` Alexandre Courbot
  -1 siblings, 0 replies; 100+ messages in thread
From: Alexandre Courbot @ 2014-04-16  5:57 UTC (permalink / raw)
  To: Ben Skeggs
  Cc: nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Ben Skeggs,
	linux-tegra-u79uwXL29TY76Z2rM5mHXA

On Wed, Mar 26, 2014 at 1:27 PM, Ben Skeggs <skeggsb-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> On Tue, Mar 25, 2014 at 9:10 AM, Thierry Reding
> <thierry.reding-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
>> On Mon, Mar 24, 2014 at 05:42:33PM +0900, Alexandre Courbot wrote:
>>> GK20A does not embed a dedicated COPY engine and thus cannot allocate
>>> the copy channel that nouveau_accel_init() attempts to create. It also
>>> lacks any display hardware, so the creation of a software channel does
>>> not apply either.
>>
>> Perhaps this should be two separate patches?
>>
>>> diff --git a/drivers/gpu/drm/nouveau/nouveau_drm.c b/drivers/gpu/drm/nouveau/nouveau_drm.c
>> [...]
>>> +     if (device->chipset == 0xea) {
>>> +             /* gk20a does not have CE0/CE1 */
>>
>> This would be another good candidate for a feature flag.
> There are ways to query this in a chipset-independent way.  However,
> despite reporting it as an error if no copy engines are available, the
> code should continue on without the channel happily.  Perhaps we can
> just punt the relevant error messages to a debug loglevel for now?

Do you know how to query this in a chipset-independent way? I have
failed to find any information for this.

The code does continue without any issue after reporting the error, so
indeed that check is not strictly necessary. But I was just mimicking
what follows right after:

    if (device->chipset >= 0xa3 &&
        device->chipset != 0xaa &&
        device->chipset != 0xac) {
        ret = nouveau_channel_new(drm, &drm->client, NVDRM_DEVICE,
                      NVDRM_CHAN + 1, NvDmaFB, NvDmaTT,
                      &drm->cechan);
        if (ret)
            NV_ERROR(drm, "failed to create ce channel, %d\n", ret);

        arg0 = NvDmaFB;
        arg1 = NvDmaTT;
    } else {
        arg0 = NvDmaFB;
        arg1 = NvDmaTT;
    }

So if we are trying to avoid showing this error for 0xa0 class
devices, why not for NV_E0?
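
To make the question concrete, the chipset conditions discussed in this
message can be folded into a single predicate, sketched below. The function
name is hypothetical; it only mirrors the checks quoted above plus the 0xea
test from the patch, and it ignores any other branching that
nouveau_accel_init() performs.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: should a copy-engine channel be attempted?
 * Mirrors only the conditions quoted in this thread.
 */
static bool example_wants_ce_channel(uint32_t chipset)
{
	/* GK20A (0xea) has no CE0/CE1 according to the patch. */
	if (chipset == 0xea)
		return false;

	/* Existing check: skip chipsets below 0xa3, and 0xaa / 0xac. */
	return chipset >= 0xa3 && chipset != 0xaa && chipset != 0xac;
}
```

A caller would only create the channel, and only report a failure, when the
predicate returns true, so chips known to lack a copy engine never log the
error.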

