[PATCH 00/11] drm/nouveau: Enable GP10B by default

From: Thierry Reding @ 2019-09-16 15:04 UTC (permalink / raw)
  To: Ben Skeggs, Thierry Reding
  Cc: linux-tegra-u79uwXL29TY76Z2rM5mHXA,
	nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

From: Thierry Reding <treding@nvidia.com>

Hi,

the GPU on Jetson TX2 (GP10B) does not work properly on all devices. Why
exactly is not clear, but there are slight differences between the SKUs
that were tested. It turns out that the biggest issue is that on some
devices (e.g. the one that I have), pulsing the GPU reset twice, as is
done in the current code (once as part of the power-ungate operation and
then again in the driver), causes the GPU to go into a bad state. Doing
the reset in the driver only if it hasn't already been done by the power
domain code fixes this issue.
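
As a rough illustration of the conditional reset, here is a small
userspace model. It is not the driver code: struct gpu_model and both
functions are hypothetical stand-ins, and the only thing it is meant to
show is that the reset line gets pulsed exactly once on either path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the GPU platform device (not a Nouveau type). */
struct gpu_model {
	bool has_power_domain;     /* power domain code handles power-ungating */
	bool powered;              /* GPU partition is powered */
	unsigned int reset_pulses; /* how often the reset line was pulsed */
};

/* Models the power domain code: ungating pulses the reset once. */
static void power_domain_ungate(struct gpu_model *gpu)
{
	assert(gpu != NULL);
	gpu->powered = true;
	gpu->reset_pulses++;
}

/* Models driver power-up: only pulse the reset if nobody else did. */
static void driver_power_up(struct gpu_model *gpu)
{
	assert(gpu != NULL);

	if (gpu->has_power_domain) {
		/* The power domain already pulsed the reset; don't repeat it. */
		power_domain_ungate(gpu);
		return;
	}

	/* No power domain: the driver powers up and resets the GPU itself. */
	gpu->powered = true;
	gpu->reset_pulses++;
}
```

With has_power_domain set or cleared, reset_pulses ends up at 1 after
driver_power_up(), whereas unconditionally pulsing the reset in the
driver would leave it at 2 on the power domain path.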

Another issue is that the clock may be running at a rate of 0 Hz. This
is unlikely to happen because internally the clock can't actually run
that slowly, but explicitly setting the clock rate at probe time does
seem to help in some cases.
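
The "set the rate only if it is not set" idea can be sketched in plain
C as follows. This is a userspace model under stated assumptions: struct
clk_model and the clk_model_*() helpers are hypothetical and merely
mimic the shape of the kernel's clk_get_rate()/clk_set_rate(), and the
default rate is a parameter because the cover letter does not name one.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical clock model; the real driver would use struct clk. */
struct clk_model {
	unsigned long rate_hz;
};

static unsigned long clk_model_get_rate(const struct clk_model *clk)
{
	assert(clk != NULL);
	return clk->rate_hz;
}

static int clk_model_set_rate(struct clk_model *clk, unsigned long rate_hz)
{
	assert(clk != NULL);
	clk->rate_hz = rate_hz;
	return 0;
}

/*
 * Probe-time helper: program default_hz only if the clock reports 0 Hz,
 * and leave an already configured rate untouched.
 */
static int gpu_clk_init(struct clk_model *clk, unsigned long default_hz)
{
	if (clk_model_get_rate(clk) == 0)
		return clk_model_set_rate(clk, default_hz);

	return 0;
}
```

Leaving a non-zero rate alone means a rate set up by firmware or an
earlier boot stage is not overridden.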

Patch 3 in this series unifies reading the WPR configuration by
getting it from GPU registers rather than reaching into the memory
controller's register space. This is slightly cleaner because it better
separates the two drivers and doesn't require an update every time the
memory controller moves to another register aperture.

Patch 4 ensures the L2 cache makes memory requests with the proper
stream ID, which is required when the GPU is behind an IOMMU.

Patch 5 changes the GP10B device initialization to use the correct copy
engine. GP10B is a Pascal generation GPU and the way that engines are
described changes how the copy engines are enumerated compared to
earlier generations.

Patches 6 through 9 allow Nouveau to work on Tegra GPUs if the DMA API
is backed by an IOMMU. This is different from current assumptions
because mappings for all buffers mapped through the DMA API will need
to have the special IOMMU bit set in their page tables. Note that this
technically makes it possible to support big pages on Tegra because from
the GPU's point of view all memory is now contiguous. However, these
patches only make sure that buffers are mapped properly and don't try to
enable big pages. Also note that mapping through the IOMMU comes at a
slight cost, so this may not always be desirable. However, with Tegra186
and later it's currently not possible (from a DMA API point of view) to
map only a subset of buffers through the IOMMU, so any such optimization
is deferred. Furthermore, the ARM SMMU driver currently enforces the use
of the SMMU by default, so there is not much of a choice at the moment.
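
To make the "special IOMMU bit" concrete, here is a minimal sketch of
tagging an address so that the GPU routes the access through the IOMMU.
The helper name is hypothetical, and the bit position is passed in as a
parameter because it is chip-specific and not stated in this cover
letter; the value 34 in the usage note is only an example.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Return the address as it would be written into a GPU page table
 * entry when the buffer is mapped through the IOMMU: the I/O virtual
 * address with the chip-specific IOMMU bit set.
 */
static uint64_t gpu_addr_via_iommu(uint64_t iova, unsigned int iommu_bit)
{
	assert(iommu_bit < 64);
	return iova | (UINT64_C(1) << iommu_bit);
}
```

For example, gpu_addr_via_iommu(0x1000, 34) yields 0x400001000. Buffers
that bypass the IOMMU would keep the bit clear.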

Finally, patches 10 and 11 enable the GPU on Jetson TX2 and make it use
the SMMU. I can pick up patches 10 and 11 into the Tegra tree once the
other patches have been merged into Nouveau.

Thierry

Alexandre Courbot (1):
  arm64: tegra: Enable GPU on Jetson TX2

Thierry Reding (10):
  drm/nouveau: tegra: Avoid pulsing reset twice
  drm/nouveau: tegra: Set clock rate if not set
  drm/nouveau: secboot: Read WPR configuration from GPU registers
  drm/nouveau: gp10b: Add custom L2 cache implementation
  drm/nouveau: gp10b: Use correct copy engine
  drm/nouveau: gk20a: Set IOMMU bit for DMA API if appropriate
  drm/nouveau: gk20a: Implement custom MMU class
  drm/nouveau: tegra: Skip IOMMU initialization if already attached
  drm/nouveau: tegra: Fall back to 32-bit DMA mask without IOMMU
  arm64: tegra: Enable SMMU for GPU on Tegra186

 .../boot/dts/nvidia/tegra186-p2771-0000.dts   |   4 +
 arch/arm64/boot/dts/nvidia/tegra186.dtsi      |   1 +
 .../gpu/drm/nouveau/include/nvkm/subdev/ltc.h |   1 +
 .../gpu/drm/nouveau/nvkm/engine/device/base.c |   4 +-
 .../drm/nouveau/nvkm/engine/device/tegra.c    | 152 +++++++++++-------
 .../drm/nouveau/nvkm/subdev/instmem/gk20a.c   |  35 ++--
 .../gpu/drm/nouveau/nvkm/subdev/ltc/Kbuild    |   1 +
 .../gpu/drm/nouveau/nvkm/subdev/ltc/gp10b.c   |  69 ++++++++
 .../gpu/drm/nouveau/nvkm/subdev/ltc/priv.h    |   2 +
 .../gpu/drm/nouveau/nvkm/subdev/mmu/gk20a.c   |  50 +++++-
 .../gpu/drm/nouveau/nvkm/subdev/mmu/gk20a.h   |  44 +++++
 .../gpu/drm/nouveau/nvkm/subdev/mmu/gm20b.c   |   6 +-
 .../gpu/drm/nouveau/nvkm/subdev/mmu/gp10b.c   |   4 +-
 drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.h |   1 +
 .../drm/nouveau/nvkm/subdev/mmu/vmmgk20a.c    |  22 ++-
 .../drm/nouveau/nvkm/subdev/mmu/vmmgm20b.c    |   4 +-
 .../drm/nouveau/nvkm/subdev/mmu/vmmgp10b.c    |  20 ++-
 .../drm/nouveau/nvkm/subdev/secboot/gm200.h   |   2 +-
 .../drm/nouveau/nvkm/subdev/secboot/gm20b.c   |  81 ++++++----
 .../drm/nouveau/nvkm/subdev/secboot/gp10b.c   |   4 +-
 20 files changed, 394 insertions(+), 113 deletions(-)
 create mode 100644 drivers/gpu/drm/nouveau/nvkm/subdev/ltc/gp10b.c
 create mode 100644 drivers/gpu/drm/nouveau/nvkm/subdev/mmu/gk20a.h

-- 
2.23.0

