* x86 PAT memtype regression fixes (with extra cc's)
@ 2016-10-24  6:31 Dave Airlie
  2016-10-24  6:31 ` [PATCH 1/2] x86/io: add interface to reserve io memtype for a resource range. (v1.1) Dave Airlie
  2016-10-24  6:31 ` [PATCH 2/2] drm/drivers: add support for using the arch wc mapping API Dave Airlie
  0 siblings, 2 replies; 15+ messages in thread
From: Dave Airlie @ 2016-10-24  6:31 UTC (permalink / raw)
  To: mcgrof, torvalds, dan.j.williams, x86; +Cc: linux-kernel, dri-devel

As per Ingo's request I've cc'ed a bunch more x86/PAT people.

Dave.

* [PATCH 1/2] x86/io: add interface to reserve io memtype for a resource range. (v1.1)
  2016-10-24  6:31 x86 PAT memtype regression fixes (with extra cc's) Dave Airlie
@ 2016-10-24  6:31 ` Dave Airlie
  2016-10-25  9:40   ` Ingo Molnar
                     ` (3 more replies)
  2016-10-24  6:31 ` [PATCH 2/2] drm/drivers: add support for using the arch wc mapping API Dave Airlie
  1 sibling, 4 replies; 15+ messages in thread
From: Dave Airlie @ 2016-10-24  6:31 UTC (permalink / raw)
  To: mcgrof, torvalds, dan.j.williams, x86
  Cc: linux-kernel, dri-devel, Dave Airlie, Toshi Kani,
	Borislav Petkov, H. Peter Anvin, Andy Lutomirski, Denys Vlasenko,
	Brian Gerst

A recent change to the mm code in:
87744ab3832b83ba71b931f86f9cfdb000d07da5
mm: fix cache mode tracking in vm_insert_mixed()

started enforcing a check of the memory type against the registered list
for mixed pfn insertion mappings. It happens that the drm drivers for a
number of GPUs relied on this being broken. Currently the drivers only
insert VRAM mappings into the tracking table when they come from the
kernel, and userspace mappings never land in the table. This led to a
regression where all the mappings now end up as UC instead of WC.

I've considered a number of solutions but since this needs to be fixed
in fixes and not next, and some of the solutions were going to introduce
overhead that hadn't been there before I didn't consider them viable at
this stage. These mainly concerned hooking into the TTM io reserve APIs,
but these APIs have a bunch of fast paths I didn't want to unwind to add
this to.

The solution I've decided on is to add a new API like the arch_phys_wc
APIs (these would have worked but wc_del didn't take a range), and
use them from the drivers to add a WC compatible mapping to the table
for all VRAM on those GPUs. This means we can then create userspace
mappings that won't get degraded to UC.

v1.1: use CONFIG_X86_PAT
Cc: Toshi Kani <toshi.kani@hp.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: x86@kernel.org
Cc: mcgrof@suse.com
Cc: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
---
 arch/x86/include/asm/io.h |  6 ++++++
 arch/x86/mm/pat.c         | 13 +++++++++++++
 include/linux/io.h        | 13 +++++++++++++
 3 files changed, 32 insertions(+)

diff --git a/arch/x86/include/asm/io.h b/arch/x86/include/asm/io.h
index de25aad..d34bd37 100644
--- a/arch/x86/include/asm/io.h
+++ b/arch/x86/include/asm/io.h
@@ -351,4 +351,10 @@ extern void arch_phys_wc_del(int handle);
 #define arch_phys_wc_add arch_phys_wc_add
 #endif
 
+#ifdef CONFIG_X86_PAT
+extern int arch_io_reserve_memtype_wc(resource_size_t start, resource_size_t size);
+extern void arch_io_free_memtype_wc(resource_size_t start, resource_size_t size);
+#define arch_io_reserve_memtype_wc arch_io_reserve_memtype_wc
+#endif
+
 #endif /* _ASM_X86_IO_H */
diff --git a/arch/x86/mm/pat.c b/arch/x86/mm/pat.c
index 170cc4f..49d1b75 100644
--- a/arch/x86/mm/pat.c
+++ b/arch/x86/mm/pat.c
@@ -730,6 +730,19 @@ void io_free_memtype(resource_size_t start, resource_size_t end)
 	free_memtype(start, end);
 }
 
+int arch_io_reserve_memtype_wc(resource_size_t start, resource_size_t size)
+{
+	enum page_cache_mode type = _PAGE_CACHE_MODE_WC;
+	return io_reserve_memtype(start, start + size, &type);
+}
+EXPORT_SYMBOL(arch_io_reserve_memtype_wc);
+
+void arch_io_free_memtype_wc(resource_size_t start, resource_size_t size)
+{
+	io_free_memtype(start, start + size);
+}
+EXPORT_SYMBOL(arch_io_free_memtype_wc);
+
 pgprot_t phys_mem_access_prot(struct file *file, unsigned long pfn,
 				unsigned long size, pgprot_t vma_prot)
 {
diff --git a/include/linux/io.h b/include/linux/io.h
index e2c8419..963ab71 100644
--- a/include/linux/io.h
+++ b/include/linux/io.h
@@ -141,4 +141,17 @@ enum {
 void *memremap(resource_size_t offset, size_t size, unsigned long flags);
 void memunmap(void *addr);
 
+#ifndef arch_io_reserve_memtype_wc
+static inline int arch_io_reserve_memtype_wc(resource_size_t base,
+					     resource_size_t size)
+{
+	return 0;
+}
+
+static inline void arch_io_free_memtype_wc(resource_size_t base,
+					   resource_size_t size)
+{
+}
+#endif
+
 #endif /* _LINUX_IO_H */
-- 
2.5.5
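
The header changes in this patch rely on a common kernel idiom: the arch
header declares the real function and defines a macro with the same name,
and the generic header supplies a no-op inline only when that macro is
absent. The following is a minimal userspace sketch of that idiom, not
kernel code. HAVE_ARCH_WC (standing in for CONFIG_X86_PAT),
demo_reserve_wc and demo_driver_init are hypothetical names chosen for
illustration.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t resource_size_t;	/* stand-in for the kernel typedef */

/* "arch header": only present when the (hypothetical) switch is set. */
#ifdef HAVE_ARCH_WC
int demo_reserve_wc(resource_size_t start, resource_size_t size);
#define demo_reserve_wc demo_reserve_wc
#endif

/* "generic header": supply a no-op unless the arch provided its own. */
#ifndef demo_reserve_wc
static inline int demo_reserve_wc(resource_size_t start, resource_size_t size)
{
	(void)start;
	(void)size;
	return 0;	/* nothing to track, report success */
}
#endif

/* Caller code is identical either way; no #ifdef at the call site. */
static int demo_driver_init(resource_size_t base, resource_size_t len)
{
	return demo_reserve_wc(base, len);
}
```

Built without HAVE_ARCH_WC, demo_driver_init() resolves to the no-op and
returns 0. This mirrors how the drivers in patch 2/2 can call the new
functions unconditionally. In the patch itself a single macro,
arch_io_reserve_memtype_wc, guards both the reserve and the free fallback.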

* [PATCH 2/2] drm/drivers: add support for using the arch wc mapping API.
  2016-10-24  6:31 x86 PAT memtype regression fixes (with extra cc's) Dave Airlie
  2016-10-24  6:31 ` [PATCH 1/2] x86/io: add interface to reserve io memtype for a resource range. (v1.1) Dave Airlie
@ 2016-10-24  6:31 ` Dave Airlie
  2016-10-24  9:24   ` Christian König
  1 sibling, 1 reply; 15+ messages in thread
From: Dave Airlie @ 2016-10-24  6:31 UTC (permalink / raw)
  To: mcgrof, torvalds, dan.j.williams, x86
  Cc: linux-kernel, dri-devel, Dave Airlie

This fixes a regression in all these drivers since the cache
mode tracking was fixed for mixed mappings. It uses the new
arch API to add the VRAM range to the PAT mapping tracking
tables.

Fixes: 87744ab3832 (mm: fix cache mode tracking in vm_insert_mixed())
Signed-off-by: Dave Airlie <airlied@redhat.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 5 +++++
 drivers/gpu/drm/ast/ast_ttm.c              | 6 ++++++
 drivers/gpu/drm/cirrus/cirrus_ttm.c        | 7 +++++++
 drivers/gpu/drm/mgag200/mgag200_ttm.c      | 7 +++++++
 drivers/gpu/drm/nouveau/nouveau_ttm.c      | 8 ++++++++
 drivers/gpu/drm/radeon/radeon_object.c     | 5 +++++
 6 files changed, 38 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index aa074fa..f3efb1c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -754,6 +754,10 @@ static const char *amdgpu_vram_names[] = {
 
 int amdgpu_bo_init(struct amdgpu_device *adev)
 {
+	/* reserve PAT memory space to WC for VRAM */
+	arch_io_reserve_memtype_wc(adev->mc.aper_base,
+				   adev->mc.aper_size);
+
 	/* Add an MTRR for the VRAM */
 	adev->mc.vram_mtrr = arch_phys_wc_add(adev->mc.aper_base,
 					      adev->mc.aper_size);
@@ -769,6 +773,7 @@ void amdgpu_bo_fini(struct amdgpu_device *adev)
 {
 	amdgpu_ttm_fini(adev);
 	arch_phys_wc_del(adev->mc.vram_mtrr);
+	arch_io_free_memtype_wc(adev->mc.aper_base, adev->mc.aper_size);
 }
 
 int amdgpu_bo_fbdev_mmap(struct amdgpu_bo *bo,
diff --git a/drivers/gpu/drm/ast/ast_ttm.c b/drivers/gpu/drm/ast/ast_ttm.c
index 608df4c..0743e65 100644
--- a/drivers/gpu/drm/ast/ast_ttm.c
+++ b/drivers/gpu/drm/ast/ast_ttm.c
@@ -267,6 +267,8 @@ int ast_mm_init(struct ast_private *ast)
 		return ret;
 	}
 
+	arch_io_reserve_memtype_wc(pci_resource_start(dev->pdev, 0),
+				   pci_resource_len(dev->pdev, 0));
 	ast->fb_mtrr = arch_phys_wc_add(pci_resource_start(dev->pdev, 0),
 					pci_resource_len(dev->pdev, 0));
 
@@ -275,11 +277,15 @@ int ast_mm_init(struct ast_private *ast)
 
 void ast_mm_fini(struct ast_private *ast)
 {
+	struct drm_device *dev = ast->dev;
+
 	ttm_bo_device_release(&ast->ttm.bdev);
 
 	ast_ttm_global_release(ast);
 
 	arch_phys_wc_del(ast->fb_mtrr);
+	arch_io_free_memtype_wc(pci_resource_start(dev->pdev, 0),
+				pci_resource_len(dev->pdev, 0));
 }
 
 void ast_ttm_placement(struct ast_bo *bo, int domain)
diff --git a/drivers/gpu/drm/cirrus/cirrus_ttm.c b/drivers/gpu/drm/cirrus/cirrus_ttm.c
index bb2438d..5e7e63c 100644
--- a/drivers/gpu/drm/cirrus/cirrus_ttm.c
+++ b/drivers/gpu/drm/cirrus/cirrus_ttm.c
@@ -267,6 +267,9 @@ int cirrus_mm_init(struct cirrus_device *cirrus)
 		return ret;
 	}
 
+	arch_io_reserve_memtype_wc(pci_resource_start(dev->pdev, 0),
+				   pci_resource_len(dev->pdev, 0));
+
 	cirrus->fb_mtrr = arch_phys_wc_add(pci_resource_start(dev->pdev, 0),
 					   pci_resource_len(dev->pdev, 0));
 
@@ -276,6 +279,8 @@ int cirrus_mm_init(struct cirrus_device *cirrus)
 
 void cirrus_mm_fini(struct cirrus_device *cirrus)
 {
+	struct drm_device *dev = cirrus->dev;
+
 	if (!cirrus->mm_inited)
 		return;
 
@@ -285,6 +290,8 @@ void cirrus_mm_fini(struct cirrus_device *cirrus)
 
 	arch_phys_wc_del(cirrus->fb_mtrr);
 	cirrus->fb_mtrr = 0;
+	arch_io_free_memtype_wc(pci_resource_start(dev->pdev, 0),
+				pci_resource_len(dev->pdev, 0));
 }
 
 void cirrus_ttm_placement(struct cirrus_bo *bo, int domain)
diff --git a/drivers/gpu/drm/mgag200/mgag200_ttm.c b/drivers/gpu/drm/mgag200/mgag200_ttm.c
index 919b35f..dcf7d11 100644
--- a/drivers/gpu/drm/mgag200/mgag200_ttm.c
+++ b/drivers/gpu/drm/mgag200/mgag200_ttm.c
@@ -266,6 +266,9 @@ int mgag200_mm_init(struct mga_device *mdev)
 		return ret;
 	}
 
+	arch_io_reserve_memtype_wc(pci_resource_start(dev->pdev, 0),
+				   pci_resource_len(dev->pdev, 0));
+
 	mdev->fb_mtrr = arch_phys_wc_add(pci_resource_start(dev->pdev, 0),
 					 pci_resource_len(dev->pdev, 0));
 
@@ -274,10 +277,14 @@ int mgag200_mm_init(struct mga_device *mdev)
 
 void mgag200_mm_fini(struct mga_device *mdev)
 {
+	struct drm_device *dev = mdev->dev;
+
 	ttm_bo_device_release(&mdev->ttm.bdev);
 
 	mgag200_ttm_global_release(mdev);
 
+	arch_io_free_memtype_wc(pci_resource_start(dev->pdev, 0),
+				pci_resource_len(dev->pdev, 0));
 	arch_phys_wc_del(mdev->fb_mtrr);
 	mdev->fb_mtrr = 0;
 }
diff --git a/drivers/gpu/drm/nouveau/nouveau_ttm.c b/drivers/gpu/drm/nouveau/nouveau_ttm.c
index 1825dbc..a6dbe82 100644
--- a/drivers/gpu/drm/nouveau/nouveau_ttm.c
+++ b/drivers/gpu/drm/nouveau/nouveau_ttm.c
@@ -398,6 +398,9 @@ nouveau_ttm_init(struct nouveau_drm *drm)
 	/* VRAM init */
 	drm->gem.vram_available = drm->device.info.ram_user;
 
+	arch_io_reserve_memtype_wc(device->func->resource_addr(device, 1),
+				   device->func->resource_size(device, 1));
+
 	ret = ttm_bo_init_mm(&drm->ttm.bdev, TTM_PL_VRAM,
 			      drm->gem.vram_available >> PAGE_SHIFT);
 	if (ret) {
@@ -430,6 +433,8 @@ nouveau_ttm_init(struct nouveau_drm *drm)
 void
 nouveau_ttm_fini(struct nouveau_drm *drm)
 {
+	struct nvkm_device *device = nvxx_device(&drm->device);
+
 	ttm_bo_clean_mm(&drm->ttm.bdev, TTM_PL_VRAM);
 	ttm_bo_clean_mm(&drm->ttm.bdev, TTM_PL_TT);
 
@@ -439,4 +444,7 @@ nouveau_ttm_fini(struct nouveau_drm *drm)
 
 	arch_phys_wc_del(drm->ttm.mtrr);
 	drm->ttm.mtrr = 0;
+	arch_io_free_memtype_wc(device->func->resource_addr(device, 1),
+				device->func->resource_size(device, 1));
+
 }
diff --git a/drivers/gpu/drm/radeon/radeon_object.c b/drivers/gpu/drm/radeon/radeon_object.c
index be30861..41b72ce 100644
--- a/drivers/gpu/drm/radeon/radeon_object.c
+++ b/drivers/gpu/drm/radeon/radeon_object.c
@@ -446,6 +446,10 @@ void radeon_bo_force_delete(struct radeon_device *rdev)
 
 int radeon_bo_init(struct radeon_device *rdev)
 {
+	/* reserve PAT memory space to WC for VRAM */
+	arch_io_reserve_memtype_wc(rdev->mc.aper_base,
+				   rdev->mc.aper_size);
+
 	/* Add an MTRR for the VRAM */
 	if (!rdev->fastfb_working) {
 		rdev->mc.vram_mtrr = arch_phys_wc_add(rdev->mc.aper_base,
@@ -463,6 +467,7 @@ void radeon_bo_fini(struct radeon_device *rdev)
 {
 	radeon_ttm_fini(rdev);
 	arch_phys_wc_del(rdev->mc.vram_mtrr);
+	arch_io_free_memtype_wc(rdev->mc.aper_base, rdev->mc.aper_size);
 }
 
 /* Returns how many bytes TTM can move per IB.
-- 
2.5.5
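
Every driver touched above follows the same shape: reserve the VRAM range
as WC during memory-manager init, and free the same (base, size) pair
during teardown. The sketch below models that pairing in userspace with
mock functions. All mock_* names and the counter are hypothetical; the
real calls are arch_io_reserve_memtype_wc() and arch_io_free_memtype_wc().

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t resource_size_t;	/* stand-in for the kernel typedef */

static int mock_reserved;	/* outstanding WC reservations (mock state) */

/* Mock of the reserve call: records the reservation, always succeeds. */
static int mock_reserve_wc(resource_size_t base, resource_size_t size)
{
	(void)base;
	(void)size;
	mock_reserved++;
	return 0;
}

/* Mock of the free call: drops one reservation. */
static void mock_free_wc(resource_size_t base, resource_size_t size)
{
	(void)base;
	(void)size;
	mock_reserved--;
}

struct mock_dev {
	resource_size_t aper_base;	/* start of the VRAM aperture */
	resource_size_t aper_size;	/* length of the VRAM aperture */
};

/* Init: reserve the whole aperture once, before any mapping is created. */
static int mock_bo_init(struct mock_dev *dev)
{
	return mock_reserve_wc(dev->aper_base, dev->aper_size);
}

/* Fini: release exactly the same (base, size) pair. */
static void mock_bo_fini(struct mock_dev *dev)
{
	mock_free_wc(dev->aper_base, dev->aper_size);
}
```

The drivers in this patch ignore the return value of the reserve call;
the mock passes it through only so that a caller could check it.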

* Re: [PATCH 2/2] drm/drivers: add support for using the arch wc mapping API.
  2016-10-24  6:31 ` [PATCH 2/2] drm/drivers: add support for using the arch wc mapping API Dave Airlie
@ 2016-10-24  9:24   ` Christian König
  0 siblings, 0 replies; 15+ messages in thread
From: Christian König @ 2016-10-24  9:24 UTC (permalink / raw)
  To: Dave Airlie, mcgrof, torvalds, dan.j.williams, x86
  Cc: linux-kernel, dri-devel

On 24.10.2016 at 08:31, Dave Airlie wrote:
> This fixes a regression in all these drivers since the cache
> mode tracking was fixed for mixed mappings. It uses the new
> arch API to add the VRAM range to the PAT mapping tracking
> tables.
>
> Fixes: 87744ab3832 (mm: fix cache mode tracking in vm_insert_mixed())
> Signed-off-by: Dave Airlie <airlied@redhat.com>

Reviewed-by: Christian König <christian.koenig@amd.com>.

> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 5 +++++
>   drivers/gpu/drm/ast/ast_ttm.c              | 6 ++++++
>   drivers/gpu/drm/cirrus/cirrus_ttm.c        | 7 +++++++
>   drivers/gpu/drm/mgag200/mgag200_ttm.c      | 7 +++++++
>   drivers/gpu/drm/nouveau/nouveau_ttm.c      | 8 ++++++++
>   drivers/gpu/drm/radeon/radeon_object.c     | 5 +++++
>   6 files changed, 38 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> index aa074fa..f3efb1c 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> @@ -754,6 +754,10 @@ static const char *amdgpu_vram_names[] = {
>   
>   int amdgpu_bo_init(struct amdgpu_device *adev)
>   {
> +	/* reserve PAT memory space to WC for VRAM */
> +	arch_io_reserve_memtype_wc(adev->mc.aper_base,
> +				   adev->mc.aper_size);
> +
>   	/* Add an MTRR for the VRAM */
>   	adev->mc.vram_mtrr = arch_phys_wc_add(adev->mc.aper_base,
>   					      adev->mc.aper_size);
> @@ -769,6 +773,7 @@ void amdgpu_bo_fini(struct amdgpu_device *adev)
>   {
>   	amdgpu_ttm_fini(adev);
>   	arch_phys_wc_del(adev->mc.vram_mtrr);
> +	arch_io_free_memtype_wc(adev->mc.aper_base, adev->mc.aper_size);
>   }
>   
>   int amdgpu_bo_fbdev_mmap(struct amdgpu_bo *bo,
> diff --git a/drivers/gpu/drm/ast/ast_ttm.c b/drivers/gpu/drm/ast/ast_ttm.c
> index 608df4c..0743e65 100644
> --- a/drivers/gpu/drm/ast/ast_ttm.c
> +++ b/drivers/gpu/drm/ast/ast_ttm.c
> @@ -267,6 +267,8 @@ int ast_mm_init(struct ast_private *ast)
>   		return ret;
>   	}
>   
> +	arch_io_reserve_memtype_wc(pci_resource_start(dev->pdev, 0),
> +				   pci_resource_len(dev->pdev, 0));
>   	ast->fb_mtrr = arch_phys_wc_add(pci_resource_start(dev->pdev, 0),
>   					pci_resource_len(dev->pdev, 0));
>   
> @@ -275,11 +277,15 @@ int ast_mm_init(struct ast_private *ast)
>   
>   void ast_mm_fini(struct ast_private *ast)
>   {
> +	struct drm_device *dev = ast->dev;
> +
>   	ttm_bo_device_release(&ast->ttm.bdev);
>   
>   	ast_ttm_global_release(ast);
>   
>   	arch_phys_wc_del(ast->fb_mtrr);
> +	arch_io_free_memtype_wc(pci_resource_start(dev->pdev, 0),
> +				pci_resource_len(dev->pdev, 0));
>   }
>   
>   void ast_ttm_placement(struct ast_bo *bo, int domain)
> diff --git a/drivers/gpu/drm/cirrus/cirrus_ttm.c b/drivers/gpu/drm/cirrus/cirrus_ttm.c
> index bb2438d..5e7e63c 100644
> --- a/drivers/gpu/drm/cirrus/cirrus_ttm.c
> +++ b/drivers/gpu/drm/cirrus/cirrus_ttm.c
> @@ -267,6 +267,9 @@ int cirrus_mm_init(struct cirrus_device *cirrus)
>   		return ret;
>   	}
>   
> +	arch_io_reserve_memtype_wc(pci_resource_start(dev->pdev, 0),
> +				   pci_resource_len(dev->pdev, 0));
> +
>   	cirrus->fb_mtrr = arch_phys_wc_add(pci_resource_start(dev->pdev, 0),
>   					   pci_resource_len(dev->pdev, 0));
>   
> @@ -276,6 +279,8 @@ int cirrus_mm_init(struct cirrus_device *cirrus)
>   
>   void cirrus_mm_fini(struct cirrus_device *cirrus)
>   {
> +	struct drm_device *dev = cirrus->dev;
> +
>   	if (!cirrus->mm_inited)
>   		return;
>   
> @@ -285,6 +290,8 @@ void cirrus_mm_fini(struct cirrus_device *cirrus)
>   
>   	arch_phys_wc_del(cirrus->fb_mtrr);
>   	cirrus->fb_mtrr = 0;
> +	arch_io_free_memtype_wc(pci_resource_start(dev->pdev, 0),
> +				pci_resource_len(dev->pdev, 0));
>   }
>   
>   void cirrus_ttm_placement(struct cirrus_bo *bo, int domain)
> diff --git a/drivers/gpu/drm/mgag200/mgag200_ttm.c b/drivers/gpu/drm/mgag200/mgag200_ttm.c
> index 919b35f..dcf7d11 100644
> --- a/drivers/gpu/drm/mgag200/mgag200_ttm.c
> +++ b/drivers/gpu/drm/mgag200/mgag200_ttm.c
> @@ -266,6 +266,9 @@ int mgag200_mm_init(struct mga_device *mdev)
>   		return ret;
>   	}
>   
> +	arch_io_reserve_memtype_wc(pci_resource_start(dev->pdev, 0),
> +				   pci_resource_len(dev->pdev, 0));
> +
>   	mdev->fb_mtrr = arch_phys_wc_add(pci_resource_start(dev->pdev, 0),
>   					 pci_resource_len(dev->pdev, 0));
>   
> @@ -274,10 +277,14 @@ int mgag200_mm_init(struct mga_device *mdev)
>   
>   void mgag200_mm_fini(struct mga_device *mdev)
>   {
> +	struct drm_device *dev = mdev->dev;
> +
>   	ttm_bo_device_release(&mdev->ttm.bdev);
>   
>   	mgag200_ttm_global_release(mdev);
>   
> +	arch_io_free_memtype_wc(pci_resource_start(dev->pdev, 0),
> +				pci_resource_len(dev->pdev, 0));
>   	arch_phys_wc_del(mdev->fb_mtrr);
>   	mdev->fb_mtrr = 0;
>   }
> diff --git a/drivers/gpu/drm/nouveau/nouveau_ttm.c b/drivers/gpu/drm/nouveau/nouveau_ttm.c
> index 1825dbc..a6dbe82 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_ttm.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_ttm.c
> @@ -398,6 +398,9 @@ nouveau_ttm_init(struct nouveau_drm *drm)
>   	/* VRAM init */
>   	drm->gem.vram_available = drm->device.info.ram_user;
>   
> +	arch_io_reserve_memtype_wc(device->func->resource_addr(device, 1),
> +				   device->func->resource_size(device, 1));
> +
>   	ret = ttm_bo_init_mm(&drm->ttm.bdev, TTM_PL_VRAM,
>   			      drm->gem.vram_available >> PAGE_SHIFT);
>   	if (ret) {
> @@ -430,6 +433,8 @@ nouveau_ttm_init(struct nouveau_drm *drm)
>   void
>   nouveau_ttm_fini(struct nouveau_drm *drm)
>   {
> +	struct nvkm_device *device = nvxx_device(&drm->device);
> +
>   	ttm_bo_clean_mm(&drm->ttm.bdev, TTM_PL_VRAM);
>   	ttm_bo_clean_mm(&drm->ttm.bdev, TTM_PL_TT);
>   
> @@ -439,4 +444,7 @@ nouveau_ttm_fini(struct nouveau_drm *drm)
>   
>   	arch_phys_wc_del(drm->ttm.mtrr);
>   	drm->ttm.mtrr = 0;
> +	arch_io_free_memtype_wc(device->func->resource_addr(device, 1),
> +				device->func->resource_size(device, 1));
> +
>   }
> diff --git a/drivers/gpu/drm/radeon/radeon_object.c b/drivers/gpu/drm/radeon/radeon_object.c
> index be30861..41b72ce 100644
> --- a/drivers/gpu/drm/radeon/radeon_object.c
> +++ b/drivers/gpu/drm/radeon/radeon_object.c
> @@ -446,6 +446,10 @@ void radeon_bo_force_delete(struct radeon_device *rdev)
>   
>   int radeon_bo_init(struct radeon_device *rdev)
>   {
> +	/* reserve PAT memory space to WC for VRAM */
> +	arch_io_reserve_memtype_wc(rdev->mc.aper_base,
> +				   rdev->mc.aper_size);
> +
>   	/* Add an MTRR for the VRAM */
>   	if (!rdev->fastfb_working) {
>   		rdev->mc.vram_mtrr = arch_phys_wc_add(rdev->mc.aper_base,
> @@ -463,6 +467,7 @@ void radeon_bo_fini(struct radeon_device *rdev)
>   {
>   	radeon_ttm_fini(rdev);
>   	arch_phys_wc_del(rdev->mc.vram_mtrr);
> +	arch_io_free_memtype_wc(rdev->mc.aper_base, rdev->mc.aper_size);
>   }
>   
>   /* Returns how many bytes TTM can move per IB.

* Re: [PATCH 1/2] x86/io: add interface to reserve io memtype for a resource range. (v1.1)
  2016-10-24  6:31 ` [PATCH 1/2] x86/io: add interface to reserve io memtype for a resource range. (v1.1) Dave Airlie
@ 2016-10-25  9:40   ` Ingo Molnar
  2016-10-25 11:10   ` Thomas Gleixner
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 15+ messages in thread
From: Ingo Molnar @ 2016-10-25  9:40 UTC (permalink / raw)
  To: Dave Airlie
  Cc: mcgrof, torvalds, dan.j.williams, x86, linux-kernel, dri-devel,
	Toshi Kani, Borislav Petkov, H. Peter Anvin, Andy Lutomirski,
	Denys Vlasenko, Brian Gerst


* Dave Airlie <airlied@redhat.com> wrote:

> A recent change to the mm code in:
> 87744ab3832b83ba71b931f86f9cfdb000d07da5
> mm: fix cache mode tracking in vm_insert_mixed()
> 
> started enforcing a check of the memory type against the registered list
> for mixed pfn insertion mappings. It happens that the drm drivers for a
> number of GPUs relied on this being broken. Currently the drivers only
> insert VRAM mappings into the tracking table when they come from the
> kernel, and userspace mappings never land in the table. This led to a
> regression where all the mappings now end up as UC instead of WC.
> 
> I've considered a number of solutions but since this needs to be fixed
> in fixes and not next, and some of the solutions were going to introduce
> overhead that hadn't been there before I didn't consider them viable at
> this stage. These mainly concerned hooking into the TTM io reserve APIs,
> but these APIs have a bunch of fast paths I didn't want to unwind to add
> this to.
> 
> The solution I've decided on is to add a new API like the arch_phys_wc
> APIs (these would have worked but wc_del didn't take a range), and
> use them from the drivers to add a WC compatible mapping to the table
> for all VRAM on those GPUs. This means we can then create userspace
> mappings that won't get degraded to UC.
> 
> v1.1: use CONFIG_X86_PAT
> Cc: Toshi Kani <toshi.kani@hp.com>
> Cc: Borislav Petkov <bp@alien8.de>
> Cc: H. Peter Anvin <hpa@zytor.com>
> Cc: Andy Lutomirski <luto@kernel.org>
> Cc: Denys Vlasenko <dvlasenk@redhat.com>
> Cc: Brian Gerst <brgerst@gmail.com>
> Cc: x86@kernel.org
> Cc: mcgrof@suse.com
> Cc: Dan Williams <dan.j.williams@intel.com>
> Signed-off-by: Dave Airlie <airlied@redhat.com>
> ---
>  arch/x86/include/asm/io.h |  6 ++++++
>  arch/x86/mm/pat.c         | 13 +++++++++++++
>  include/linux/io.h        | 13 +++++++++++++
>  3 files changed, 32 insertions(+)

These changes look good to me in principle:

  Acked-by: Ingo Molnar <mingo@kernel.org>

I think it would be best to merge these fixes via the DRM tree?

Thanks,

	Ingo

* Re: [PATCH 1/2] x86/io: add interface to reserve io memtype for a resource range. (v1.1)
  2016-10-24  6:31 ` [PATCH 1/2] x86/io: add interface to reserve io memtype for a resource range. (v1.1) Dave Airlie
  2016-10-25  9:40   ` Ingo Molnar
@ 2016-10-25 11:10   ` Thomas Gleixner
  2016-10-25 17:31   ` Luis R. Rodriguez
  2016-10-26 17:48   ` [PATCH] x86/pat, mm: Make track_pfn_insert() return void Borislav Petkov
  3 siblings, 0 replies; 15+ messages in thread
From: Thomas Gleixner @ 2016-10-25 11:10 UTC (permalink / raw)
  To: Dave Airlie
  Cc: mcgrof, torvalds, dan.j.williams, x86, linux-kernel, dri-devel,
	Toshi Kani, Borislav Petkov, H. Peter Anvin, Andy Lutomirski,
	Denys Vlasenko, Brian Gerst

On Mon, 24 Oct 2016, Dave Airlie wrote:
> A recent change to the mm code in:
> 87744ab3832b83ba71b931f86f9cfdb000d07da5

nit: 12 digits of the SHA1 are sufficient :)

> +int arch_io_reserve_memtype_wc(resource_size_t start, resource_size_t size)
> +{
> +	enum page_cache_mode type = _PAGE_CACHE_MODE_WC;

Empty line between variable declaration and code please

> +	return io_reserve_memtype(start, start + size, &type);

Other than that:

Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
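
The wrapper quoted above turns a (start, size) pair into the range
start .. start + size before handing it to io_reserve_memtype(). As a
rough illustration of why the reservation matters, here is a toy userspace
model of a tracking table, assuming the end address is exclusive and that
an address with no entry is treated as UC, as the commit message
describes. It is not the kernel's memtype tracking, which is more
involved; every toy_* name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t resource_size_t;	/* stand-in for the kernel typedef */

enum toy_mode { TOY_UC, TOY_WC };	/* hypothetical cache-mode labels */

struct toy_range {
	resource_size_t start;	/* first byte covered */
	resource_size_t end;	/* one past the last byte covered */
	int used;
};

#define TOY_SLOTS 4
static struct toy_range toy_table[TOY_SLOTS];

/* Reserve [start, start + size) as WC; returns 0, or -1 if the table is full. */
static int toy_reserve_wc(resource_size_t start, resource_size_t size)
{
	int i;

	for (i = 0; i < TOY_SLOTS; i++) {
		if (!toy_table[i].used) {
			toy_table[i].start = start;
			toy_table[i].end = start + size;
			toy_table[i].used = 1;
			return 0;
		}
	}
	return -1;
}

/* Drop the entry that matches [start, start + size) exactly, if any. */
static void toy_free_wc(resource_size_t start, resource_size_t size)
{
	int i;

	for (i = 0; i < TOY_SLOTS; i++) {
		if (toy_table[i].used && toy_table[i].start == start &&
		    toy_table[i].end == start + size)
			toy_table[i].used = 0;
	}
}

/* Addresses outside every reserved range fall back to UC. */
static enum toy_mode toy_lookup(resource_size_t addr)
{
	int i;

	for (i = 0; i < TOY_SLOTS; i++) {
		if (toy_table[i].used && addr >= toy_table[i].start &&
		    addr < toy_table[i].end)
			return TOY_WC;
	}
	return TOY_UC;
}
```

In this model a lookup before toy_reserve_wc() yields TOY_UC, which
corresponds to the regression described in the commit message, and a
lookup inside the reserved range afterwards yields TOY_WC.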

* Re: [PATCH 1/2] x86/io: add interface to reserve io memtype for a resource range. (v1.1)
  2016-10-24  6:31 ` [PATCH 1/2] x86/io: add interface to reserve io memtype for a resource range. (v1.1) Dave Airlie
  2016-10-25  9:40   ` Ingo Molnar
  2016-10-25 11:10   ` Thomas Gleixner
@ 2016-10-25 17:31   ` Luis R. Rodriguez
  2016-10-26  5:49     ` Daniel Vetter
  2016-10-26 17:48   ` [PATCH] x86/pat, mm: Make track_pfn_insert() return void Borislav Petkov
  3 siblings, 1 reply; 15+ messages in thread
From: Luis R. Rodriguez @ 2016-10-25 17:31 UTC (permalink / raw)
  To: Dave Airlie
  Cc: torvalds, dan.j.williams, x86, linux-kernel, dri-devel,
	Toshi Kani, Borislav Petkov, H. Peter Anvin, Andy Lutomirski,
	Denys Vlasenko, Brian Gerst

On Mon, Oct 24, 2016 at 04:31:45PM +1000, Dave Airlie wrote:
> A recent change to the mm code in:
> 87744ab3832b83ba71b931f86f9cfdb000d07da5
> mm: fix cache mode tracking in vm_insert_mixed()
> 
> started enforcing a check of the memory type against the registered list
> for mixed pfn insertion mappings. It happens that the drm drivers for a
> number of GPUs relied on this being broken. Currently the drivers only
> insert VRAM mappings into the tracking table when they come from the
> kernel, and userspace mappings never land in the table. This led to a
> regression where all the mappings now end up as UC instead of WC.

Eek.

> I've considered a number of solutions but since this needs to be fixed
> in fixes and not next, and some of the solutions were going to introduce
> overhead that hadn't been there before I didn't consider them viable at
> this stage. These mainly concerned hooking into the TTM io reserve APIs,
> but these APIs have a bunch of fast paths I didn't want to unwind to add
> this to.
> 
> The solution I've decided on is to add a new API like the arch_phys_wc
> APIs (these would have worked but wc_del didn't take a range), and
> use them from the drivers to add a WC compatible mapping to the table
> for all VRAM on those GPUs. This means we can then create userspace
> mappings that won't get degraded to UC.

Is there anything a driver can check to tell when this is actually needed?
How will driver developers know? Can you add a bit of documentation to
the API? If it's a transitional step towards a secondary solution,
indicating so would help driver developers.

  Luis
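
As a rough illustration of the behaviour under discussion, here is a
minimal userspace model. It is only a sketch under stated assumptions:
every name in it (model_reserve_wc, model_free_wc, model_lookup, the
table layout) is hypothetical and is not the interface added by the
patch. It shows why a range that was never registered ends up as UC,
while a range reserved up front as WC keeps WC for later lookups.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cache modes for this model only. */
enum model_cache_mode { MODEL_WB, MODEL_WC, MODEL_UC };

/* One tracked physical range [start, end) and its cache mode. */
struct model_memtype {
	uint64_t start;
	uint64_t end;
	enum model_cache_mode mode;
	int used;
};

#define MODEL_MAX_ENTRIES 16
static struct model_memtype model_table[MODEL_MAX_ENTRIES];

/* Register [start, start + size) as WC. Returns 0, or -1 if the table is full. */
static int model_reserve_wc(uint64_t start, uint64_t size)
{
	size_t i;

	assert(size > 0);
	for (i = 0; i < MODEL_MAX_ENTRIES; i++) {
		if (!model_table[i].used) {
			model_table[i].start = start;
			model_table[i].end = start + size;
			model_table[i].mode = MODEL_WC;
			model_table[i].used = 1;
			return 0;
		}
	}
	return -1;
}

/* Drop the entry that matches exactly this range, if there is one. */
static void model_free_wc(uint64_t start, uint64_t size)
{
	size_t i;

	for (i = 0; i < MODEL_MAX_ENTRIES; i++) {
		if (model_table[i].used && model_table[i].start == start &&
		    model_table[i].end == start + size)
			model_table[i].used = 0;
	}
}

/* Addresses without a table entry fall back to UC in this model. */
static enum model_cache_mode model_lookup(uint64_t paddr)
{
	size_t i;

	for (i = 0; i < MODEL_MAX_ENTRIES; i++) {
		if (model_table[i].used && paddr >= model_table[i].start &&
		    paddr < model_table[i].end)
			return model_table[i].mode;
	}
	return MODEL_UC;
}
```

In this model a driver would call model_reserve_wc() once for its whole
VRAM aperture at load time and model_free_wc() at unload. Any lookup in
between then returns WC for addresses inside the aperture.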

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/2] x86/io: add interface to reserve io memtype for a resource range. (v1.1)
  2016-10-25 17:31   ` Luis R. Rodriguez
@ 2016-10-26  5:49     ` Daniel Vetter
  2016-10-26  6:12       ` Dave Airlie
  0 siblings, 1 reply; 15+ messages in thread
From: Daniel Vetter @ 2016-10-26  5:49 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: Dave Airlie, Toshi Kani, Brian Gerst, x86, linux-kernel,
	dri-devel, Borislav Petkov, Andy Lutomirski, H. Peter Anvin,
	Denys Vlasenko, dan.j.williams, torvalds

On Tue, Oct 25, 2016 at 07:31:29PM +0200, Luis R. Rodriguez wrote:
> On Mon, Oct 24, 2016 at 04:31:45PM +1000, Dave Airlie wrote:
> > A recent change to the mm code in:
> > 87744ab3832b83ba71b931f86f9cfdb000d07da5
> > mm: fix cache mode tracking in vm_insert_mixed()
> > 
> > started enforcing checking the memory type against the registered list for
> > mixed pfn insertion mappings. It happens that the drm drivers for a number
> > of gpus relied on this being broken. Currently the driver only inserted
> > VRAM mappings into the tracking table when they came from the kernel,
> > and userspace mappings never landed in the table. This led to a regression
> > where all the mappings end up as UC instead of WC now.
> 
> Eek.
> 
> > I've considered a number of solutions but since this needs to be fixed
> > in fixes and not next, and some of the solutions were going to introduce
> > overhead that hadn't been there before I didn't consider them viable at
> > this stage. These mainly concerned hooking into the TTM io reserve APIs,
> > but these APIs have a bunch of fast paths I didn't want to unwind to add
> > this to.
> > 
> > The solution I've decided on is to add a new API like the arch_phys_wc
> > APIs (these would have worked but wc_del didn't take a range), and
> > use them from the drivers to add a WC compatible mapping to the table
> > for all VRAM on those GPUs. This means we can then create userspace
> > mappings that won't get degraded to UC.
> 
> Is there anything a driver can check to tell when this is actually needed?
> How will driver developers know? Can you add a bit of documentation to
> the API? If it's a transitional step towards a secondary solution,
> indicating so would help driver developers.

I'll plug the io-mapping stuff again here, and more specifically the
userspace pte wrangling stuff we've added in 4.9 to i915_mm.c. Should
probably move that one to the core. That way io_mapping takes care of the
full reservation, and allows you to on-demand kmap (for kernel) and write
ptes. All nicely fast and all, and for bonus, also nicely encapsulated.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/2] x86/io: add interface to reserve io memtype for a resource range. (v1.1)
  2016-10-26  5:49     ` Daniel Vetter
@ 2016-10-26  6:12       ` Dave Airlie
  2016-10-26  7:00         ` Daniel Vetter
  0 siblings, 1 reply; 15+ messages in thread
From: Dave Airlie @ 2016-10-26  6:12 UTC (permalink / raw)
  To: Luis R. Rodriguez, Dave Airlie, Toshi Kani, Brian Gerst, x86,
	LKML, dri-devel, Borislav Petkov, Andy Lutomirski,
	H. Peter Anvin, Denys Vlasenko, Dan Williams, Linus Torvalds

>>
>> Is there anything a driver can check to tell when this is actually needed?
>> How will driver developers know? Can you add a bit of documentation to
>> the API? If it's a transitional step towards a secondary solution,
>> indicating so would help driver developers.
>
> I'll plug the io-mapping stuff again here, and more specifically the
> userspace pte wrangling stuff we've added in 4.9 to i915_mm.c. Should
> probably move that one to the core. That way io_mapping takes care of the
> full reservation, and allows you to on-demand kmap (for kernel) and write
> ptes. All nicely fast and all, and for bonus, also nicely encapsulated.

Yeah I think ideally we'd want to move towards that, however we don't tend
to want to ioremap the full range even on 64-bit, which is what io-mapping does.

At least on most GPUs with VRAM we rarely want to map VRAM for much,
I think page tables and fbcon are probably the main two uses for touching
it at all.

So I don't think we need to be as efficient as i915 in this area.

Dave.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/2] x86/io: add interface to reserve io memtype for a resource range. (v1.1)
  2016-10-26  6:12       ` Dave Airlie
@ 2016-10-26  7:00         ` Daniel Vetter
  0 siblings, 0 replies; 15+ messages in thread
From: Daniel Vetter @ 2016-10-26  7:00 UTC (permalink / raw)
  To: Dave Airlie
  Cc: Luis R. Rodriguez, Dave Airlie, Toshi Kani, Brian Gerst, X86 ML,
	LKML, dri-devel, Borislav Petkov, Andy Lutomirski,
	H. Peter Anvin, Denys Vlasenko, Dan Williams, Linus Torvalds

On Wed, Oct 26, 2016 at 8:12 AM, Dave Airlie <airlied@gmail.com> wrote:
>>> Is there anything a driver can check to tell when this is actually needed?
>>> How will driver developers know? Can you add a bit of documentation to
>>> the API? If it's a transitional step towards a secondary solution,
>>> indicating so would help driver developers.
>>
>> I'll plug the io-mapping stuff again here, and more specifically the
>> userspace pte wrangling stuff we've added in 4.9 to i915_mm.c. Should
>> probably move that one to the core. That way io_mapping takes care of the
>> full reservation, and allows you to on-demand kmap (for kernel) and write
>> ptes. All nicely fast and all, and for bonus, also nicely encapsulated.
>
> Yeah I think ideally we'd want to move towards that, however we don't tend
> to want to ioremap the full range even on 64-bit, which is what io-mapping does.

Hm, I thought on 64-bit we have linear mappings of all the io space
anyway, and they're essentially for free. Am I wrong and there's some
overhead here too?
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH] x86/pat, mm: Make track_pfn_insert() return void
  2016-10-24  6:31 ` [PATCH 1/2] x86/io: add interface to reserve io memtype for a resource range. (v1.1) Dave Airlie
                     ` (2 preceding siblings ...)
  2016-10-25 17:31   ` Luis R. Rodriguez
@ 2016-10-26 17:48   ` Borislav Petkov
  2016-11-09 20:42     ` [tip:mm/pat] " tip-bot for Borislav Petkov
  3 siblings, 1 reply; 15+ messages in thread
From: Borislav Petkov @ 2016-10-26 17:48 UTC (permalink / raw)
  To: Dave Airlie
  Cc: mcgrof, torvalds, dan.j.williams, x86, linux-kernel, dri-devel,
	Toshi Kani, H. Peter Anvin, Andy Lutomirski, Denys Vlasenko,
	Brian Gerst

On Mon, Oct 24, 2016 at 04:31:45PM +1000, Dave Airlie wrote:
> A recent change to the mm code in:
> 87744ab3832b83ba71b931f86f9cfdb000d07da5
> mm: fix cache mode tracking in vm_insert_mixed()

While we're at it, let's simplify that track_pfn_insert() thing:

---
From 6feb0b253e1fcccbcbc8ab3e8838db09e39b0466 Mon Sep 17 00:00:00 2001
From: Borislav Petkov <bp@suse.de>
Date: Wed, 26 Oct 2016 19:43:43 +0200
Subject: [PATCH] x86/pat, mm: Make track_pfn_insert() return void

It only returns 0 so we can save us the testing of its retval
everywhere.

Signed-off-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/mm/pat.c             | 7 ++-----
 include/asm-generic/pgtable.h | 9 ++++-----
 mm/huge_memory.c              | 5 +++--
 mm/memory.c                   | 8 ++++----
 4 files changed, 13 insertions(+), 16 deletions(-)

diff --git a/arch/x86/mm/pat.c b/arch/x86/mm/pat.c
index 170cc4ff057b..025bf1b929c0 100644
--- a/arch/x86/mm/pat.c
+++ b/arch/x86/mm/pat.c
@@ -972,20 +972,17 @@ int track_pfn_remap(struct vm_area_struct *vma, pgprot_t *prot,
 	return 0;
 }
 
-int track_pfn_insert(struct vm_area_struct *vma, pgprot_t *prot,
-		     pfn_t pfn)
+void track_pfn_insert(struct vm_area_struct *vma, pgprot_t *prot, pfn_t pfn)
 {
 	enum page_cache_mode pcm;
 
 	if (!pat_enabled())
-		return 0;
+		return;
 
 	/* Set prot based on lookup */
 	pcm = lookup_memtype(pfn_t_to_phys(pfn));
 	*prot = __pgprot((pgprot_val(*prot) & (~_PAGE_CACHE_MASK)) |
 			 cachemode2protval(pcm));
-
-	return 0;
 }
 
 /*
diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h
index c4f8fd2fd384..41b95d82a185 100644
--- a/include/asm-generic/pgtable.h
+++ b/include/asm-generic/pgtable.h
@@ -558,10 +558,9 @@ static inline int track_pfn_remap(struct vm_area_struct *vma, pgprot_t *prot,
  * track_pfn_insert is called when a _new_ single pfn is established
  * by vm_insert_pfn().
  */
-static inline int track_pfn_insert(struct vm_area_struct *vma, pgprot_t *prot,
-				   pfn_t pfn)
+static inline void track_pfn_insert(struct vm_area_struct *vma, pgprot_t *prot,
+				    pfn_t pfn)
 {
-	return 0;
 }
 
 /*
@@ -593,8 +592,8 @@ static inline void untrack_pfn_moved(struct vm_area_struct *vma)
 extern int track_pfn_remap(struct vm_area_struct *vma, pgprot_t *prot,
 			   unsigned long pfn, unsigned long addr,
 			   unsigned long size);
-extern int track_pfn_insert(struct vm_area_struct *vma, pgprot_t *prot,
-			    pfn_t pfn);
+extern void track_pfn_insert(struct vm_area_struct *vma, pgprot_t *prot,
+			     pfn_t pfn);
 extern int track_pfn_copy(struct vm_area_struct *vma);
 extern void untrack_pfn(struct vm_area_struct *vma, unsigned long pfn,
 			unsigned long size);
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index cdcd25cb30fe..113aaa4278b9 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -737,8 +737,9 @@ int vmf_insert_pfn_pmd(struct vm_area_struct *vma, unsigned long addr,
 
 	if (addr < vma->vm_start || addr >= vma->vm_end)
 		return VM_FAULT_SIGBUS;
-	if (track_pfn_insert(vma, &pgprot, pfn))
-		return VM_FAULT_SIGBUS;
+
+	track_pfn_insert(vma, &pgprot, pfn);
+
 	insert_pfn_pmd(vma, addr, pmd, pfn, pgprot, write);
 	return VM_FAULT_NOPAGE;
 }
diff --git a/mm/memory.c b/mm/memory.c
index e18c57bdc75c..33f45edf8272 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1637,8 +1637,8 @@ int vm_insert_pfn_prot(struct vm_area_struct *vma, unsigned long addr,
 
 	if (addr < vma->vm_start || addr >= vma->vm_end)
 		return -EFAULT;
-	if (track_pfn_insert(vma, &pgprot, __pfn_to_pfn_t(pfn, PFN_DEV)))
-		return -EINVAL;
+
+	track_pfn_insert(vma, &pgprot, __pfn_to_pfn_t(pfn, PFN_DEV));
 
 	ret = insert_pfn(vma, addr, __pfn_to_pfn_t(pfn, PFN_DEV), pgprot);
 
@@ -1655,8 +1655,8 @@ int vm_insert_mixed(struct vm_area_struct *vma, unsigned long addr,
 
 	if (addr < vma->vm_start || addr >= vma->vm_end)
 		return -EFAULT;
-	if (track_pfn_insert(vma, &pgprot, pfn))
-		return -EINVAL;
+
+	track_pfn_insert(vma, &pgprot, pfn);
 
 	/*
 	 * If we don't have pte special, then we have to use the pfn_valid()
-- 
2.10.0

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
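
For readers following the patch, here is a small userspace sketch of
what the function body does after the change. It is a model under
stated assumptions, not kernel code: the bit values, the fixed WC
window and all model_* names are hypothetical. The point it illustrates
is that the function has no failure path. It either returns early when
PAT is off, or it overwrites the cache bits of the protection value
with whatever the lookup reports, so there is nothing for callers to
test.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t model_pgprot_t;

/* Hypothetical cache-control bits, chosen for illustration only. */
#define MODEL_CACHE_MASK UINT64_C(0x18)
#define MODEL_PROT_WC    UINT64_C(0x08)
#define MODEL_PROT_UC    UINT64_C(0x10)

/* Hypothetical physical window that the model treats as registered WC. */
#define MODEL_WC_START   UINT64_C(0xd0000000)
#define MODEL_WC_END     UINT64_C(0xd1000000)

static int model_pat_enabled = 1;

/* Stand-in for the memtype lookup: WC inside the window, UC elsewhere. */
static uint64_t model_lookup_protval(uint64_t paddr)
{
	if (paddr >= MODEL_WC_START && paddr < MODEL_WC_END)
		return MODEL_PROT_WC;
	return MODEL_PROT_UC;
}

/* Returns void: both paths succeed, so there is no status to report. */
static void model_track_pfn_insert(model_pgprot_t *prot, uint64_t paddr)
{
	assert(prot != NULL);

	if (!model_pat_enabled)
		return;

	/* Clear the old cache bits, then set the ones from the lookup. */
	*prot = (*prot & ~MODEL_CACHE_MASK) | model_lookup_protval(paddr);
}
```

The mask is declared as a 64-bit constant on purpose. With a 32-bit
mask the complement would also clear the upper half of the protection
value.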

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [tip:mm/pat] x86/pat, mm: Make track_pfn_insert() return void
  2016-10-26 17:48   ` [PATCH] x86/pat, mm: Make track_pfn_insert() return void Borislav Petkov
@ 2016-11-09 20:42     ` tip-bot for Borislav Petkov
  0 siblings, 0 replies; 15+ messages in thread
From: tip-bot for Borislav Petkov @ 2016-11-09 20:42 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: dvlasenk, tglx, hpa, brgerst, airlied, linux-kernel, luto,
	toshi.kani, mingo, bp

Commit-ID:  308a047c3f6b61cc4007c0051fe420197ea58f86
Gitweb:     http://git.kernel.org/tip/308a047c3f6b61cc4007c0051fe420197ea58f86
Author:     Borislav Petkov <bp@suse.de>
AuthorDate: Wed, 26 Oct 2016 19:43:43 +0200
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Wed, 9 Nov 2016 21:36:07 +0100

x86/pat, mm: Make track_pfn_insert() return void

It only returns 0 so we can save us the testing of its retval
everywhere.

Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Toshi Kani <toshi.kani@hp.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: mcgrof@suse.com
Cc: dri-devel@lists.freedesktop.org
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Dave Airlie <airlied@redhat.com>
Cc: dan.j.williams@intel.com
Cc: torvalds@linux-foundation.org
Link: http://lkml.kernel.org/r/20161026174839.rusfxkm3xt4ennhe@pd.tnic
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/mm/pat.c             | 7 ++-----
 include/asm-generic/pgtable.h | 9 ++++-----
 mm/huge_memory.c              | 5 +++--
 mm/memory.c                   | 8 ++++----
 4 files changed, 13 insertions(+), 16 deletions(-)

diff --git a/arch/x86/mm/pat.c b/arch/x86/mm/pat.c
index 83e701f..efc32bc 100644
--- a/arch/x86/mm/pat.c
+++ b/arch/x86/mm/pat.c
@@ -986,20 +986,17 @@ int track_pfn_remap(struct vm_area_struct *vma, pgprot_t *prot,
 	return 0;
 }
 
-int track_pfn_insert(struct vm_area_struct *vma, pgprot_t *prot,
-		     pfn_t pfn)
+void track_pfn_insert(struct vm_area_struct *vma, pgprot_t *prot, pfn_t pfn)
 {
 	enum page_cache_mode pcm;
 
 	if (!pat_enabled())
-		return 0;
+		return;
 
 	/* Set prot based on lookup */
 	pcm = lookup_memtype(pfn_t_to_phys(pfn));
 	*prot = __pgprot((pgprot_val(*prot) & (~_PAGE_CACHE_MASK)) |
 			 cachemode2protval(pcm));
-
-	return 0;
 }
 
 /*
diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h
index c4f8fd2..41b95d8 100644
--- a/include/asm-generic/pgtable.h
+++ b/include/asm-generic/pgtable.h
@@ -558,10 +558,9 @@ static inline int track_pfn_remap(struct vm_area_struct *vma, pgprot_t *prot,
  * track_pfn_insert is called when a _new_ single pfn is established
  * by vm_insert_pfn().
  */
-static inline int track_pfn_insert(struct vm_area_struct *vma, pgprot_t *prot,
-				   pfn_t pfn)
+static inline void track_pfn_insert(struct vm_area_struct *vma, pgprot_t *prot,
+				    pfn_t pfn)
 {
-	return 0;
 }
 
 /*
@@ -593,8 +592,8 @@ static inline void untrack_pfn_moved(struct vm_area_struct *vma)
 extern int track_pfn_remap(struct vm_area_struct *vma, pgprot_t *prot,
 			   unsigned long pfn, unsigned long addr,
 			   unsigned long size);
-extern int track_pfn_insert(struct vm_area_struct *vma, pgprot_t *prot,
-			    pfn_t pfn);
+extern void track_pfn_insert(struct vm_area_struct *vma, pgprot_t *prot,
+			     pfn_t pfn);
 extern int track_pfn_copy(struct vm_area_struct *vma);
 extern void untrack_pfn(struct vm_area_struct *vma, unsigned long pfn,
 			unsigned long size);
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index cdcd25c..113aaa4 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -737,8 +737,9 @@ int vmf_insert_pfn_pmd(struct vm_area_struct *vma, unsigned long addr,
 
 	if (addr < vma->vm_start || addr >= vma->vm_end)
 		return VM_FAULT_SIGBUS;
-	if (track_pfn_insert(vma, &pgprot, pfn))
-		return VM_FAULT_SIGBUS;
+
+	track_pfn_insert(vma, &pgprot, pfn);
+
 	insert_pfn_pmd(vma, addr, pmd, pfn, pgprot, write);
 	return VM_FAULT_NOPAGE;
 }
diff --git a/mm/memory.c b/mm/memory.c
index e18c57b..33f45ed 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1637,8 +1637,8 @@ int vm_insert_pfn_prot(struct vm_area_struct *vma, unsigned long addr,
 
 	if (addr < vma->vm_start || addr >= vma->vm_end)
 		return -EFAULT;
-	if (track_pfn_insert(vma, &pgprot, __pfn_to_pfn_t(pfn, PFN_DEV)))
-		return -EINVAL;
+
+	track_pfn_insert(vma, &pgprot, __pfn_to_pfn_t(pfn, PFN_DEV));
 
 	ret = insert_pfn(vma, addr, __pfn_to_pfn_t(pfn, PFN_DEV), pgprot);
 
@@ -1655,8 +1655,8 @@ int vm_insert_mixed(struct vm_area_struct *vma, unsigned long addr,
 
 	if (addr < vma->vm_start || addr >= vma->vm_end)
 		return -EFAULT;
-	if (track_pfn_insert(vma, &pgprot, pfn))
-		return -EINVAL;
+
+	track_pfn_insert(vma, &pgprot, pfn);
 
 	/*
 	 * If we don't have pte special, then we have to use the pfn_valid()

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2016-11-09 20:43 UTC | newest]

Thread overview: 15+ messages
2016-10-24  6:31 x86 PAT memtype regression fixes (with extra cc's) Dave Airlie
2016-10-24  6:31 ` [PATCH 1/2] x86/io: add interface to reserve io memtype for a resource range. (v1.1) Dave Airlie
2016-10-25  9:40   ` Ingo Molnar
2016-10-25 11:10   ` Thomas Gleixner
2016-10-25 17:31   ` Luis R. Rodriguez
2016-10-26  5:49     ` Daniel Vetter
2016-10-26  6:12       ` Dave Airlie
2016-10-26  7:00         ` Daniel Vetter
2016-10-26 17:48   ` [PATCH] x86/pat, mm: Make track_pfn_insert() return void Borislav Petkov
2016-11-09 20:42     ` [tip:mm/pat] " tip-bot for Borislav Petkov
2016-10-24  6:31 ` [PATCH 2/2] drm/drivers: add support for using the arch wc mapping API Dave Airlie
2016-10-24  9:24   ` Christian König
