* [PATCH 0/2] x86/pat: fix querying available caching modes
@ 2022-05-03 13:22 ` Juergen Gross
  0 siblings, 0 replies; 80+ messages in thread
From: Juergen Gross @ 2022-05-03 13:22 UTC (permalink / raw)
  To: xen-devel, x86, linux-kernel, intel-gfx, dri-devel
  Cc: jbeulich, Juergen Gross, Dave Hansen, Andy Lutomirski,
	Peter Zijlstra, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	H. Peter Anvin, Jani Nikula, Joonas Lahtinen, Rodrigo Vivi,
	Tvrtko Ursulin, David Airlie, Daniel Vetter

Fix some issues with querying which caching modes are available for
memory mappings.

This replaces the patch Jan sent recently:

https://lists.xen.org/archives/html/xen-devel/2022-04/msg02392.html

Juergen Gross (2):
  x86/pat: fix x86_has_pat_wp()
  x86/pat: add functions to query specific cache mode availability

 arch/x86/include/asm/memtype.h           |  2 ++
 arch/x86/include/asm/pci.h               |  2 +-
 arch/x86/mm/init.c                       | 24 ++++++++++++++++++++++--
 drivers/gpu/drm/i915/gem/i915_gem_mman.c |  8 ++++----
 4 files changed, 29 insertions(+), 7 deletions(-)

-- 
2.35.3


^ permalink raw reply	[flat|nested] 80+ messages in thread

* [PATCH 1/2] x86/pat: fix x86_has_pat_wp()
  2022-05-03 13:22 ` Juergen Gross
  (?)
  (?)
@ 2022-05-03 13:22 ` Juergen Gross
  2022-05-27 10:21   ` Juergen Gross
                     ` (2 more replies)
  -1 siblings, 3 replies; 80+ messages in thread
From: Juergen Gross @ 2022-05-03 13:22 UTC (permalink / raw)
  To: xen-devel, x86, linux-kernel
  Cc: jbeulich, Juergen Gross, Dave Hansen, Andy Lutomirski,
	Peter Zijlstra, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	H. Peter Anvin

x86_has_pat_wp() uses the wrong test, as it relies on the normal PAT
configuration used by the kernel. If the PAT MSR has been set up by
another entity (e.g. the BIOS or the Xen hypervisor), it might return
false even if the PAT configuration allows WP mappings.

Fixes: 1f6f655e01ad ("x86/mm: Add a x86_has_pat_wp() helper")
Signed-off-by: Juergen Gross <jgross@suse.com>
---
 arch/x86/mm/init.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
index d8cfce221275..71e182ebced3 100644
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -80,7 +80,8 @@ static uint8_t __pte2cachemode_tbl[8] = {
 /* Check that the write-protect PAT entry is set for write-protect */
 bool x86_has_pat_wp(void)
 {
-	return __pte2cachemode_tbl[_PAGE_CACHE_MODE_WP] == _PAGE_CACHE_MODE_WP;
+	return __pte2cachemode_tbl[__cachemode2pte_tbl[_PAGE_CACHE_MODE_WP]] ==
+	       _PAGE_CACHE_MODE_WP;
 }
 
 enum page_cache_mode pgprot2cachemode(pgprot_t pgprot)
-- 
2.35.3
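
The difference between the old and the new test can be modelled in a few lines of user-space C. This is only a sketch under simplifying assumptions: the names (slot2mode, mode2slot, set_layout, has_wp_old, has_wp_new) and the enum are hypothetical stand-ins, and mode2slot stores a plain PAT slot index, whereas the kernel's __cachemode2pte_tbl may store something else (such as PTE flag bits). The two example layouts mentioned below are illustrative and are not taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's _PAGE_CACHE_MODE_* values. */
enum cache_mode { CM_WB, CM_WC, CM_UC_MINUS, CM_UC, CM_WT, CM_WP, CM_NUM };

/* PAT slot -> cache mode (models __pte2cachemode_tbl). */
static uint8_t slot2mode[8];
/* Cache mode -> PAT slot (simplified model of __cachemode2pte_tbl). */
static uint8_t mode2slot[CM_NUM];

/* Fill both tables from a slot layout; each mode maps to its first slot. */
static void set_layout(const uint8_t layout[8])
{
	bool seen[CM_NUM] = { false };

	for (int mode = 0; mode < CM_NUM; mode++)
		mode2slot[mode] = 0;	/* fallback for modes without a slot */

	for (int slot = 0; slot < 8; slot++) {
		uint8_t mode = layout[slot];

		assert(mode < CM_NUM);
		slot2mode[slot] = mode;
		if (!seen[mode]) {
			seen[mode] = true;
			mode2slot[mode] = (uint8_t)slot;
		}
	}
}

/* Old test: treats the cache mode value itself as a slot index. */
static bool has_wp_old(void)
{
	return slot2mode[CM_WP] == CM_WP;
}

/* New test: look up the slot chosen for WP, then check what it holds. */
static bool has_wp_new(void)
{
	return slot2mode[mode2slot[CM_WP]] == CM_WP;
}
```

With a layout that happens to keep WP in slot 5 (the numeric value of CM_WP in this model), both tests agree. With a layout that places WP in another slot, say slot 6, has_wp_old() returns false while has_wp_new() still finds the mode, which is the false negative described in the commit message.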


^ permalink raw reply related	[flat|nested] 80+ messages in thread

* [PATCH 2/2] x86/pat: add functions to query specific cache mode availability
  2022-05-03 13:22 ` Juergen Gross
  (?)
@ 2022-05-03 13:22   ` Juergen Gross
  -1 siblings, 0 replies; 80+ messages in thread
From: Juergen Gross @ 2022-05-03 13:22 UTC (permalink / raw)
  To: xen-devel, x86, linux-kernel, intel-gfx, dri-devel
  Cc: jbeulich, Juergen Gross, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H. Peter Anvin, Andy Lutomirski,
	Peter Zijlstra, Jani Nikula, Joonas Lahtinen, Rodrigo Vivi,
	Tvrtko Ursulin, David Airlie, Daniel Vetter

Some drivers use pat_enabled() to test whether special caching modes
(WC and UC-) are available. This leads to false negatives if the system
was booted e.g. with the "nopat" parameter and the BIOS has set up the
PAT MSR to support the queried mode, or if the system is running as a
Xen PV guest.

Add test functions for those caching modes instead and use them in the
appropriate places.

For symmetry, also export the already existing x86_has_pat_wp() to
modules.

Fixes: bdd8b6c98239 ("drm/i915: replace X86_FEATURE_PAT with pat_enabled()")
Fixes: ae749c7ab475 ("PCI: Add arch_can_pci_mmap_wc() macro")
Signed-off-by: Juergen Gross <jgross@suse.com>
---
 arch/x86/include/asm/memtype.h           |  2 ++
 arch/x86/include/asm/pci.h               |  2 +-
 arch/x86/mm/init.c                       | 25 +++++++++++++++++++++---
 drivers/gpu/drm/i915/gem/i915_gem_mman.c |  8 ++++----
 4 files changed, 29 insertions(+), 8 deletions(-)

diff --git a/arch/x86/include/asm/memtype.h b/arch/x86/include/asm/memtype.h
index 9ca760e430b9..d00e0be854d4 100644
--- a/arch/x86/include/asm/memtype.h
+++ b/arch/x86/include/asm/memtype.h
@@ -25,6 +25,8 @@ extern void memtype_free_io(resource_size_t start, resource_size_t end);
 extern bool pat_pfn_immune_to_uc_mtrr(unsigned long pfn);
 
 bool x86_has_pat_wp(void);
+bool x86_has_pat_wc(void);
+bool x86_has_pat_uc_minus(void);
 enum page_cache_mode pgprot2cachemode(pgprot_t pgprot);
 
 #endif /* _ASM_X86_MEMTYPE_H */
diff --git a/arch/x86/include/asm/pci.h b/arch/x86/include/asm/pci.h
index f3fd5928bcbb..a5742268dec1 100644
--- a/arch/x86/include/asm/pci.h
+++ b/arch/x86/include/asm/pci.h
@@ -94,7 +94,7 @@ int pcibios_set_irq_routing(struct pci_dev *dev, int pin, int irq);
 
 
 #define HAVE_PCI_MMAP
-#define arch_can_pci_mmap_wc()	pat_enabled()
+#define arch_can_pci_mmap_wc()	x86_has_pat_wc()
 #define ARCH_GENERIC_PCI_MMAP_RESOURCE
 
 #ifdef CONFIG_PCI
diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
index 71e182ebced3..b6431f714dc2 100644
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -77,12 +77,31 @@ static uint8_t __pte2cachemode_tbl[8] = {
 	[__pte2cm_idx(_PAGE_PWT | _PAGE_PCD | _PAGE_PAT)] = _PAGE_CACHE_MODE_UC,
 };
 
-/* Check that the write-protect PAT entry is set for write-protect */
+static bool x86_has_pat_mode(unsigned int mode)
+{
+	return __pte2cachemode_tbl[__cachemode2pte_tbl[mode]] == mode;
+}
+
+/* Check that PAT supports write-protect */
 bool x86_has_pat_wp(void)
 {
-	return __pte2cachemode_tbl[__cachemode2pte_tbl[_PAGE_CACHE_MODE_WP]] ==
-	       _PAGE_CACHE_MODE_WP;
+	return x86_has_pat_mode(_PAGE_CACHE_MODE_WP);
+}
+EXPORT_SYMBOL_GPL(x86_has_pat_wp);
+
+/* Check that PAT supports WC */
+bool x86_has_pat_wc(void)
+{
+	return x86_has_pat_mode(_PAGE_CACHE_MODE_WC);
+}
+EXPORT_SYMBOL_GPL(x86_has_pat_wc);
+
+/* Check that PAT supports UC- */
+bool x86_has_pat_uc_minus(void)
+{
+	return x86_has_pat_mode(_PAGE_CACHE_MODE_UC_MINUS);
 }
+EXPORT_SYMBOL_GPL(x86_has_pat_uc_minus);
 
 enum page_cache_mode pgprot2cachemode(pgprot_t pgprot)
 {
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_mman.c b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
index 0c5c43852e24..f43ecf3f63eb 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_mman.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
@@ -76,7 +76,7 @@ i915_gem_mmap_ioctl(struct drm_device *dev, void *data,
 	if (args->flags & ~(I915_MMAP_WC))
 		return -EINVAL;
 
-	if (args->flags & I915_MMAP_WC && !pat_enabled())
+	if (args->flags & I915_MMAP_WC && !x86_has_pat_wc())
 		return -ENODEV;
 
 	obj = i915_gem_object_lookup(file, args->handle);
@@ -757,7 +757,7 @@ i915_gem_dumb_mmap_offset(struct drm_file *file,
 
 	if (HAS_LMEM(to_i915(dev)))
 		mmap_type = I915_MMAP_TYPE_FIXED;
-	else if (pat_enabled())
+	else if (x86_has_pat_wc())
 		mmap_type = I915_MMAP_TYPE_WC;
 	else if (!i915_ggtt_has_aperture(to_gt(i915)->ggtt))
 		return -ENODEV;
@@ -813,7 +813,7 @@ i915_gem_mmap_offset_ioctl(struct drm_device *dev, void *data,
 		break;
 
 	case I915_MMAP_OFFSET_WC:
-		if (!pat_enabled())
+		if (!x86_has_pat_wc())
 			return -ENODEV;
 		type = I915_MMAP_TYPE_WC;
 		break;
@@ -823,7 +823,7 @@ i915_gem_mmap_offset_ioctl(struct drm_device *dev, void *data,
 		break;
 
 	case I915_MMAP_OFFSET_UC:
-		if (!pat_enabled())
+		if (!x86_has_pat_uc_minus())
 			return -ENODEV;
 		type = I915_MMAP_TYPE_UC;
 		break;
-- 
2.35.3
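
To illustrate why a per-mode query can give a different answer than a global "did the kernel program PAT" flag, here is a small user-space sketch that decodes a PAT MSR value into lookup tables and then performs the same round-trip check as x86_has_pat_mode(). All names (tables_from_pat_msr, has_pat_mode, the enum) are hypothetical, and the tables store plain slot indices rather than whatever the kernel tables hold. The memory type encodings (0 UC, 1 WC, 4 WT, 5 WP, 6 WB, 7 UC-) follow the Intel SDM as far as I recall; the MSR values used in the note below are examples only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's _PAGE_CACHE_MODE_* values. */
enum cache_mode { CM_WB, CM_WC, CM_UC_MINUS, CM_UC, CM_WT, CM_WP, CM_NUM };

static uint8_t slot2mode[8];		/* PAT slot -> cache mode */
static uint8_t mode2slot[CM_NUM];	/* cache mode -> PAT slot */

/* Map a 3-bit PAT memory type encoding to a cache mode. */
static enum cache_mode pat_type_to_mode(unsigned int type)
{
	switch (type) {
	case 1:  return CM_WC;
	case 4:  return CM_WT;
	case 5:  return CM_WP;
	case 6:  return CM_WB;
	case 7:  return CM_UC_MINUS;
	default: return CM_UC;	/* 0 is UC; 2 and 3 are reserved */
	}
}

/* Rebuild both tables from the current contents of the PAT MSR. */
static void tables_from_pat_msr(uint64_t pat)
{
	bool seen[CM_NUM] = { false };

	for (int mode = 0; mode < CM_NUM; mode++)
		mode2slot[mode] = 0;	/* fallback for modes without a slot */

	for (int slot = 0; slot < 8; slot++) {
		enum cache_mode mode = pat_type_to_mode((pat >> (8 * slot)) & 7);

		slot2mode[slot] = (uint8_t)mode;
		if (!seen[mode]) {
			seen[mode] = true;
			mode2slot[mode] = (uint8_t)slot;
		}
	}
}

/* Round-trip check: does the slot chosen for 'mode' really hold it? */
static bool has_pat_mode(enum cache_mode mode)
{
	assert(mode < CM_NUM);
	return slot2mode[mode2slot[mode]] == mode;
}
```

Feeding in 0x0007040600070406 (which I believe is the architectural power-on default) yields no WC and no WP slot, but UC- is present. An example value such as 0x0000050100070406, with WC in slot 4 and WP in slot 5, makes has_pat_mode() return true for WC, WP and UC-, regardless of who wrote the MSR.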


^ permalink raw reply related	[flat|nested] 80+ messages in thread

* Re: [PATCH 2/2] x86/pat: add functions to query specific cache mode availability
  2022-05-03 13:22   ` Juergen Gross
  (?)
@ 2022-05-04  8:31     ` Jan Beulich
  -1 siblings, 0 replies; 80+ messages in thread
From: Jan Beulich @ 2022-05-04  8:31 UTC (permalink / raw)
  To: Juergen Gross
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	H. Peter Anvin, Andy Lutomirski, Peter Zijlstra, Jani Nikula,
	Joonas Lahtinen, Rodrigo Vivi, Tvrtko Ursulin, David Airlie,
	Daniel Vetter, xen-devel, x86, linux-kernel, intel-gfx,
	dri-devel

On 03.05.2022 15:22, Juergen Gross wrote:
> Some drivers are using pat_enabled() in order to test availability of
> special caching modes (WC and UC-). This will lead to false negatives
> in case the system was booted e.g. with the "nopat" variant and the
> BIOS did setup the PAT MSR supporting the queried mode, or if the
> system is running as a Xen PV guest.

While, as per my earlier patch, I agree with the Xen PV case, I'm not
convinced "nopat" is supposed to honor firmware-provided settings. In
fact in my patch I did arrange for "nopat" to also take effect under
Xen PV.

> Add test functions for those caching modes instead and use them at the
> appropriate places.
> 
> For symmetry reasons export the already existing x86_has_pat_wp() for
> modules, too.
> 
> Fixes: bdd8b6c98239 ("drm/i915: replace X86_FEATURE_PAT with pat_enabled()")
> Fixes: ae749c7ab475 ("PCI: Add arch_can_pci_mmap_wc() macro")
> Signed-off-by: Juergen Gross <jgross@suse.com>

I think this wants a Reported-by as well.

> --- a/arch/x86/include/asm/pci.h
> +++ b/arch/x86/include/asm/pci.h
> @@ -94,7 +94,7 @@ int pcibios_set_irq_routing(struct pci_dev *dev, int pin, int irq);
>  
>  
>  #define HAVE_PCI_MMAP
> -#define arch_can_pci_mmap_wc()	pat_enabled()
> +#define arch_can_pci_mmap_wc()	x86_has_pat_wc()

Besides this and ...

> --- a/drivers/gpu/drm/i915/gem/i915_gem_mman.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
> @@ -76,7 +76,7 @@ i915_gem_mmap_ioctl(struct drm_device *dev, void *data,
>  	if (args->flags & ~(I915_MMAP_WC))
>  		return -EINVAL;
>  
> -	if (args->flags & I915_MMAP_WC && !pat_enabled())
> +	if (args->flags & I915_MMAP_WC && !x86_has_pat_wc())
>  		return -ENODEV;
>  
>  	obj = i915_gem_object_lookup(file, args->handle);
> @@ -757,7 +757,7 @@ i915_gem_dumb_mmap_offset(struct drm_file *file,
>  
>  	if (HAS_LMEM(to_i915(dev)))
>  		mmap_type = I915_MMAP_TYPE_FIXED;
> -	else if (pat_enabled())
> +	else if (x86_has_pat_wc())
>  		mmap_type = I915_MMAP_TYPE_WC;
>  	else if (!i915_ggtt_has_aperture(to_gt(i915)->ggtt))
>  		return -ENODEV;
> @@ -813,7 +813,7 @@ i915_gem_mmap_offset_ioctl(struct drm_device *dev, void *data,
>  		break;
>  
>  	case I915_MMAP_OFFSET_WC:
> -		if (!pat_enabled())
> +		if (!x86_has_pat_wc())
>  			return -ENODEV;
>  		type = I915_MMAP_TYPE_WC;
>  		break;
> @@ -823,7 +823,7 @@ i915_gem_mmap_offset_ioctl(struct drm_device *dev, void *data,
>  		break;
>  
>  	case I915_MMAP_OFFSET_UC:
> -		if (!pat_enabled())
> +		if (!x86_has_pat_uc_minus())
>  			return -ENODEV;
>  		type = I915_MMAP_TYPE_UC;
>  		break;

... these uses, there are several more. You say nothing about why those
should be left unaltered. When preparing my earlier patch I inspected
them and concluded that they all would better observe the adjusted
behavior as well (or else I couldn't have left pat_enabled() as the only
predicate). In fact, as said in the description of my earlier patch, my
debugging showed the use in i915_gem_object_pin_map() to be the
problematic one, and you leave that one alone.

Jan


^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH 2/2] x86/pat: add functions to query specific cache mode availability
  2022-05-04  8:31     ` Jan Beulich
  (?)
@ 2022-05-04  9:14       ` Juergen Gross
  -1 siblings, 0 replies; 80+ messages in thread
From: Juergen Gross @ 2022-05-04  9:14 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	H. Peter Anvin, Andy Lutomirski, Peter Zijlstra, Jani Nikula,
	Joonas Lahtinen, Rodrigo Vivi, Tvrtko Ursulin, David Airlie,
	Daniel Vetter, xen-devel, x86, linux-kernel, intel-gfx,
	dri-devel


On 04.05.22 10:31, Jan Beulich wrote:
> On 03.05.2022 15:22, Juergen Gross wrote:
>> Some drivers are using pat_enabled() in order to test availability of
>> special caching modes (WC and UC-). This will lead to false negatives
>> in case the system was booted e.g. with the "nopat" variant and the
>> BIOS did setup the PAT MSR supporting the queried mode, or if the
>> system is running as a Xen PV guest.
> 
> While, as per my earlier patch, I agree with the Xen PV case, I'm not
> convinced "nopat" is supposed to honor firmware-provided settings. In
> fact in my patch I did arrange for "nopat" to also take effect under
> Xen PV.

Depends on what the wanted semantics for "nopat" are.

Right now "nopat" results in the PAT MSR being left unchanged and the
cache mode translation tables being initialized accordingly.

So does "nopat" mean that the PAT MSR shouldn't be changed, or that
PAGE_BIT_PAT will never be set?

>> Add test functions for those caching modes instead and use them at the
>> appropriate places.
>>
>> For symmetry reasons export the already existing x86_has_pat_wp() for
>> modules, too.
>>
>> Fixes: bdd8b6c98239 ("drm/i915: replace X86_FEATURE_PAT with pat_enabled()")
>> Fixes: ae749c7ab475 ("PCI: Add arch_can_pci_mmap_wc() macro")
>> Signed-off-by: Juergen Gross <jgross@suse.com>
> 
> I think this wants a Reported-by as well.

Okay.

> 
>> --- a/arch/x86/include/asm/pci.h
>> +++ b/arch/x86/include/asm/pci.h
>> @@ -94,7 +94,7 @@ int pcibios_set_irq_routing(struct pci_dev *dev, int pin, int irq);
>>   
>>   
>>   #define HAVE_PCI_MMAP
>> -#define arch_can_pci_mmap_wc()	pat_enabled()
>> +#define arch_can_pci_mmap_wc()	x86_has_pat_wc()
> 
> Besides this and ...
> 
>> --- a/drivers/gpu/drm/i915/gem/i915_gem_mman.c
>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
>> @@ -76,7 +76,7 @@ i915_gem_mmap_ioctl(struct drm_device *dev, void *data,
>>   	if (args->flags & ~(I915_MMAP_WC))
>>   		return -EINVAL;
>>   
>> -	if (args->flags & I915_MMAP_WC && !pat_enabled())
>> +	if (args->flags & I915_MMAP_WC && !x86_has_pat_wc())
>>   		return -ENODEV;
>>   
>>   	obj = i915_gem_object_lookup(file, args->handle);
>> @@ -757,7 +757,7 @@ i915_gem_dumb_mmap_offset(struct drm_file *file,
>>   
>>   	if (HAS_LMEM(to_i915(dev)))
>>   		mmap_type = I915_MMAP_TYPE_FIXED;
>> -	else if (pat_enabled())
>> +	else if (x86_has_pat_wc())
>>   		mmap_type = I915_MMAP_TYPE_WC;
>>   	else if (!i915_ggtt_has_aperture(to_gt(i915)->ggtt))
>>   		return -ENODEV;
>> @@ -813,7 +813,7 @@ i915_gem_mmap_offset_ioctl(struct drm_device *dev, void *data,
>>   		break;
>>   
>>   	case I915_MMAP_OFFSET_WC:
>> -		if (!pat_enabled())
>> +		if (!x86_has_pat_wc())
>>   			return -ENODEV;
>>   		type = I915_MMAP_TYPE_WC;
>>   		break;
>> @@ -823,7 +823,7 @@ i915_gem_mmap_offset_ioctl(struct drm_device *dev, void *data,
>>   		break;
>>   
>>   	case I915_MMAP_OFFSET_UC:
>> -		if (!pat_enabled())
>> +		if (!x86_has_pat_uc_minus())
>>   			return -ENODEV;
>>   		type = I915_MMAP_TYPE_UC;
>>   		break;
> 
> ... these uses there are several more. You say nothing on why those want
> leaving unaltered. When preparing my earlier patch I did inspect them
> and came to the conclusion that these all would also better observe the
> adjusted behavior (or else I couldn't have left pat_enabled() as the only
> predicate). In fact, as said in the description of my earlier patch, in
> my debugging I did find the use in i915_gem_object_pin_map() to be the
> problematic one, which you leave alone.

Oh, I missed that one, sorry.

I wanted to be rather defensive in my changes, but I agree at least the
case in arch_phys_wc_add() might want to be changed, too.

kvm_is_mmio_pfn() should not really matter at least for the Xen case.

With the other use cases in memtype.c I'm rather on the fence.

In case the x86 maintainers think those should be changed, too, I agree
that your approach might be the better one.


Juergen

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH 2/2] x86/pat: add functions to query specific cache mode availability
  2022-05-04  9:14       ` Juergen Gross
  (?)
@ 2022-05-04  9:51         ` Jan Beulich
  -1 siblings, 0 replies; 80+ messages in thread
From: Jan Beulich @ 2022-05-04  9:51 UTC (permalink / raw)
  To: Juergen Gross
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	H. Peter Anvin, Andy Lutomirski, Peter Zijlstra, Jani Nikula,
	Joonas Lahtinen, Rodrigo Vivi, Tvrtko Ursulin, David Airlie,
	Daniel Vetter, xen-devel, x86, linux-kernel, intel-gfx,
	dri-devel

On 04.05.2022 11:14, Juergen Gross wrote:
> On 04.05.22 10:31, Jan Beulich wrote:
>> On 03.05.2022 15:22, Juergen Gross wrote:
>>> Some drivers are using pat_enabled() in order to test availability of
>>> special caching modes (WC and UC-). This will lead to false negatives
>>> in case the system was booted e.g. with the "nopat" variant and the
>>> BIOS did setup the PAT MSR supporting the queried mode, or if the
>>> system is running as a Xen PV guest.
>>
>> While, as per my earlier patch, I agree with the Xen PV case, I'm not
>> convinced "nopat" is supposed to honor firmware-provided settings. In
>> fact in my patch I did arrange for "nopat" to also take effect under
>> Xen PV.
> 
> Depends on what the wanted semantics for "nopat" are.
> 
> Right now "nopat" results in the PAT MSR being left unchanged and the
> cache mode translation tables being initialized accordingly.
> 
> So does "nopat" mean that the PAT MSR shouldn't be changed, or that
> PAGE_BIT_PAT will never be set?

According to the documentation for the option ("Disable PAT (page
attribute table extension of pagetables) support") I'd say the latter.

Jan
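
The two readings of "nopat" discussed above can be contrasted in a small
userspace sketch. This is not kernel code: the names nopat_reading,
slot_needs_pat_bit(), mode_usable() and firmware_pat are hypothetical, and
the firmware PAT layout is an assumed example. The only hardware fact relied
on is that a PAT slot index is formed as PAT << 2 | PCD << 1 | PWT, so slots
4..7 can only be selected with the PAT bit set.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-ins for the kernel's page cache modes. */
enum cache_mode { CM_WB, CM_WC, CM_UC_MINUS, CM_UC, CM_WT, CM_WP, CM_NUM };

/* Hypothetical names for the two readings of "nopat" in this thread. */
enum nopat_reading {
	NOPAT_KEEP_MSR,		/* leave the PAT MSR alone, use what it offers */
	NOPAT_NO_PAT_BIT,	/* never set the PAT bit in a page table entry */
};

/* Slot index is PAT << 2 | PCD << 1 | PWT: slots 4..7 need the PAT bit. */
static bool slot_needs_pat_bit(unsigned int slot)
{
	assert(slot < 8);
	return (slot & 4) != 0;
}

/* slot_to_mode[] describes what the unchanged PAT MSR provides. */
static bool mode_usable(const unsigned char slot_to_mode[8],
			enum cache_mode mode, enum nopat_reading reading)
{
	for (unsigned int slot = 0; slot < 8; slot++) {
		if (slot_to_mode[slot] != mode)
			continue;
		/* Under the strict reading, skip slots behind the PAT bit. */
		if (reading == NOPAT_NO_PAT_BIT && slot_needs_pat_bit(slot))
			continue;
		return true;
	}
	return false;
}

/* Assumed firmware layout: WC only in slot 4, UC- in slot 2. */
static const unsigned char firmware_pat[8] = {
	CM_WB, CM_WT, CM_UC_MINUS, CM_UC, CM_WC, CM_WP, CM_UC, CM_UC,
};
```

With this assumed layout, WC is usable under the first reading but not under
the second, while UC- stays usable under both because slot 2 does not need
the PAT bit.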


^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH 2/2] x86/pat: add functions to query specific cache mode availability
  2022-05-03 13:22   ` Juergen Gross
@ 2022-05-18 13:45     ` Christoph Hellwig
  -1 siblings, 0 replies; 80+ messages in thread
From: Christoph Hellwig @ 2022-05-18 13:45 UTC (permalink / raw)
  To: Juergen Gross
  Cc: xen-devel, x86, linux-kernel, intel-gfx, dri-devel, jbeulich,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	H. Peter Anvin, Andy Lutomirski, Peter Zijlstra, Jani Nikula,
	Joonas Lahtinen, Rodrigo Vivi, Tvrtko Ursulin, David Airlie,
	Daniel Vetter

On Tue, May 03, 2022 at 03:22:07PM +0200, Juergen Gross wrote:
> Some drivers are using pat_enabled() in order to test availability of
> special caching modes (WC and UC-). This will lead to false negatives
> in case the system was booted e.g. with the "nopat" variant and the
> BIOS did setup the PAT MSR supporting the queried mode, or if the
> system is running as a Xen PV guest.
> 
> Add test functions for those caching modes instead and use them at the
> appropriate places.
> 
> For symmetry reasons export the already existing x86_has_pat_wp() for
> modules, too.

No, we never export unused functionality.

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH 2/2] x86/pat: add functions to query specific cache mode availability
  2022-05-03 13:22   ` Juergen Gross
@ 2022-05-20  2:15     ` Chuck Zmudzinski
  -1 siblings, 0 replies; 80+ messages in thread
From: Chuck Zmudzinski @ 2022-05-20  2:15 UTC (permalink / raw)
  To: Juergen Gross, xen-devel, x86, linux-kernel, intel-gfx, dri-devel
  Cc: jbeulich, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, H. Peter Anvin, Andy Lutomirski, Peter Zijlstra,
	Jani Nikula, Joonas Lahtinen, Rodrigo Vivi, Tvrtko Ursulin,
	David Airlie, Daniel Vetter, pkg-xen-devel

On 5/3/22 9:22 AM, Juergen Gross wrote:
> Some drivers are using pat_enabled() in order to test availability of
> special caching modes (WC and UC-). This will lead to false negatives
> in case the system was booted e.g. with the "nopat" variant and the
> BIOS did setup the PAT MSR supporting the queried mode, or if the
> system is running as a Xen PV guest.

Hello,

I am also getting a false negative in a Xen Dom0 from
pat_enabled() where bdd8b6c98239 patched the file

drivers/gpu/drm/i915/gem/i915_gem_pages.c

I think this patch also needs to touch that file to
fix the issue I am seeing.

Ever since bdd8b6c98239 was committed, I get the
following in the logs when running as a Dom0 on my
Haswell processor, including with the untainted
official Debian build of 5.17.6:

May 15 06:31:59 debian kernel: [    3.721146] i915 0000:00:02.0: [drm:add_taint_for_CI [i915]] CI tainted:0x9 by intel_gt_init+0xb6/0x2e0 [i915]

This causes the system to hang with the backlight on.
The only way to recover is to hit the reset button and
reboot either Linux 5.16 or earlier as Dom0 on Xen, or
Linux 5.17 without Xen.

I was able to fix it with a kernel that fixes the
false negative I was getting in
drivers/gpu/drm/i915/gem/i915_gem_pages.c

Can the patch also touch that file, replacing
pat_enabled() with x86_has_pat_wc() in the
place where bdd8b6c98239 patched that file?

Thanks,

Chuck Zmudzinski

>
> Add test functions for those caching modes instead and use them at the
> appropriate places.
>
> For symmetry reasons export the already existing x86_has_pat_wp() for
> modules, too.
>
> Fixes: bdd8b6c98239 ("drm/i915: replace X86_FEATURE_PAT with pat_enabled()")
> Fixes: ae749c7ab475 ("PCI: Add arch_can_pci_mmap_wc() macro")
> Signed-off-by: Juergen Gross <jgross@suse.com>
> ---
>   arch/x86/include/asm/memtype.h           |  2 ++
>   arch/x86/include/asm/pci.h               |  2 +-
>   arch/x86/mm/init.c                       | 25 +++++++++++++++++++++---
>   drivers/gpu/drm/i915/gem/i915_gem_mman.c |  8 ++++----
>   4 files changed, 29 insertions(+), 8 deletions(-)
>
> diff --git a/arch/x86/include/asm/memtype.h b/arch/x86/include/asm/memtype.h
> index 9ca760e430b9..d00e0be854d4 100644
> --- a/arch/x86/include/asm/memtype.h
> +++ b/arch/x86/include/asm/memtype.h
> @@ -25,6 +25,8 @@ extern void memtype_free_io(resource_size_t start, resource_size_t end);
>   extern bool pat_pfn_immune_to_uc_mtrr(unsigned long pfn);
>   
>   bool x86_has_pat_wp(void);
> +bool x86_has_pat_wc(void);
> +bool x86_has_pat_uc_minus(void);
>   enum page_cache_mode pgprot2cachemode(pgprot_t pgprot);
>   
>   #endif /* _ASM_X86_MEMTYPE_H */
> diff --git a/arch/x86/include/asm/pci.h b/arch/x86/include/asm/pci.h
> index f3fd5928bcbb..a5742268dec1 100644
> --- a/arch/x86/include/asm/pci.h
> +++ b/arch/x86/include/asm/pci.h
> @@ -94,7 +94,7 @@ int pcibios_set_irq_routing(struct pci_dev *dev, int pin, int irq);
>   
>   
>   #define HAVE_PCI_MMAP
> -#define arch_can_pci_mmap_wc()	pat_enabled()
> +#define arch_can_pci_mmap_wc()	x86_has_pat_wc()
>   #define ARCH_GENERIC_PCI_MMAP_RESOURCE
>   
>   #ifdef CONFIG_PCI
> diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
> index 71e182ebced3..b6431f714dc2 100644
> --- a/arch/x86/mm/init.c
> +++ b/arch/x86/mm/init.c
> @@ -77,12 +77,31 @@ static uint8_t __pte2cachemode_tbl[8] = {
>   	[__pte2cm_idx(_PAGE_PWT | _PAGE_PCD | _PAGE_PAT)] = _PAGE_CACHE_MODE_UC,
>   };
>   
> -/* Check that the write-protect PAT entry is set for write-protect */
> +static bool x86_has_pat_mode(unsigned int mode)
> +{
> +	return __pte2cachemode_tbl[__cachemode2pte_tbl[mode]] == mode;
> +}
> +
> +/* Check that PAT supports write-protect */
>   bool x86_has_pat_wp(void)
>   {
> -	return __pte2cachemode_tbl[__cachemode2pte_tbl[_PAGE_CACHE_MODE_WP]] ==
> -	       _PAGE_CACHE_MODE_WP;
> +	return x86_has_pat_mode(_PAGE_CACHE_MODE_WP);
> +}
> +EXPORT_SYMBOL_GPL(x86_has_pat_wp);
> +
> +/* Check that PAT supports WC */
> +bool x86_has_pat_wc(void)
> +{
> +	return x86_has_pat_mode(_PAGE_CACHE_MODE_WC);
> +}
> +EXPORT_SYMBOL_GPL(x86_has_pat_wc);
> +
> +/* Check that PAT supports UC- */
> +bool x86_has_pat_uc_minus(void)
> +{
> +	return x86_has_pat_mode(_PAGE_CACHE_MODE_UC_MINUS);
>   }
> +EXPORT_SYMBOL_GPL(x86_has_pat_uc_minus);
>   
>   enum page_cache_mode pgprot2cachemode(pgprot_t pgprot)
>   {
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_mman.c b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
> index 0c5c43852e24..f43ecf3f63eb 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_mman.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
> @@ -76,7 +76,7 @@ i915_gem_mmap_ioctl(struct drm_device *dev, void *data,
>   	if (args->flags & ~(I915_MMAP_WC))
>   		return -EINVAL;
>   
> -	if (args->flags & I915_MMAP_WC && !pat_enabled())
> +	if (args->flags & I915_MMAP_WC && !x86_has_pat_wc())
>   		return -ENODEV;
>   
>   	obj = i915_gem_object_lookup(file, args->handle);
> @@ -757,7 +757,7 @@ i915_gem_dumb_mmap_offset(struct drm_file *file,
>   
>   	if (HAS_LMEM(to_i915(dev)))
>   		mmap_type = I915_MMAP_TYPE_FIXED;
> -	else if (pat_enabled())
> +	else if (x86_has_pat_wc())
>   		mmap_type = I915_MMAP_TYPE_WC;
>   	else if (!i915_ggtt_has_aperture(to_gt(i915)->ggtt))
>   		return -ENODEV;
> @@ -813,7 +813,7 @@ i915_gem_mmap_offset_ioctl(struct drm_device *dev, void *data,
>   		break;
>   
>   	case I915_MMAP_OFFSET_WC:
> -		if (!pat_enabled())
> +		if (!x86_has_pat_wc())
>   			return -ENODEV;
>   		type = I915_MMAP_TYPE_WC;
>   		break;
> @@ -823,7 +823,7 @@ i915_gem_mmap_offset_ioctl(struct drm_device *dev, void *data,
>   		break;
>   
>   	case I915_MMAP_OFFSET_UC:
> -		if (!pat_enabled())
> +		if (!x86_has_pat_uc_minus())
>   			return -ENODEV;
>   		type = I915_MMAP_TYPE_UC;
>   		break;
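
The round-trip test in x86_has_pat_mode() quoted above can be tried outside
the kernel with a minimal model. This is only a sketch under stated
assumptions: the kernel tables hold PTE flag bits, whereas the hypothetical
struct pat_model below uses plain PAT slot indices, and both table layouts
(pat_no_wc, pat_with_wc) are made-up examples rather than values read from
real hardware or from Xen.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-ins for the kernel's enum page_cache_mode values. */
enum cache_mode { CM_WB, CM_WC, CM_UC_MINUS, CM_UC, CM_WT, CM_WP, CM_NUM };

/*
 * Hypothetical model: the kernel tables store PTE flag bits; here both
 * directions use a plain PAT slot index 0..7 to keep the sketch short.
 */
struct pat_model {
	unsigned char mode_to_slot[CM_NUM];	/* slot chosen for each mode */
	unsigned char slot_to_mode[8];		/* mode each slot really yields */
};

/* Round trip: a mode is available only if its slot maps back to it. */
static bool model_has_mode(const struct pat_model *p, enum cache_mode m)
{
	assert(m < CM_NUM);
	return p->slot_to_mode[p->mode_to_slot[m]] == m;
}

/* Assumed layout without WC/WP slots: those modes fall back to UC-. */
static const struct pat_model pat_no_wc = {
	.mode_to_slot = { [CM_WB] = 0, [CM_WC] = 2, [CM_UC_MINUS] = 2,
			  [CM_UC] = 3, [CM_WT] = 1, [CM_WP] = 2 },
	.slot_to_mode = { CM_WB, CM_WT, CM_UC_MINUS, CM_UC,
			  CM_WB, CM_WT, CM_UC_MINUS, CM_UC },
};

/* Assumed layout with WC in slot 4 and WP in slot 5. */
static const struct pat_model pat_with_wc = {
	.mode_to_slot = { [CM_WB] = 0, [CM_WC] = 4, [CM_UC_MINUS] = 2,
			  [CM_UC] = 3, [CM_WT] = 1, [CM_WP] = 5 },
	.slot_to_mode = { CM_WB, CM_WT, CM_UC_MINUS, CM_UC,
			  CM_WC, CM_WP, CM_UC, CM_UC },
};
```

In the first layout a WC request lands on a slot that yields UC-, so the
round trip fails and the mode is reported as unavailable. In the second
layout the same request maps back to WC. The check depends only on the
table contents, not on whether the kernel programmed the PAT MSR itself,
which is the distinction from pat_enabled() that the patch description
draws.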


^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH 2/2] x86/pat: add functions to query specific cache mode availability
@ 2022-05-20  2:15     ` Chuck Zmudzinski
  0 siblings, 0 replies; 80+ messages in thread
From: Chuck Zmudzinski @ 2022-05-20  2:15 UTC (permalink / raw)
  To: Juergen Gross, xen-devel, x86, linux-kernel, intel-gfx, dri-devel
  Cc: Tvrtko Ursulin, Andy Lutomirski, pkg-xen-devel, Peter Zijlstra,
	Dave Hansen, David Airlie, Rodrigo Vivi, Ingo Molnar,
	Borislav Petkov, jbeulich, H. Peter Anvin, Thomas Gleixner

On 5/3/22 9:22 AM, Juergen Gross wrote:
> Some drivers are using pat_enabled() in order to test availability of
> special caching modes (WC and UC-). This will lead to false negatives
> in case the system was booted e.g. with the "nopat" variant and the
> BIOS did setup the PAT MSR supporting the queried mode, or if the
> system is running as a Xen PV guest.
Hello,

I am also getting a false positive in a Xen Dom0 from
pat_enabled() where bdd8b6c98239 patched the file

drivers/gpu/drm/i915/gem/i915_gem_pages.c

I think this patch also needs to touch that file to
fix the issue I am seeing.

Ever since bdd8b6c98239 was committed, I get the
following in the logs when running as a Dom0 on my
Haswell processor, including with the untainted
official Debian build of 5.17.6:

May 15 06:31:59 debian kernel: [    3.721146] i915 0000:00:02.0: 
[drm:add_taint_for_CI [i915]] CI tainted:0x9 by intel_gt_init+0xb6/0x2e0 
[i915]

This causes the system to hang with the backlight on.
The only recovery is by hitting the reset button and
rebooting Linux Dom0 on Xen with Linux version 5.16
or earlier, or by rebooting Linux version 5.17 without
Xen.

I was able to fix it with a kernel that fixes the
false negative I was getting in
drivers/gpu/drm/i915/gem/i915_gem_pages.c

Can the patch also touch that file, replacing
pat_enabled() with x86_has_pat_wc() in the
place where bdd8b6c98239 patched that file?

Thanks,

Chuck Zmudzinski

>
> Add test functions for those caching modes instead and use them at the
> appropriate places.
>
> For symmetry reasons export the already existing x86_has_pat_wp() for
> modules, too.
>
> Fixes: bdd8b6c98239 ("drm/i915: replace X86_FEATURE_PAT with pat_enabled()")
> Fixes: ae749c7ab475 ("PCI: Add arch_can_pci_mmap_wc() macro")
> Signed-off-by: Juergen Gross <jgross@suse.com>
> ---
>   arch/x86/include/asm/memtype.h           |  2 ++
>   arch/x86/include/asm/pci.h               |  2 +-
>   arch/x86/mm/init.c                       | 25 +++++++++++++++++++++---
>   drivers/gpu/drm/i915/gem/i915_gem_mman.c |  8 ++++----
>   4 files changed, 29 insertions(+), 8 deletions(-)
>
> diff --git a/arch/x86/include/asm/memtype.h b/arch/x86/include/asm/memtype.h
> index 9ca760e430b9..d00e0be854d4 100644
> --- a/arch/x86/include/asm/memtype.h
> +++ b/arch/x86/include/asm/memtype.h
> @@ -25,6 +25,8 @@ extern void memtype_free_io(resource_size_t start, resource_size_t end);
>   extern bool pat_pfn_immune_to_uc_mtrr(unsigned long pfn);
>   
>   bool x86_has_pat_wp(void);
> +bool x86_has_pat_wc(void);
> +bool x86_has_pat_uc_minus(void);
>   enum page_cache_mode pgprot2cachemode(pgprot_t pgprot);
>   
>   #endif /* _ASM_X86_MEMTYPE_H */
> diff --git a/arch/x86/include/asm/pci.h b/arch/x86/include/asm/pci.h
> index f3fd5928bcbb..a5742268dec1 100644
> --- a/arch/x86/include/asm/pci.h
> +++ b/arch/x86/include/asm/pci.h
> @@ -94,7 +94,7 @@ int pcibios_set_irq_routing(struct pci_dev *dev, int pin, int irq);
>   
>   
>   #define HAVE_PCI_MMAP
> -#define arch_can_pci_mmap_wc()	pat_enabled()
> +#define arch_can_pci_mmap_wc()	x86_has_pat_wc()
>   #define ARCH_GENERIC_PCI_MMAP_RESOURCE
>   
>   #ifdef CONFIG_PCI
> diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
> index 71e182ebced3..b6431f714dc2 100644
> --- a/arch/x86/mm/init.c
> +++ b/arch/x86/mm/init.c
> @@ -77,12 +77,31 @@ static uint8_t __pte2cachemode_tbl[8] = {
>   	[__pte2cm_idx(_PAGE_PWT | _PAGE_PCD | _PAGE_PAT)] = _PAGE_CACHE_MODE_UC,
>   };
>   
> -/* Check that the write-protect PAT entry is set for write-protect */
> +static bool x86_has_pat_mode(unsigned int mode)
> +{
> +	return __pte2cachemode_tbl[__cachemode2pte_tbl[mode]] == mode;
> +}
> +
> +/* Check that PAT supports write-protect */
>   bool x86_has_pat_wp(void)
>   {
> -	return __pte2cachemode_tbl[__cachemode2pte_tbl[_PAGE_CACHE_MODE_WP]] ==
> -	       _PAGE_CACHE_MODE_WP;
> +	return x86_has_pat_mode(_PAGE_CACHE_MODE_WP);
> +}
> +EXPORT_SYMBOL_GPL(x86_has_pat_wp);
> +
> +/* Check that PAT supports WC */
> +bool x86_has_pat_wc(void)
> +{
> +	return x86_has_pat_mode(_PAGE_CACHE_MODE_WC);
> +}
> +EXPORT_SYMBOL_GPL(x86_has_pat_wc);
> +
> +/* Check that PAT supports UC- */
> +bool x86_has_pat_uc_minus(void)
> +{
> +	return x86_has_pat_mode(_PAGE_CACHE_MODE_UC_MINUS);
>   }
> +EXPORT_SYMBOL_GPL(x86_has_pat_uc_minus);
>   
>   enum page_cache_mode pgprot2cachemode(pgprot_t pgprot)
>   {
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_mman.c b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
> index 0c5c43852e24..f43ecf3f63eb 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_mman.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
> @@ -76,7 +76,7 @@ i915_gem_mmap_ioctl(struct drm_device *dev, void *data,
>   	if (args->flags & ~(I915_MMAP_WC))
>   		return -EINVAL;
>   
> -	if (args->flags & I915_MMAP_WC && !pat_enabled())
> +	if (args->flags & I915_MMAP_WC && !x86_has_pat_wc())
>   		return -ENODEV;
>   
>   	obj = i915_gem_object_lookup(file, args->handle);
> @@ -757,7 +757,7 @@ i915_gem_dumb_mmap_offset(struct drm_file *file,
>   
>   	if (HAS_LMEM(to_i915(dev)))
>   		mmap_type = I915_MMAP_TYPE_FIXED;
> -	else if (pat_enabled())
> +	else if (x86_has_pat_wc())
>   		mmap_type = I915_MMAP_TYPE_WC;
>   	else if (!i915_ggtt_has_aperture(to_gt(i915)->ggtt))
>   		return -ENODEV;
> @@ -813,7 +813,7 @@ i915_gem_mmap_offset_ioctl(struct drm_device *dev, void *data,
>   		break;
>   
>   	case I915_MMAP_OFFSET_WC:
> -		if (!pat_enabled())
> +		if (!x86_has_pat_wc())
>   			return -ENODEV;
>   		type = I915_MMAP_TYPE_WC;
>   		break;
> @@ -823,7 +823,7 @@ i915_gem_mmap_offset_ioctl(struct drm_device *dev, void *data,
>   		break;
>   
>   	case I915_MMAP_OFFSET_UC:
> -		if (!pat_enabled())
> +		if (!x86_has_pat_uc_minus())
>   			return -ENODEV;
>   		type = I915_MMAP_TYPE_UC;
>   		break;
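
The round-trip test in x86_has_pat_mode() above can be tried outside the kernel. What follows is only a minimal sketch under stated assumptions: the enum, both tables and has_mode() are hypothetical stand-ins, and the tables are indexed by PAT slot number rather than by PTE bits. It is not the kernel's data.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative cache modes; the names are stand-ins, not the kernel enum. */
enum cache_mode { CM_WB, CM_WC, CM_UC_MINUS, CM_UC, CM_WT, CM_WP, CM_NUM };

/*
 * Forward table: which PAT slot (0..7) would be picked for a mode.
 * Hypothetical fallback layout: WC and WP share slots with UC- and UC.
 */
static const uint8_t mode_to_slot[CM_NUM] = {
	[CM_WB] = 0, [CM_WC] = 2, [CM_UC_MINUS] = 2,
	[CM_UC] = 3, [CM_WT] = 1, [CM_WP] = 3,
};

/* Reverse table: which mode each PAT slot really provides. */
static const uint8_t slot_to_mode[8] = {
	CM_WB, CM_WT, CM_UC_MINUS, CM_UC,
	CM_WB, CM_WT, CM_UC_MINUS, CM_UC,
};

/* A mode is available only if the round trip lands on the same mode. */
static bool has_mode(unsigned int mode)
{
	return slot_to_mode[mode_to_slot[mode]] == mode;
}
```

With this layout has_mode(CM_WC) and has_mode(CM_WP) are false because their slots resolve to UC- and UC, while WB, WT, UC- and UC all pass. The result depends only on what the tables say, not on whether the kernel itself programmed the PAT MSR, which is the point of the new helpers.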



* Re: [PATCH 2/2] x86/pat: add functions to query specific cache mode availability
  2022-05-20  2:15     ` Chuck Zmudzinski
@ 2022-05-20  2:21       ` Chuck Zmudzinski
  -1 siblings, 0 replies; 80+ messages in thread
From: Chuck Zmudzinski @ 2022-05-20  2:21 UTC (permalink / raw)
  To: Juergen Gross, xen-devel, x86, linux-kernel, intel-gfx, dri-devel
  Cc: jbeulich, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, H. Peter Anvin, Andy Lutomirski, Peter Zijlstra,
	Jani Nikula, Joonas Lahtinen, Rodrigo Vivi, Tvrtko Ursulin,
	David Airlie, Daniel Vetter, pkg-xen-devel

On 5/19/22 10:15 PM, Chuck Zmudzinski wrote:
> On 5/3/22 9:22 AM, Juergen Gross wrote:
>> Some drivers are using pat_enabled() in order to test availability of
>> special caching modes (WC and UC-). This will lead to false negatives
>> in case the system was booted e.g. with the "nopat" variant and the
>> BIOS did setup the PAT MSR supporting the queried mode, or if the
>> system is running as a Xen PV guest.
> Hello,
>
> I am also getting a false positive

Sorry, I meant false negative here, not false
positive.

Chuck

> in a Xen Dom0 from
> pat_enabled() where bdd8b6c98239 patched the file
>
> drivers/gpu/drm/i915/gem/i915_gem_pages.c
>
> I think this patch also needs to touch that file to
> fix the issue I am seeing.
...



* Re: [PATCH 2/2] x86/pat: add functions to query specific cache mode availability
  2022-05-04  9:14       ` Juergen Gross
@ 2022-05-20  4:43         ` Chuck Zmudzinski
  -1 siblings, 0 replies; 80+ messages in thread
From: Chuck Zmudzinski @ 2022-05-20  4:43 UTC (permalink / raw)
  To: Juergen Gross, Jan Beulich
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	H. Peter Anvin, Andy Lutomirski, Peter Zijlstra, Jani Nikula,
	Joonas Lahtinen, Rodrigo Vivi, Tvrtko Ursulin, David Airlie,
	Daniel Vetter, xen-devel, x86, linux-kernel, intel-gfx,
	dri-devel

On 5/4/22 5:14 AM, Juergen Gross wrote:
> On 04.05.22 10:31, Jan Beulich wrote:
>> On 03.05.2022 15:22, Juergen Gross wrote:
>>> Some drivers are using pat_enabled() in order to test availability of
>>> special caching modes (WC and UC-). This will lead to false negatives
>>> in case the system was booted e.g. with the "nopat" variant and the
>>> BIOS did setup the PAT MSR supporting the queried mode, or if the
>>> system is running as a Xen PV guest.
>> ...
>>> Add test functions for those caching modes instead and use them at the
>>> appropriate places.
>>>
>>> Fixes: bdd8b6c98239 ("drm/i915: replace X86_FEATURE_PAT with pat_enabled()")
>>> Fixes: ae749c7ab475 ("PCI: Add arch_can_pci_mmap_wc() macro")
>>> Signed-off-by: Juergen Gross <jgross@suse.com>
>> ...
>>
>>> --- a/arch/x86/include/asm/pci.h
>>> +++ b/arch/x86/include/asm/pci.h
>>> @@ -94,7 +94,7 @@ int pcibios_set_irq_routing(struct pci_dev *dev, int pin, int irq);
>>>       #define HAVE_PCI_MMAP
>>> -#define arch_can_pci_mmap_wc()    pat_enabled()
>>> +#define arch_can_pci_mmap_wc()    x86_has_pat_wc()
>>
>> Besides this and ...
>>
>>> --- a/drivers/gpu/drm/i915/gem/i915_gem_mman.c
>>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
>>> @@ -76,7 +76,7 @@ i915_gem_mmap_ioctl(struct drm_device *dev, void *data,
>>>       if (args->flags & ~(I915_MMAP_WC))
>>>           return -EINVAL;
>>> -    if (args->flags & I915_MMAP_WC && !pat_enabled())
>>> +    if (args->flags & I915_MMAP_WC && !x86_has_pat_wc())
>>>           return -ENODEV;
>>>         obj = i915_gem_object_lookup(file, args->handle);
>>> @@ -757,7 +757,7 @@ i915_gem_dumb_mmap_offset(struct drm_file *file,
>>>         if (HAS_LMEM(to_i915(dev)))
>>>           mmap_type = I915_MMAP_TYPE_FIXED;
>>> -    else if (pat_enabled())
>>> +    else if (x86_has_pat_wc())
>>>           mmap_type = I915_MMAP_TYPE_WC;
>>>       else if (!i915_ggtt_has_aperture(to_gt(i915)->ggtt))
>>>           return -ENODEV;
>>> @@ -813,7 +813,7 @@ i915_gem_mmap_offset_ioctl(struct drm_device *dev, void *data,
>>>           break;
>>>         case I915_MMAP_OFFSET_WC:
>>> -        if (!pat_enabled())
>>> +        if (!x86_has_pat_wc())
>>>               return -ENODEV;
>>>           type = I915_MMAP_TYPE_WC;
>>>           break;
>>> @@ -823,7 +823,7 @@ i915_gem_mmap_offset_ioctl(struct drm_device *dev, void *data,
>>>           break;
>>>         case I915_MMAP_OFFSET_UC:
>>> -        if (!pat_enabled())
>>> +        if (!x86_has_pat_uc_minus())
>>>               return -ENODEV;
>>>           type = I915_MMAP_TYPE_UC;
>>>           break;
>>
>> ... these uses there are several more. You say nothing on why those want
>> leaving unaltered. When preparing my earlier patch I did inspect them
>> and came to the conclusion that these all would also better observe the
>> adjusted behavior (or else I couldn't have left pat_enabled() as the only
>> predicate). In fact, as said in the description of my earlier patch, in
>> my debugging I did find the use in i915_gem_object_pin_map() to be the
>> problematic one, which you leave alone.
>
> Oh, I missed that one, sorry.

That is why your patch would not fix my Haswell unless
it also touches i915_gem_object_pin_map() in
drivers/gpu/drm/i915/gem/i915_gem_pages.c

>
> I wanted to be rather defensive in my changes, but I agree at least the
> case in arch_phys_wc_add() might want to be changed, too.

I think your approach needs to be more aggressive so it will fix
all the known false negatives introduced by bdd8b6c98239
such as the one in i915_gem_object_pin_map().

I looked at Jan's approach and I think it would fix the issue
with my Haswell as long as I don't use the nopat option. I
really don't have a strong opinion on that question, but I
think the nopat option as a Linux kernel option, as opposed
to a hypervisor option, should only affect the kernel, and
if the hypervisor provides the pat feature, then the kernel
should not override that, but because of the confusion, maybe
a warning could be printed with the nopat option when a
hypervisor provides the feature so the user can at least have a
knob to tweak if it does not behave the way the user intends.
But I must admit, I don't know if the Xen hypervisor also has an
option to disable pat. If not, then maybe Jan's more
aggressive approach with nopat might be needed if for
some reason pat really needs to be disabled in the kernel
when Linux is running on Xen or another hypervisor, but I don't
know of any cases when that would be needed.

Chuck


* Re: [PATCH 2/2] x86/pat: add functions to query specific cache mode availability
  2022-05-20  4:43         ` Chuck Zmudzinski
@ 2022-05-20  5:56           ` Chuck Zmudzinski
  -1 siblings, 0 replies; 80+ messages in thread
From: Chuck Zmudzinski @ 2022-05-20  5:56 UTC (permalink / raw)
  To: Juergen Gross, Jan Beulich
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	H. Peter Anvin, Andy Lutomirski, Peter Zijlstra, Jani Nikula,
	Joonas Lahtinen, Rodrigo Vivi, Tvrtko Ursulin, David Airlie,
	Daniel Vetter, xen-devel, x86, linux-kernel, intel-gfx,
	dri-devel

On 5/20/22 12:43 AM, Chuck Zmudzinski wrote:
> On 5/4/22 5:14 AM, Juergen Gross wrote:
>> On 04.05.22 10:31, Jan Beulich wrote:
>>> On 03.05.2022 15:22, Juergen Gross wrote:
>>>> Some drivers are using pat_enabled() in order to test availability of
>>>> special caching modes (WC and UC-). This will lead to false negatives
>>>> in case the system was booted e.g. with the "nopat" variant and the
>>>> BIOS did setup the PAT MSR supporting the queried mode, or if the
>>>> system is running as a Xen PV guest.
>>> ...
>>>> Add test functions for those caching modes instead and use them at the
>>>> appropriate places.
>>>>
>>>> Fixes: bdd8b6c98239 ("drm/i915: replace X86_FEATURE_PAT with pat_enabled()")
>>>> Fixes: ae749c7ab475 ("PCI: Add arch_can_pci_mmap_wc() macro")
>>>> Signed-off-by: Juergen Gross <jgross@suse.com>
>>> ...
>>>
>>>
>>> ... these uses there are several more. You say nothing on why those want
>>> leaving unaltered. When preparing my earlier patch I did inspect them
>>> and came to the conclusion that these all would also better observe the
>>> adjusted behavior (or else I couldn't have left pat_enabled() as the only
>>> predicate). In fact, as said in the description of my earlier patch, in
>>> my debugging I did find the use in i915_gem_object_pin_map() to be the
>>> problematic one, which you leave alone.
>>
>> Oh, I missed that one, sorry.
>
> That is why your patch would not fix my Haswell unless
> it also touches i915_gem_object_pin_map() in
> drivers/gpu/drm/i915/gem/i915_gem_pages.c
>
>>
>> I wanted to be rather defensive in my changes, but I agree at least the
>> case in arch_phys_wc_add() might want to be changed, too.
>
> I think your approach needs to be more aggressive so it will fix
> all the known false negatives introduced by bdd8b6c98239
> such as the one in i915_gem_object_pin_map().
>
> I looked at Jan's approach and I think it would fix the issue
> with my Haswell as long as I don't use the nopat option. I
> really don't have a strong opinion on that question, but I
> think the nopat option as a Linux kernel option, as opposed
> to a hypervisor option, should only affect the kernel, and
> if the hypervisor provides the pat feature, then the kernel
> should not override that, but because of the confusion,

The confusion is: does "nopat" only mean the kernel does not
provide pat to device drivers, or does it mean kernel drivers
are not to use pat even if the hypervisor provides it?
I think the original purpose of bdd8b6c98239 was to
enable "nopat" to disable the use of pat in the i915 driver
even if the feature is present from either the kernel or the
hypervisor. This interpretation of the meaning of "nopat"
would favor Jan's approach, I think.
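
To make the two readings of "nopat" concrete, here is a small sketch. Everything in it is hypothetical: struct pat_facts and both helpers are names made up for illustration, not kernel interfaces, and each helper only encodes one of the two interpretations described above.

```c
#include <stdbool.h>

/* Hypothetical snapshot of the two facts under discussion. */
struct pat_facts {
	bool nopat_given;   /* "nopat" was on the kernel command line */
	bool wc_provided;   /* firmware or hypervisor left a usable WC entry */
};

/* Reading 1: "nopat" only stops the kernel from setting up PAT itself. */
static bool wc_allowed_reading1(const struct pat_facts *f)
{
	return f->wc_provided;
}

/* Reading 2: "nopat" also forbids drivers from using PAT modes. */
static bool wc_allowed_reading2(const struct pat_facts *f)
{
	return f->wc_provided && !f->nopat_given;
}
```

The two readings differ only when "nopat" is given while a WC entry is still provided, for example by the hypervisor: reading 1 lets the driver map WC, reading 2 does not.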

Chuck


* Re: [PATCH 2/2] x86/pat: add functions to query specific cache mode availability
  2022-05-20  4:43         ` Chuck Zmudzinski
@ 2022-05-20  6:05           ` Jan Beulich
  -1 siblings, 0 replies; 80+ messages in thread
From: Jan Beulich @ 2022-05-20  6:05 UTC (permalink / raw)
  To: Chuck Zmudzinski
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	H. Peter Anvin, Andy Lutomirski, Peter Zijlstra, Jani Nikula,
	Joonas Lahtinen, Rodrigo Vivi, Tvrtko Ursulin, David Airlie,
	Daniel Vetter, xen-devel, x86, linux-kernel, intel-gfx,
	dri-devel, Juergen Gross

On 20.05.2022 06:43, Chuck Zmudzinski wrote:
> On 5/4/22 5:14 AM, Juergen Gross wrote:
>> On 04.05.22 10:31, Jan Beulich wrote:
>>> On 03.05.2022 15:22, Juergen Gross wrote:
>>>> Some drivers are using pat_enabled() in order to test availability of
>>>> special caching modes (WC and UC-). This will lead to false negatives
>>>> in case the system was booted e.g. with the "nopat" variant and the
>>>> BIOS did setup the PAT MSR supporting the queried mode, or if the
>>>> system is running as a Xen PV guest.
>>> ...
>>>> Add test functions for those caching modes instead and use them at the
>>>> appropriate places.
>>>>
>>>> Fixes: bdd8b6c98239 ("drm/i915: replace X86_FEATURE_PAT with pat_enabled()")
>>>> Fixes: ae749c7ab475 ("PCI: Add arch_can_pci_mmap_wc() macro")
>>>> Signed-off-by: Juergen Gross <jgross@suse.com>
>>> ...
>>>
>>>> --- a/arch/x86/include/asm/pci.h
>>>> +++ b/arch/x86/include/asm/pci.h
>>>> @@ -94,7 +94,7 @@ int pcibios_set_irq_routing(struct pci_dev *dev, int pin, int irq);
>>>>       #define HAVE_PCI_MMAP
>>>> -#define arch_can_pci_mmap_wc()    pat_enabled()
>>>> +#define arch_can_pci_mmap_wc()    x86_has_pat_wc()
>>>
>>> Besides this and ...
>>>
>>>> --- a/drivers/gpu/drm/i915/gem/i915_gem_mman.c
>>>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
>>>> @@ -76,7 +76,7 @@ i915_gem_mmap_ioctl(struct drm_device *dev, void *data,
>>>>       if (args->flags & ~(I915_MMAP_WC))
>>>>           return -EINVAL;
>>>> -    if (args->flags & I915_MMAP_WC && !pat_enabled())
>>>> +    if (args->flags & I915_MMAP_WC && !x86_has_pat_wc())
>>>>           return -ENODEV;
>>>>         obj = i915_gem_object_lookup(file, args->handle);
>>>> @@ -757,7 +757,7 @@ i915_gem_dumb_mmap_offset(struct drm_file *file,
>>>>         if (HAS_LMEM(to_i915(dev)))
>>>>           mmap_type = I915_MMAP_TYPE_FIXED;
>>>> -    else if (pat_enabled())
>>>> +    else if (x86_has_pat_wc())
>>>>           mmap_type = I915_MMAP_TYPE_WC;
>>>>       else if (!i915_ggtt_has_aperture(to_gt(i915)->ggtt))
>>>>           return -ENODEV;
>>>> @@ -813,7 +813,7 @@ i915_gem_mmap_offset_ioctl(struct drm_device *dev, void *data,
>>>>           break;
>>>>         case I915_MMAP_OFFSET_WC:
>>>> -        if (!pat_enabled())
>>>> +        if (!x86_has_pat_wc())
>>>>               return -ENODEV;
>>>>           type = I915_MMAP_TYPE_WC;
>>>>           break;
>>>> @@ -823,7 +823,7 @@ i915_gem_mmap_offset_ioctl(struct drm_device *dev, void *data,
>>>>           break;
>>>>         case I915_MMAP_OFFSET_UC:
>>>> -        if (!pat_enabled())
>>>> +        if (!x86_has_pat_uc_minus())
>>>>               return -ENODEV;
>>>>           type = I915_MMAP_TYPE_UC;
>>>>           break;
>>>
>>> ... these uses there are several more. You say nothing on why those want
>>> leaving unaltered. When preparing my earlier patch I did inspect them
>>> and came to the conclusion that these all would also better observe the
>>> adjusted behavior (or else I couldn't have left pat_enabled() as the only
>>> predicate). In fact, as said in the description of my earlier patch, in
>>> my debugging I did find the use in i915_gem_object_pin_map() to be the
>>> problematic one, which you leave alone.
>>
>> Oh, I missed that one, sorry.
> 
> That is why your patch would not fix my Haswell unless
> it also touches i915_gem_object_pin_map() in
> drivers/gpu/drm/i915/gem/i915_gem_pages.c
> 
>>
>> I wanted to be rather defensive in my changes, but I agree at least the
>> case in arch_phys_wc_add() might want to be changed, too.
> 
> I think your approach needs to be more aggressive so it will fix
> all the known false negatives introduced by bdd8b6c98239
> such as the one in i915_gem_object_pin_map().
> 
> I looked at Jan's approach and I think it would fix the issue
> with my Haswell as long as I don't use the nopat option. I
> really don't have a strong opinion on that question, but I
> think the nopat option as a Linux kernel option, as opposed
> to a hypervisor option, should only affect the kernel, and
> if the hypervisor provides the pat feature, then the kernel
> should not override that,

Hmm, why would the kernel not be allowed to override that? Such
an override would affect only the single domain where the
kernel runs; other domains could take their own decisions.

Also, for the sake of completeness: "nopat" used when running on
bare metal has the same bad effect on system boot, so there
pretty clearly is an error cleanup issue in the i915 driver. But
that's orthogonal, and I expect the maintainers may not even care
(but tell us "don't do that then").

Jan

> but because of the confusion, maybe
> a warning could be printed with the nopat option when a
> hypervisor provides the feature so the user can at least have a
> knob to tweak if it does not behave the way the user intends.
> But I must admit, I don't know if the Xen hypervisor also has an
> option to disable pat. If not, then maybe Jan's more
> aggressive approach with nopat might be needed if for
> some reason pat really needs to be disabled in the kernel
> when Linux is running on Xen or another hypervisor, but I don't
> know of any cases when that would be needed.
> 
> Chuck
> 



* Re: [PATCH 2/2] x86/pat: add functions to query specific cache mode availability
  2022-05-20  6:05           ` Jan Beulich
@ 2022-05-20  6:59             ` Chuck Zmudzinski
  -1 siblings, 0 replies; 80+ messages in thread
From: Chuck Zmudzinski @ 2022-05-20  6:59 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	H. Peter Anvin, Andy Lutomirski, Peter Zijlstra, Jani Nikula,
	Joonas Lahtinen, Rodrigo Vivi, Tvrtko Ursulin, David Airlie,
	Daniel Vetter, xen-devel, x86, linux-kernel, intel-gfx,
	dri-devel, Juergen Gross

On 5/20/2022 2:05 AM, Jan Beulich wrote:
> On 20.05.2022 06:43, Chuck Zmudzinski wrote:
>> On 5/4/22 5:14 AM, Juergen Gross wrote:
>>> On 04.05.22 10:31, Jan Beulich wrote:
>>>> On 03.05.2022 15:22, Juergen Gross wrote:
>>>>> Some drivers are using pat_enabled() in order to test availability of
>>>>> special caching modes (WC and UC-). This will lead to false negatives
>>>>> in case the system was booted e.g. with the "nopat" variant and the
>>>>> BIOS did setup the PAT MSR supporting the queried mode, or if the
>>>>> system is running as a Xen PV guest.
>>>> ...
>>>>> Add test functions for those caching modes instead and use them at the
>>>>> appropriate places.
>>>>>
>>>>> Fixes: bdd8b6c98239 ("drm/i915: replace X86_FEATURE_PAT with pat_enabled()")
>>>>> Fixes: ae749c7ab475 ("PCI: Add arch_can_pci_mmap_wc() macro")
>>>>> Signed-off-by: Juergen Gross <jgross@suse.com>
>>>> ...
>>>>
>>>>> --- a/arch/x86/include/asm/pci.h
>>>>> +++ b/arch/x86/include/asm/pci.h
>>>>> @@ -94,7 +94,7 @@ int pcibios_set_irq_routing(struct pci_dev *dev, int pin, int irq);
>>>>>        #define HAVE_PCI_MMAP
>>>>> -#define arch_can_pci_mmap_wc()    pat_enabled()
>>>>> +#define arch_can_pci_mmap_wc()    x86_has_pat_wc()
>>>> Besides this and ...
>>>>
>>>>> --- a/drivers/gpu/drm/i915/gem/i915_gem_mman.c
>>>>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
>>>>> @@ -76,7 +76,7 @@ i915_gem_mmap_ioctl(struct drm_device *dev, void *data,
>>>>>        if (args->flags & ~(I915_MMAP_WC))
>>>>>            return -EINVAL;
>>>>>
>>>>> -    if (args->flags & I915_MMAP_WC && !pat_enabled())
>>>>> +    if (args->flags & I915_MMAP_WC && !x86_has_pat_wc())
>>>>>            return -ENODEV;
>>>>>          obj = i915_gem_object_lookup(file, args->handle);
>>>>> @@ -757,7 +757,7 @@ i915_gem_dumb_mmap_offset(struct drm_file *file,
>>>>>          if (HAS_LMEM(to_i915(dev)))
>>>>>            mmap_type = I915_MMAP_TYPE_FIXED;
>>>>> -    else if (pat_enabled())
>>>>> +    else if (x86_has_pat_wc())
>>>>>            mmap_type = I915_MMAP_TYPE_WC;
>>>>>        else if (!i915_ggtt_has_aperture(to_gt(i915)->ggtt))
>>>>>            return -ENODEV;
>>>>> @@ -813,7 +813,7 @@ i915_gem_mmap_offset_ioctl(struct drm_device *dev, void *data,
>>>>>            break;
>>>>>          case I915_MMAP_OFFSET_WC:
>>>>> -        if (!pat_enabled())
>>>>> +        if (!x86_has_pat_wc())
>>>>>                return -ENODEV;
>>>>>            type = I915_MMAP_TYPE_WC;
>>>>>            break;
>>>>> @@ -823,7 +823,7 @@ i915_gem_mmap_offset_ioctl(struct drm_device *dev, void *data,
>>>>>            break;
>>>>>          case I915_MMAP_OFFSET_UC:
>>>>> -        if (!pat_enabled())
>>>>> +        if (!x86_has_pat_uc_minus())
>>>>>                return -ENODEV;
>>>>>            type = I915_MMAP_TYPE_UC;
>>>>>            break;
>>>> ... these uses there are several more. You say nothing on why those want
>>>> leaving unaltered. When preparing my earlier patch I did inspect them
>>>> and came to the conclusion that these all would also better observe the
>>>> adjusted behavior (or else I couldn't have left pat_enabled() as the
>>>> only predicate). In fact, as said in the description of my earlier
>>>> patch, in my debugging I did find the use in
>>>> i915_gem_object_pin_map() to be the problematic one, which you leave
>>>> alone.
>>> Oh, I missed that one, sorry.
>> That is why your patch would not fix my Haswell unless
>> it also touches i915_gem_object_pin_map() in
>> drivers/gpu/drm/i915/gem/i915_gem_pages.c
>>
>>> I wanted to be rather defensive in my changes, but I agree at least the
>>> case in arch_phys_wc_add() might want to be changed, too.
>> I think your approach needs to be more aggressive so it will fix
>> all the known false negatives introduced by bdd8b6c98239
>> such as the one in i915_gem_object_pin_map().
>>
>> I looked at Jan's approach and I think it would fix the issue
>> with my Haswell as long as I don't use the nopat option. I
>> really don't have a strong opinion on that question, but I
>> think the nopat option as a Linux kernel option, as opposed
>> to a hypervisor option, should only affect the kernel, and
>> if the hypervisor provides the pat feature, then the kernel
>> should not override that,
> Hmm, why would the kernel not be allowed to override that? Such
> an override would affect only the single domain where the
> kernel runs; other domains could take their own decisions.
>
> Also, for the sake of completeness: "nopat" used when running on
> bare metal has the same bad effect on system boot, so there
> pretty clearly is an error cleanup issue in the i915 driver. But
> that's orthogonal, and I expect the maintainers may not even care
> (but tell us "don't do that then").
>
> Jan
>
>> but because of the confusion,

As I just wrote earlier, the confusion is whether or not "nopat"
means the kernel drivers will not use pat even if the firmware
and hypervisor provide it. I think you are correct to
point out that is the way the i915 driver behaved with the nopat
option before bdd8b6c98239 was applied, with the same
bad effects on bare metal as with the hypervisor. I think perhaps
dealing with the nopat option to fix bdd8b6c98239 is a solution in
search of a problem, at least as regards the i915 driver.

The only problem we have, as I see it, is with a false negative
when the nopat option is *not* enabled. But the forced disabling
of pat in Jan's patch when the nopat option is enabled is probably
needed if the goal of the patch is to preserve the same
behavior of the i915 driver that it had before bdd8b6c98239
was applied.

In any case, especially if we do include Jan's aggressive approach
of disabling pat with the nopat option and preserving the same bad
behavior we had with nopat before bdd8b6c98239 was applied, the
i915 driver should log a warning when pat is disabled. Right now,
the driver returns -ENODEV with the problem in
i915_gem_object_pin_map(), but it does not log an error. The only
log message I get now is the add_taint_for_CI in intel_gt_init
which was not very helpful information for debugging
this problem. It was only the starting point of a longer debugging
process because of a lack of error log messages in the i915 driver.
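
A rough userspace sketch of the kind of logging meant here, under
loud assumptions: model_pin_map_wc() and its parameters are made up
for illustration and are not the i915 code. In the driver itself the
message would go through the kernel's own logging facilities rather
than fprintf().

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical stand-in for a mapping path that needs write-combining.
 * It is not the i915 implementation; it only shows a message being
 * logged before the error is returned, so the failure is visible.
 */
static int model_pin_map_wc(bool wc_available, FILE *log)
{
	if (!wc_available) {
		fprintf(log,
			"model: WC mapping requested but not available, returning -ENODEV\n");
		return -ENODEV;
	}

	/* WC is available: the real code would go on to create the mapping. */
	return 0;
}
```

With something along these lines, the -ENODEV from the pin-map path
would leave a trace in the log instead of only the later taint
message.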

Chuck

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH 2/2] x86/pat: add functions to query specific cache mode availability
  2022-05-20  6:59             ` Chuck Zmudzinski
@ 2022-05-20  8:30               ` Chuck Zmudzinski
  -1 siblings, 0 replies; 80+ messages in thread
From: Chuck Zmudzinski @ 2022-05-20  8:30 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	H. Peter Anvin, Andy Lutomirski, Peter Zijlstra, Jani Nikula,
	Joonas Lahtinen, Rodrigo Vivi, Tvrtko Ursulin, David Airlie,
	Daniel Vetter, xen-devel, x86, linux-kernel, intel-gfx,
	dri-devel, Juergen Gross

On 5/20/2022 2:59 AM, Chuck Zmudzinski wrote:
> On 5/20/2022 2:05 AM, Jan Beulich wrote:
>> On 20.05.2022 06:43, Chuck Zmudzinski wrote:
>>> On 5/4/22 5:14 AM, Juergen Gross wrote:
>>>> On 04.05.22 10:31, Jan Beulich wrote:
>>>>> On 03.05.2022 15:22, Juergen Gross wrote:
>>>>>
>>>>> ... these uses there are several more. You say nothing on why those
>>>>> want leaving unaltered. When preparing my earlier patch I did inspect
>>>>> them and came to the conclusion that these all would also better
>>>>> observe the adjusted behavior (or else I couldn't have left
>>>>> pat_enabled() as the only predicate). In fact, as said in the
>>>>> description of my earlier patch, in my debugging I did find the use
>>>>> in i915_gem_object_pin_map() to be the problematic one, which you
>>>>> leave alone.
>>>> Oh, I missed that one, sorry.
>>> That is why your patch would not fix my Haswell unless
>>> it also touches i915_gem_object_pin_map() in
>>> drivers/gpu/drm/i915/gem/i915_gem_pages.c
>>>
>>>> I wanted to be rather defensive in my changes, but I agree at least
>>>> the case in arch_phys_wc_add() might want to be changed, too.
>>> I think your approach needs to be more aggressive so it will fix
>>> all the known false negatives introduced by bdd8b6c98239
>>> such as the one in i915_gem_object_pin_map().
>>>
>>> I looked at Jan's approach and I think it would fix the issue
>>> with my Haswell as long as I don't use the nopat option. I
>>> really don't have a strong opinion on that question, but I
>>> think the nopat option as a Linux kernel option, as opposed
>>> to a hypervisor option, should only affect the kernel, and
>>> if the hypervisor provides the pat feature, then the kernel
>>> should not override that,
>> Hmm, why would the kernel not be allowed to override that? Such
>> an override would affect only the single domain where the
>> kernel runs; other domains could take their own decisions.
>>
>> Also, for the sake of completeness: "nopat" used when running on
>> bare metal has the same bad effect on system boot, so there
>> pretty clearly is an error cleanup issue in the i915 driver. But
>> that's orthogonal, and I expect the maintainers may not even care
>> (but tell us "don't do that then").

Actually I just did a test with the last official Debian kernel
build of Linux 5.16, that is, a kernel before bdd8b6c98239 was
applied. In fact, the nopat option does *not* break the i915 driver
in 5.16. That is, with the nopat option, the i915 driver loads
normally on both the bare metal and on the Xen hypervisor.
That means your presumption (and the presumption of
the author of bdd8b6c98239) that the "nopat" option was
being observed by the i915 driver is incorrect. Setting "nopat"
had no effect on my system with Linux 5.16. So after doing these
tests, I am against the aggressive approach of breaking the i915
driver with the "nopat" option because prior to bdd8b6c98239,
nopat did not break the i915 driver. Why break it now?

Prior to bdd8b6c98239, the i915 driver used
static_cpu_has(X86_FEATURE_PAT) to test for the PAT
feature, and apparently this returns true even if nopat
is set, but the new test, pat_enabled(), returns false on
the Xen hypervisor even if nopat is not set. That is
the only problem I see. The question of nopat should
be irrelevant to the i915 driver.
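
A small userspace model may make the difference between the
predicates easier to see. It is only a sketch of the behavior
described in this thread, not kernel code: struct boot_state, its
fields, and the model_* function names are invented for illustration,
and the real kernel logic has more inputs than the four flags assumed
here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of the boot-time state discussed in this thread. */
struct boot_state {
	bool cpu_has_pat;    /* CPU advertises the PAT feature */
	bool nopat;          /* "nopat" given on the kernel command line */
	bool xen_pv;         /* kernel runs as a Xen PV guest */
	bool pat_msr_has_wc; /* PAT MSR, as set up by firmware or hypervisor, has a WC entry */
};

/* Models the test used before bdd8b6c98239: only the CPU feature bit counts. */
static bool model_cpu_feature_test(const struct boot_state *s)
{
	assert(s != NULL);
	return s->cpu_has_pat;
}

/* Models pat_enabled() as reported here: false with nopat or on Xen PV. */
static bool model_pat_enabled(const struct boot_state *s)
{
	assert(s != NULL);
	return s->cpu_has_pat && !s->nopat && !s->xen_pv;
}

/*
 * Models a per-mode query in the spirit of the proposed x86_has_pat_wc().
 * Assumption: only the PAT MSR contents decide whether WC is usable.
 */
static bool model_has_wc(const struct boot_state *s)
{
	assert(s != NULL);
	return s->pat_msr_has_wc;
}
```

For a Xen PV guest booted without nopat (cpu_has_pat and
pat_msr_has_wc set, xen_pv set, nopat clear), the model's feature
test and WC query both return true while model_pat_enabled() returns
false, which is the false negative described above.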

It was unfortunate that the author of bdd8b6c98239
mentioned nopat in the commit message when in fact
nopat was never intended to be used to break the
i915 driver. The i915 driver should ignore the nopat
option and decide what to do based solely on the
capability of the cpu, firmware, and the compiled
options of the Linux kernel. That is how it behaved
before bdd8b6c98239, and that behavior is what needs
to be restored with a patch.

Chuck

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH 2/2] x86/pat: add functions to query specific cache mode availability
  2022-05-20  8:30               ` Chuck Zmudzinski
@ 2022-05-20  9:41                 ` Jan Beulich
  -1 siblings, 0 replies; 80+ messages in thread
From: Jan Beulich @ 2022-05-20  9:41 UTC (permalink / raw)
  To: Chuck Zmudzinski
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	H. Peter Anvin, Andy Lutomirski, Peter Zijlstra, Jani Nikula,
	Joonas Lahtinen, Rodrigo Vivi, Tvrtko Ursulin, David Airlie,
	Daniel Vetter, xen-devel, x86, linux-kernel, intel-gfx,
	dri-devel, Juergen Gross

On 20.05.2022 10:30, Chuck Zmudzinski wrote:
> On 5/20/2022 2:59 AM, Chuck Zmudzinski wrote:
>> On 5/20/2022 2:05 AM, Jan Beulich wrote:
>>> On 20.05.2022 06:43, Chuck Zmudzinski wrote:
>>>> On 5/4/22 5:14 AM, Juergen Gross wrote:
>>>>> On 04.05.22 10:31, Jan Beulich wrote:
>>>>>> On 03.05.2022 15:22, Juergen Gross wrote:
>>>>>>
>>>>>> ... these uses there are several more. You say nothing on why those
>>>>>> want leaving unaltered. When preparing my earlier patch I did inspect
>>>>>> them and came to the conclusion that these all would also better
>>>>>> observe the adjusted behavior (or else I couldn't have left
>>>>>> pat_enabled() as the only predicate). In fact, as said in the
>>>>>> description of my earlier patch, in my debugging I did find the use
>>>>>> in i915_gem_object_pin_map() to be the problematic one, which you
>>>>>> leave alone.
>>>>> Oh, I missed that one, sorry.
>>>> That is why your patch would not fix my Haswell unless
>>>> it also touches i915_gem_object_pin_map() in
>>>> drivers/gpu/drm/i915/gem/i915_gem_pages.c
>>>>
>>>>> I wanted to be rather defensive in my changes, but I agree at least
>>>>> the case in arch_phys_wc_add() might want to be changed, too.
>>>> I think your approach needs to be more aggressive so it will fix
>>>> all the known false negatives introduced by bdd8b6c98239
>>>> such as the one in i915_gem_object_pin_map().
>>>>
>>>> I looked at Jan's approach and I think it would fix the issue
>>>> with my Haswell as long as I don't use the nopat option. I
>>>> really don't have a strong opinion on that question, but I
>>>> think the nopat option as a Linux kernel option, as opposed
>>>> to a hypervisor option, should only affect the kernel, and
>>>> if the hypervisor provides the pat feature, then the kernel
>>>> should not override that,
>>> Hmm, why would the kernel not be allowed to override that? Such
>>> an override would affect only the single domain where the
>>> kernel runs; other domains could take their own decisions.
>>>
>>> Also, for the sake of completeness: "nopat" used when running on
>>> bare metal has the same bad effect on system boot, so there
>>> pretty clearly is an error cleanup issue in the i915 driver. But
>>> that's orthogonal, and I expect the maintainers may not even care
>>> (but tell us "don't do that then").
> 
> Actually I just did a test with the last official Debian kernel
> build of Linux 5.16, that is, a kernel before bdd8b6c98239 was
> applied. In fact, the nopat option does *not* break the i915 driver
> in 5.16. That is, with the nopat option, the i915 driver loads
> normally on both the bare metal and on the Xen hypervisor.
> That means your presumption (and the presumption of
> the author of bdd8b6c98239) that the "nopat" option was
> being observed by the i915 driver is incorrect. Setting "nopat"
> had no effect on my system with Linux 5.16. So after doing these
> tests, I am against the aggressive approach of breaking the i915
> driver with the "nopat" option because prior to bdd8b6c98239,
> nopat did not break the i915 driver. Why break it now?

Because that, in my understanding, is the purpose of "nopat"
(not breaking the driver of course - that's a driver bug -, but
having an effect on the driver).

Jan


^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH 2/2] x86/pat: add functions to query specific cache mode availability
  2022-05-20  9:41                 ` Jan Beulich
@ 2022-05-20 13:33                   ` Chuck Zmudzinski
  -1 siblings, 0 replies; 80+ messages in thread
From: Chuck Zmudzinski @ 2022-05-20 13:33 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	H. Peter Anvin, Andy Lutomirski, Peter Zijlstra, Jani Nikula,
	Joonas Lahtinen, Rodrigo Vivi, Tvrtko Ursulin, David Airlie,
	Daniel Vetter, xen-devel, x86, linux-kernel, intel-gfx,
	dri-devel, Juergen Gross

On 5/20/2022 5:41 AM, Jan Beulich wrote:
> On 20.05.2022 10:30, Chuck Zmudzinski wrote:
>> On 5/20/2022 2:59 AM, Chuck Zmudzinski wrote:
>>> On 5/20/2022 2:05 AM, Jan Beulich wrote:
>>>> On 20.05.2022 06:43, Chuck Zmudzinski wrote:
>>>>> On 5/4/22 5:14 AM, Juergen Gross wrote:
>>>>>> On 04.05.22 10:31, Jan Beulich wrote:
>>>>>>> On 03.05.2022 15:22, Juergen Gross wrote:
>>>>>>>
>>>>>>> ... these uses there are several more. You say nothing on why those
>>>>>>> want leaving unaltered. When preparing my earlier patch I did inspect
>>>>>>> them and came to the conclusion that these all would also better
>>>>>>> observe the adjusted behavior (or else I couldn't have left
>>>>>>> pat_enabled() as the only predicate). In fact, as said in the
>>>>>>> description of my earlier patch, in my debugging I did find the use
>>>>>>> in i915_gem_object_pin_map() to be the problematic one, which you
>>>>>>> leave alone.
>>>>>> Oh, I missed that one, sorry.
>>>>> That is why your patch would not fix my Haswell unless
>>>>> it also touches i915_gem_object_pin_map() in
>>>>> drivers/gpu/drm/i915/gem/i915_gem_pages.c
>>>>>
>>>>>> I wanted to be rather defensive in my changes, but I agree at least
>>>>>> the
>>>>>> case in arch_phys_wc_add() might want to be changed, too.
>>>>> I think your approach needs to be more aggressive so it will fix
>>>>> all the known false negatives introduced by bdd8b6c98239
>>>>> such as the one in i915_gem_object_pin_map().
>>>>>
>>>>> I looked at Jan's approach and I think it would fix the issue
>>>>> with my Haswell as long as I don't use the nopat option. I
>>>>> really don't have a strong opinion on that question, but I
>>>>> think the nopat option as a Linux kernel option, as opposed
>>>>> to a hypervisor option, should only affect the kernel, and
>>>>> if the hypervisor provides the pat feature, then the kernel
>>>>> should not override that,
>>>> Hmm, why would the kernel not be allowed to override that? Such
>>>> an override would affect only the single domain where the
>>>> kernel runs; other domains could take their own decisions.
>>>>
>>>> Also, for the sake of completeness: "nopat" used when running on
>>>> bare metal has the same bad effect on system boot, so there
>>>> pretty clearly is an error cleanup issue in the i915 driver. But
>>>> that's orthogonal, and I expect the maintainers may not even care
>>>> (but tell us "don't do that then").
>> Actually I just did a test with the last official Debian kernel
>> build of Linux 5.16, that is, a kernel before bdd8b6c98239 was
>> applied. In fact, the nopat option does *not* break the i915 driver
>> in 5.16. That is, with the nopat option, the i915 driver loads
>> normally on both the bare metal and on the Xen hypervisor.
>> That means your presumption (and the presumption of
>> the author of bdd8b6c98239) that the "nopat" option was
>> being observed by the i915 driver is incorrect. Setting "nopat"
>> had no effect on my system with Linux 5.16. So after doing these
>> tests, I am against the aggressive approach of breaking the i915
>> driver with the "nopat" option because prior to bdd8b6c98239,
>> nopat did not break the i915 driver. Why break it now?
> Because that, in my understanding, is the purpose of "nopat"
> (not breaking the driver of course - that's a driver bug -, but
> having an effect on the driver).

I wouldn't call it a driver bug, but an incorrect configuration of the
kernel by the user.  I presume X86_FEATURE_PAT is required by the
i915 driver and therefore the driver should refuse to disable
it if the user requests to disable it and instead warn the user that
the driver did not disable the feature, contrary to what the user
requested with the nopat option.

In any case, my test did not verify that when nopat is set in Linux 5.16,
the thread takes the same code path as when nopat is not set,
so I am not totally sure that the reason nopat does not break the
i915 driver in 5.16 is that static_cpu_has(X86_FEATURE_PAT)
returns true even when nopat is set. I could test it with a custom
log message in 5.16 if that is necessary.

Are you saying it was wrong for static_cpu_has(X86_FEATURE_PAT)
to return true in 5.16 when the user requests nopat? I think that is
just permitting a bad configuration to break the driver that a
well-written operating system should not allow. The i915 driver
was, in my opinion, correctly ignoring the nopat option in 5.16
because that option is not compatible with the hardware the
i915 driver is trying to initialize and set up at boot time. At least
that is my understanding now, but I will need to test it on 5.16
to be sure I understand it correctly.

Also, AFAICT, your patch would break the driver when the nopat
option is set and only fix the regression introduced by bdd8b6c98239
when nopat is not set on my box, so your patch would
introduce a regression relative to Linux 5.16 and earlier for the
case when nopat is set on my box. I think your point would
be that it is not a regression if it is an incorrect user configuration.
I respond by saying a well-written driver should refuse to honor
the incorrect configuration requested by the user and instead
warn the user that it did not honor the incorrect kernel option.

I am only presuming what your patch would do on my box based
on what I learned about this problem from my debugging. I can
also test your patch on my box to verify that my understanding of
it is correct.

I also have not yet verified Juergen's patch will not fix it, but
I am almost certain it will not unless it is expanded so it also
touches i915_gem_object_pin_map() with the fix. I plan to test
his patch, but expanded so it touches that function also.

I also plan to test your patch with and without nopat and report the
results in the thread where you posted your patch. Hopefully
by tomorrow I will have the results.

Chuck

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH 2/2] x86/pat: add functions to query specific cache mode availability
  2022-05-20 13:33                   ` Chuck Zmudzinski
@ 2022-05-20 14:06                     ` Jan Beulich
  -1 siblings, 0 replies; 80+ messages in thread
From: Jan Beulich @ 2022-05-20 14:06 UTC (permalink / raw)
  To: Chuck Zmudzinski
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	H. Peter Anvin, Andy Lutomirski, Peter Zijlstra, Jani Nikula,
	Joonas Lahtinen, Rodrigo Vivi, Tvrtko Ursulin, David Airlie,
	Daniel Vetter, xen-devel, x86, linux-kernel, intel-gfx,
	dri-devel, Juergen Gross

On 20.05.2022 15:33, Chuck Zmudzinski wrote:
> On 5/20/2022 5:41 AM, Jan Beulich wrote:
>> On 20.05.2022 10:30, Chuck Zmudzinski wrote:
>>> On 5/20/2022 2:59 AM, Chuck Zmudzinski wrote:
>>>> On 5/20/2022 2:05 AM, Jan Beulich wrote:
>>>>> On 20.05.2022 06:43, Chuck Zmudzinski wrote:
>>>>>> On 5/4/22 5:14 AM, Juergen Gross wrote:
>>>>>>> On 04.05.22 10:31, Jan Beulich wrote:
>>>>>>>> On 03.05.2022 15:22, Juergen Gross wrote:
>>>>>>>>
>>>>>>>> ... these uses there are several more. You say nothing on why
>>>>>>>> those want
>>>>>>>> leaving unaltered. When preparing my earlier patch I did inspect them
>>>>>>>> and came to the conclusion that these all would also better
>>>>>>>> observe the
>>>>>>>> adjusted behavior (or else I couldn't have left pat_enabled() as the
>>>>>>>> only predicate). In fact, as said in the description of my earlier
>>>>>>>> patch, in
>>>>>>>> my debugging I did find the use in i915_gem_object_pin_map() to be
>>>>>>>> the
>>>>>>>> problematic one, which you leave alone.
>>>>>>> Oh, I missed that one, sorry.
>>>>>> That is why your patch would not fix my Haswell unless
>>>>>> it also touches i915_gem_object_pin_map() in
>>>>>> drivers/gpu/drm/i915/gem/i915_gem_pages.c
>>>>>>
>>>>>>> I wanted to be rather defensive in my changes, but I agree at least
>>>>>>> the
>>>>>>> case in arch_phys_wc_add() might want to be changed, too.
>>>>>> I think your approach needs to be more aggressive so it will fix
>>>>>> all the known false negatives introduced by bdd8b6c98239
>>>>>> such as the one in i915_gem_object_pin_map().
>>>>>>
>>>>>> I looked at Jan's approach and I think it would fix the issue
>>>>>> with my Haswell as long as I don't use the nopat option. I
>>>>>> really don't have a strong opinion on that question, but I
>>>>>> think the nopat option as a Linux kernel option, as opposed
>>>>>> to a hypervisor option, should only affect the kernel, and
>>>>>> if the hypervisor provides the pat feature, then the kernel
>>>>>> should not override that,
>>>>> Hmm, why would the kernel not be allowed to override that? Such
>>>>> an override would affect only the single domain where the
>>>>> kernel runs; other domains could take their own decisions.
>>>>>
>>>>> Also, for the sake of completeness: "nopat" used when running on
>>>>> bare metal has the same bad effect on system boot, so there
>>>>> pretty clearly is an error cleanup issue in the i915 driver. But
>>>>> that's orthogonal, and I expect the maintainers may not even care
>>>>> (but tell us "don't do that then").
>>> Actually I just did a test with the last official Debian kernel
>>> build of Linux 5.16, that is, a kernel before bdd8b6c98239 was
>>> applied. In fact, the nopat option does *not* break the i915 driver
>>> in 5.16. That is, with the nopat option, the i915 driver loads
>>> normally on both the bare metal and on the Xen hypervisor.
>>> That means your presumption (and the presumption of
>>> the author of bdd8b6c98239) that the "nopat" option was
>>> being observed by the i915 driver is incorrect. Setting "nopat"
>>> had no effect on my system with Linux 5.16. So after doing these
>>> tests, I am against the aggressive approach of breaking the i915
>>> driver with the "nopat" option because prior to bdd8b6c98239,
>>> nopat did not break the i915 driver. Why break it now?
>> Because that, in my understanding, is the purpose of "nopat"
>> (not breaking the driver of course - that's a driver bug -, but
>> having an effect on the driver).
> 
> I wouldn't call it a driver bug, but an incorrect configuration of the
> kernel by the user.  I presume X86_FEATURE_PAT is required by the
> i915 driver

The driver ought to work fine without PAT (and hence without being
able to make WC mappings). It would use UC instead and be slow, but
it ought to work.

> and therefore the driver should refuse to disable
> it if the user requests to disable it and instead warn the user that
> the driver did not disable the feature, contrary to what the user
> requested with the nopat option.
> 
> In any case, my test did not verify that when nopat is set in Linux 5.16,
> the thread takes the same code path as when nopat is not set,
> so I am not totally sure that the reason nopat does not break the
> i915 driver in 5.16 is that static_cpu_has(X86_FEATURE_PAT)
> returns true even when nopat is set. I could test it with a custom
> log message in 5.16 if that is necessary.
> 
> Are you saying it was wrong for static_cpu_has(X86_FEATURE_PAT)
> to return true in 5.16 when the user requests nopat?

No, I'm not saying that. It was wrong for this construct to be used
in the driver, which was fixed for 5.17 (and which had caused the
regression I did observe, leading to the patch as a hopefully least
bad option).

> I think that is
> just permitting a bad configuration to break the driver that a
> well-written operating system should not allow. The i915 driver
> was, in my opinion, correctly ignoring the nopat option in 5.16
> because that option is not compatible with the hardware the
> i915 driver is trying to initialize and set up at boot time. At least
> that is my understanding now, but I will need to test it on 5.16
> to be sure I understand it correctly.
> 
> Also, AFAICT, your patch would break the driver when the nopat
> option is set and only fix the regression introduced by bdd8b6c98239
> when nopat is not set on my box, so your patch would
> introduce a regression relative to Linux 5.16 and earlier for the
> case when nopat is set on my box. I think your point would
> be that it is not a regression if it is an incorrect user configuration.

Again no - my view is that there's a separate, pre-existing issue
in the driver which was uncovered by the change. This may be a
perceived regression, but is imo different from a real one.

Jan

> I respond by saying a well-written driver should refuse to honor
> the incorrect configuration requested by the user and instead
> warn the user that it did not honor the incorrect kernel option.
> 
> I am only presuming what your patch would do on my box based
> on what I learned about this problem from my debugging. I can
> also test your patch on my box to verify that my understanding of
> it is correct.
> 
> I also have not yet verified Juergen's patch will not fix it, but
> I am almost certain it will not unless it is expanded so it also
> touches i915_gem_object_pin_map() with the fix. I plan to test
> his patch, but expanded so it touches that function also.
> 
> I also plan to test your patch with and without nopat and report the
> results in the thread where you posted your patch. Hopefully
> by tomorrow I will have the results.
> 
> Chuck
> 


^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH 2/2] x86/pat: add functions to query specific cache mode availability
  2022-05-20 14:06                     ` Jan Beulich
@ 2022-05-20 14:48                       ` Chuck Zmudzinski
  -1 siblings, 0 replies; 80+ messages in thread
From: Chuck Zmudzinski @ 2022-05-20 14:48 UTC (permalink / raw)
  To: Jan Beulich, regressions, stable
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	H. Peter Anvin, Andy Lutomirski, Peter Zijlstra, Jani Nikula,
	Joonas Lahtinen, Rodrigo Vivi, Tvrtko Ursulin, David Airlie,
	Daniel Vetter, xen-devel, x86, linux-kernel, intel-gfx,
	dri-devel, Juergen Gross

On 5/20/2022 10:06 AM, Jan Beulich wrote:
> On 20.05.2022 15:33, Chuck Zmudzinski wrote:
>> On 5/20/2022 5:41 AM, Jan Beulich wrote:
>>> On 20.05.2022 10:30, Chuck Zmudzinski wrote:
>>>> On 5/20/2022 2:59 AM, Chuck Zmudzinski wrote:
>>>>> On 5/20/2022 2:05 AM, Jan Beulich wrote:
>>>>>> On 20.05.2022 06:43, Chuck Zmudzinski wrote:
>>>>>>> On 5/4/22 5:14 AM, Juergen Gross wrote:
>>>>>>>> On 04.05.22 10:31, Jan Beulich wrote:
>>>>>>>>> On 03.05.2022 15:22, Juergen Gross wrote:
>>>>>>>>>
>>>>>>>>> ... these uses there are several more. You say nothing on why
>>>>>>>>> those want
>>>>>>>>> leaving unaltered. When preparing my earlier patch I did inspect them
>>>>>>>>> and came to the conclusion that these all would also better
>>>>>>>>> observe the
>>>>>>>>> adjusted behavior (or else I couldn't have left pat_enabled() as the
>>>>>>>>> only predicate). In fact, as said in the description of my earlier
>>>>>>>>> patch, in
>>>>>>>>> my debugging I did find the use in i915_gem_object_pin_map() to be
>>>>>>>>> the
>>>>>>>>> problematic one, which you leave alone.
>>>>>>>> Oh, I missed that one, sorry.
>>>>>>> That is why your patch would not fix my Haswell unless
>>>>>>> it also touches i915_gem_object_pin_map() in
>>>>>>> drivers/gpu/drm/i915/gem/i915_gem_pages.c
>>>>>>>
>>>>>>>> I wanted to be rather defensive in my changes, but I agree at least
>>>>>>>> the
>>>>>>>> case in arch_phys_wc_add() might want to be changed, too.
>>>>>>> I think your approach needs to be more aggressive so it will fix
>>>>>>> all the known false negatives introduced by bdd8b6c98239
>>>>>>> such as the one in i915_gem_object_pin_map().
>>>>>>>
>>>>>>> I looked at Jan's approach and I think it would fix the issue
>>>>>>> with my Haswell as long as I don't use the nopat option. I
>>>>>>> really don't have a strong opinion on that question, but I
>>>>>>> think the nopat option as a Linux kernel option, as opposed
>>>>>>> to a hypervisor option, should only affect the kernel, and
>>>>>>> if the hypervisor provides the pat feature, then the kernel
>>>>>>> should not override that,
>>>>>> Hmm, why would the kernel not be allowed to override that? Such
>>>>>> an override would affect only the single domain where the
>>>>>> kernel runs; other domains could take their own decisions.
>>>>>>
>>>>>> Also, for the sake of completeness: "nopat" used when running on
>>>>>> bare metal has the same bad effect on system boot, so there
>>>>>> pretty clearly is an error cleanup issue in the i915 driver. But
>>>>>> that's orthogonal, and I expect the maintainers may not even care
>>>>>> (but tell us "don't do that then").
>>>> Actually I just did a test with the last official Debian kernel
>>>> build of Linux 5.16, that is, a kernel before bdd8b6c98239 was
>>>> applied. In fact, the nopat option does *not* break the i915 driver
>>>> in 5.16. That is, with the nopat option, the i915 driver loads
>>>> normally on both the bare metal and on the Xen hypervisor.
>>>> That means your presumption (and the presumption of
>>>> the author of bdd8b6c98239) that the "nopat" option was
>>>> being observed by the i915 driver is incorrect. Setting "nopat"
>>>> had no effect on my system with Linux 5.16. So after doing these
>>>> tests, I am against the aggressive approach of breaking the i915
>>>> driver with the "nopat" option because prior to bdd8b6c98239,
>>>> nopat did not break the i915 driver. Why break it now?
>>> Because that, in my understanding, is the purpose of "nopat"
>>> (not breaking the driver of course - that's a driver bug -, but
>>> having an effect on the driver).
>> I wouldn't call it a driver bug, but an incorrect configuration of the
>> kernel by the user.  I presume X86_FEATURE_PAT is required by the
>> i915 driver
> The driver ought to work fine without PAT (and hence without being
> able to make WC mappings). It would use UC instead and be slow, but
> it ought to work.
>
>> and therefore the driver should refuse to disable
>> it if the user requests to disable it and instead warn the user that
>> the driver did not disable the feature, contrary to what the user
>> requested with the nopat option.
>>
>> In any case, my test did not verify that when nopat is set in Linux 5.16,
>> the thread takes the same code path as when nopat is not set,
>> so I am not totally sure that the reason nopat does not break the
>> i915 driver in 5.16 is that static_cpu_has(X86_FEATURE_PAT)
>> returns true even when nopat is set. I could test it with a custom
>> log message in 5.16 if that is necessary.
>>
>> Are you saying it was wrong for static_cpu_has(X86_FEATURE_PAT)
>> to return true in 5.16 when the user requests nopat?
> No, I'm not saying that. It was wrong for this construct to be used
> in the driver, which was fixed for 5.17 (and which had caused the
> regression I did observe, leading to the patch as a hopefully least
> bad option).
>
>> I think that is
>> just permitting a bad configuration to break the driver that a
>> well-written operating system should not allow. The i915 driver
>> was, in my opinion, correctly ignoring the nopat option in 5.16
>> because that option is not compatible with the hardware the
>> i915 driver is trying to initialize and set up at boot time. At least
>> that is my understanding now, but I will need to test it on 5.16
>> to be sure I understand it correctly.
>>
>> Also, AFAICT, your patch would break the driver when the nopat
>> option is set and only fix the regression introduced by bdd8b6c98239
>> when nopat is not set on my box, so your patch would
>> introduce a regression relative to Linux 5.16 and earlier for the
>> case when nopat is set on my box. I think your point would
>> be that it is not a regression if it is an incorrect user configuration.
> Again no - my view is that there's a separate, pre-existing issue
> in the driver which was uncovered by the change. This may be a
> perceived regression, but is imo different from a real one.
>
> Jan

Since it is a regression, I think for now bdd8b6c98239 should
be reverted and the fix backported to Linux 5.17 stable until
the underlying memory subsystem can provide the i915 driver
with an updated test for the PAT feature that also meets the
requirements of the author of bdd8b6c98239 without breaking
the i915 driver. The i915 driver relies on the memory subsystem
to provide it with an accurate test for the existence of
X86_FEATURE_PAT. I think your patch provides that more accurate
test so that bdd8b6c98239 could be re-applied when your patch is
committed. Juergen's patch would have to touch bdd8b6c98239
with new functions that probably have unknown and unintended
consequences, so I think your approach is also better in that regard.
As regards your patch, there is just a disagreement about how the
i915 driver should behave if nopat is set. I agree the i915 driver
could do a better job handling that case, at least with better error
logs.

Chuck

>
>> I respond by saying a well-written driver should refuse to honor
>> the incorrect configuration requested by the user and instead
>> warn the user that it did not honor the incorrect kernel option.
>>
>> I am only presuming what your patch would do on my box based
>> on what I learned about this problem from my debugging. I can
>> also test your patch on my box to verify that my understanding of
>> it is correct.
>>
>> I also have not yet verified Juergen's patch will not fix it, but
>> I am almost certain it will not unless it is expanded so it also
>> touches i915_gem_object_pin_map() with the fix. I plan to test
>> his patch, but expanded so it touches that function also.
>>
>> I also plan to test your patch with and without nopat and report the
>> results in the thread where you posted your patch. Hopefully
>> by tomorrow I will have the results.
>>
>> Chuck
>>


^ permalink raw reply	[flat|nested] 80+ messages in thread

* [REGRESSION} Re: [PATCH 2/2] x86/pat: add functions to query specific cache mode availability
  2022-05-20 14:06                     ` Jan Beulich
@ 2022-05-20 15:46                       ` Chuck Zmudzinski
  -1 siblings, 0 replies; 80+ messages in thread
From: Chuck Zmudzinski @ 2022-05-20 15:46 UTC (permalink / raw)
  To: Jan Beulich, regressions, stable
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	H. Peter Anvin, Andy Lutomirski, Peter Zijlstra, Jani Nikula,
	Joonas Lahtinen, Rodrigo Vivi, Tvrtko Ursulin, David Airlie,
	Daniel Vetter, xen-devel, x86, linux-kernel, intel-gfx,
	dri-devel, Juergen Gross

On 5/20/2022 10:06 AM, Jan Beulich wrote:
> On 20.05.2022 15:33, Chuck Zmudzinski wrote:
>> On 5/20/2022 5:41 AM, Jan Beulich wrote:
>>> On 20.05.2022 10:30, Chuck Zmudzinski wrote:
>>>> On 5/20/2022 2:59 AM, Chuck Zmudzinski wrote:
>>>>> On 5/20/2022 2:05 AM, Jan Beulich wrote:
>>>>>> On 20.05.2022 06:43, Chuck Zmudzinski wrote:
>>>>>>> On 5/4/22 5:14 AM, Juergen Gross wrote:
>>>>>>>> On 04.05.22 10:31, Jan Beulich wrote:
>>>>>>>>> On 03.05.2022 15:22, Juergen Gross wrote:
>>>>>>>>>
>>>>>>>>> ... these uses there are several more. You say nothing on why
>>>>>>>>> those want
>>>>>>>>> leaving unaltered. When preparing my earlier patch I did inspect them
>>>>>>>>> and came to the conclusion that these all would also better
>>>>>>>>> observe the
>>>>>>>>> adjusted behavior (or else I couldn't have left pat_enabled() as the
>>>>>>>>> only predicate). In fact, as said in the description of my earlier
>>>>>>>>> patch, in
>>>>>>>>> my debugging I did find the use in i915_gem_object_pin_map() to be
>>>>>>>>> the
>>>>>>>>> problematic one, which you leave alone.
>>>>>>>> Oh, I missed that one, sorry.
>>>>>>> That is why your patch would not fix my Haswell unless
>>>>>>> it also touches i915_gem_object_pin_map() in
>>>>>>> drivers/gpu/drm/i915/gem/i915_gem_pages.c
>>>>>>>
>>>>>>>> I wanted to be rather defensive in my changes, but I agree at least
>>>>>>>> the
>>>>>>>> case in arch_phys_wc_add() might want to be changed, too.
>>>>>>> I think your approach needs to be more aggressive so it will fix
>>>>>>> all the known false negatives introduced by bdd8b6c98239
>>>>>>> such as the one in i915_gem_object_pin_map().
>>>>>>>
>>>>>>> I looked at Jan's approach and I think it would fix the issue
>>>>>>> with my Haswell as long as I don't use the nopat option. I
>>>>>>> really don't have a strong opinion on that question, but I
>>>>>>> think the nopat option as a Linux kernel option, as opposed
>>>>>>> to a hypervisor option, should only affect the kernel, and
>>>>>>> if the hypervisor provides the pat feature, then the kernel
>>>>>>> should not override that,
>>>>>> Hmm, why would the kernel not be allowed to override that? Such
>>>>>> an override would affect only the single domain where the
>>>>>> kernel runs; other domains could take their own decisions.
>>>>>>
>>>>>> Also, for the sake of completeness: "nopat" used when running on
>>>>>> bare metal has the same bad effect on system boot, so there
>>>>>> pretty clearly is an error cleanup issue in the i915 driver. But
>>>>>> that's orthogonal, and I expect the maintainers may not even care
>>>>>> (but tell us "don't do that then").
>>>> Actually I just did a test with the last official Debian kernel
>>>> build of Linux 5.16, that is, a kernel before bdd8b6c98239 was
>>>> applied. In fact, the nopat option does *not* break the i915 driver
>>>> in 5.16. That is, with the nopat option, the i915 driver loads
>>>> normally on both the bare metal and on the Xen hypervisor.
>>>> That means your presumption (and the presumption of
>>>> the author of bdd8b6c98239) that the "nopat" option was
>>>> being observed by the i915 driver is incorrect. Setting "nopat"
>>>> had no effect on my system with Linux 5.16. So after doing these
>>>> tests, I am against the aggressive approach of breaking the i915
>>>> driver with the "nopat" option because prior to bdd8b6c98239,
>>>> nopat did not break the i915 driver. Why break it now?
>>> Because that, in my understanding, is the purpose of "nopat"
>>> (not breaking the driver of course - that's a driver bug -, but
>>> having an effect on the driver).
>> I wouldn't call it a driver bug, but an incorrect configuration of the
>> kernel by the user.  I presume X86_FEATURE_PAT is required by the
>> i915 driver
> The driver ought to work fine without PAT (and hence without being
> able to make WC mappings). It would use UC instead and be slow, but
> it ought to work.

I am not an expert, but I think the reason it failed on my box was
because of the requirements of CI. Maybe the driver would fall back
to UC if the add_taint_for_CI function did not halt the entire system
in response to the failed test for PAT when trying to use WC mappings.
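
The fallback described above (use UC when WC cannot be provided) can be
sketched in a few lines of user-space C. This is only an illustration
under stated assumptions: the enum, the function name pick_map_type() and
the wc_available flag are hypothetical and are not taken from the i915
driver, which may handle this case quite differently.

```c
#include <stdbool.h>

/* Hypothetical mapping types for illustration; not the i915 enum. */
enum map_type { MAP_WB, MAP_WC, MAP_UC };

/*
 * Return the mapping type to actually use. If write-combining (WC) was
 * requested but is not available, degrade to uncached (UC) instead of
 * failing the request. Any other request is passed through unchanged.
 */
enum map_type pick_map_type(enum map_type requested, bool wc_available)
{
	if (requested == MAP_WC && !wc_available)
		return MAP_UC;	/* slower, but still functional */
	return requested;
}
```

With this shape, a missing WC capability costs performance rather than
producing an error that the caller has to unwind.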

>> and therefore the driver should refuse to disable
>> it if the user requests to disable it and instead warn the user that
>> the driver did not disable the feature, contrary to what the user
>> requested with the nopat option.
>>
>> In any case, my test did not verify that when nopat is set in Linux 5.16,
>> the thread takes the same code path as when nopat is not set,
>> so I am not totally sure that the reason nopat does not break the
>> i915 driver in 5.16 is that static_cpu_has(X86_FEATURE_PAT)
>> returns true even when nopat is set. I could test it with a custom
>> log message in 5.16 if that is necessary.
>>
>> Are you saying it was wrong for static_cpu_has(X86_FEATURE_PAT)
>> to return true in 5.16 when the user requests nopat?
> No, I'm not saying that. It was wrong for this construct to be used
> in the driver, which was fixed for 5.17 (and which had caused the
> regression I did observe, leading to the patch as a hopefully least
> bad option).

Hmm, the patch I used to fix my box with 5.17.6 used
static_cpu_has(X86_FEATURE_PAT) so the driver could
continue to configure the hardware using WC. This is the
relevant part of the patch I used to fix my box, which includes
extra error logs, (against Debian's official build of 5.17.6):

--- a/drivers/gpu/drm/i915/gem/i915_gem_pages.c    2022-05-09 03:16:33.000000000 -0400
+++ b/drivers/gpu/drm/i915/gem/i915_gem_pages.c    2022-05-19 15:55:40.339778818 -0400
...
@@ -430,17 +434,23 @@
          err = i915_gem_object_wait_moving_fence(obj, true);
          if (err) {
              ptr = ERR_PTR(err);
+            DRM_ERROR("i915_gem_object_wait_moving_fence error, err = %d\n", err);
              goto err_unpin;
          }

-        if (GEM_WARN_ON(type == I915_MAP_WC && !pat_enabled()))
+        if (GEM_WARN_ON(type == I915_MAP_WC &&
+                !pat_enabled() && !static_cpu_has(X86_FEATURE_PAT))) {
+            DRM_ERROR("type == I915_MAP_WC && !pat_enabled(), err = %d\n", -ENODEV);
              ptr = ERR_PTR(-ENODEV);
+        }
          else if (i915_gem_object_has_struct_page(obj))
              ptr = i915_gem_object_map_page(obj, type);
          else
              ptr = i915_gem_object_map_pfn(obj, type);
-        if (IS_ERR(ptr))
+        if (IS_ERR(ptr)) {
+            DRM_ERROR("IS_ERR(PTR) is true, returning a (ptr) error\n");
              goto err_unpin;
+        }

          obj->mm.mapping = page_pack_bits(ptr, type);
      }

As you can see, adding the static_cpu_has(X86_FEATURE_PAT)
function to the test for PAT restored the behavior of 5.16 on the
Xen hypervisor to 5.17, and that is how I discovered the solution
to this problem on 5.17 on my box.
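
The effect of the extra condition in the diff above can be modelled in
plain user-space C. This is a sketch, not kernel code: the kernel's
pat_enabled() and static_cpu_has(X86_FEATURE_PAT) are replaced by boolean
parameters, and the two function names are hypothetical.

```c
#include <stdbool.h>

/* Model of the 5.17 check: a WC request is rejected whenever
 * pat_enabled() reports false. */
bool wc_rejected_5_17(bool want_wc, bool pat_enabled)
{
	return want_wc && !pat_enabled;
}

/* Model of the patched check from the diff: a WC request is rejected
 * only if pat_enabled() is false AND the CPU feature bit is clear. */
bool wc_rejected_patched(bool want_wc, bool pat_enabled, bool cpu_has_pat)
{
	return want_wc && !pat_enabled && !cpu_has_pat;
}
```

For the case reported in this thread (pat_enabled() false while the CPU
feature bit is set), the first function rejects the WC mapping and the
second accepts it. When both inputs are false, the two checks agree.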

>> I think that is
>> just permitting a bad configuration to break the driver that a
>> well-written operating system should not allow. The i915 driver
>> was, in my opinion, correctly ignoring the nopat option in 5.16
>> because that option is not compatible with the hardware the
>> i915 driver is trying to initialize and setup at boot time. At least
>> that is my understanding now, but I will need to test it on 5.16
>> to be sure I understand it correctly.
>>
>> Also, AFAICT, your patch would break the driver when the nopat
>> option is set and only fix the regression introduced by bdd8b6c98239
>> when nopat is not set on my box, so your patch would
>> introduce a regression relative to Linux 5.16 and earlier for the
>> case when nopat is set on my box. I think your point would
>> be that it is not a regression if it is an incorrect user configuration.
> Again no - my view is that there's a separate, pre-existing issue
> in the driver which was uncovered by the change. This may be a
> perceived regression, but is imo different from a real one.

Maybe it is only a perceived regression if nopat is set, but
imo bdd8b6c98239 introduced a real regression in 5.17
relative to 5.16 for the correctly and identically configured
case when the nopat option is not set. That is why I still think
it should be reverted and the fix backported to 5.17 until the
regression for the case when nopat is not set is fixed. As I
said before, the i915 driver relies on the memory subsystem
to provide it with an accurate test for the x86 pat feature.
The test the driver used in bdd8b6c98239 gives the i915 driver
a false negative, and that caused a real regression when nopat
is not set. bdd8b6c98239 can be re-applied if we apply your
patch which corrects the false negative that pat_enabled() is
currently providing the i915 driver with. That false negative
from pat_enabled() is not an i915 bug, it is a bug in x86/pat.

Chuck

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [REGRESSION} Re: [PATCH 2/2] x86/pat: add functions to query specific cache mode availability
  2022-05-20 15:46                       ` Chuck Zmudzinski
@ 2022-05-20 17:13                         ` Chuck Zmudzinski
  -1 siblings, 0 replies; 80+ messages in thread
From: Chuck Zmudzinski @ 2022-05-20 17:13 UTC (permalink / raw)
  To: Jan Beulich, regressions, stable
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	H. Peter Anvin, Andy Lutomirski, Peter Zijlstra, Jani Nikula,
	Joonas Lahtinen, Rodrigo Vivi, Tvrtko Ursulin, David Airlie,
	Daniel Vetter, xen-devel, x86, linux-kernel, intel-gfx,
	dri-devel, Juergen Gross

I think this summary of the regression is appropriate for a top-post. 
Details follow below.

Commit bdd8b6c98239 introduced what I call a real regression, which
persists in 5.17.x.

Jan's proposed patch: 
https://lore.kernel.org/lkml/9385fa60-fa5d-f559-a137-6608408f88b0@suse.com/

Jan's patch would fix the real regression introduced by bdd8b6c98239 when
the nopat option is not enabled, but when the nopat option is enabled, this
patch would introduce what Jan calls a "perceived regression" that is really
caused by the failure of the i915 driver to properly handle the nopat option
being provided on the command line.

What I request: commit Jan's proposed patch, and backport it to 5.17. That
would fix the real regression and only cause a perceived regression for the
case when nopat is enabled. In that case, patches to the i915 driver
would be helpful but not necessary to fix a regression.

Regards,

Chuck Zmudzinski

On 5/20/2022 11:46 AM, Chuck Zmudzinski wrote:
> On 5/20/2022 10:06 AM, Jan Beulich wrote:
>> On 20.05.2022 15:33, Chuck Zmudzinski wrote:
>>> On 5/20/2022 5:41 AM, Jan Beulich wrote:
>>>> On 20.05.2022 10:30, Chuck Zmudzinski wrote:
>>>>> On 5/20/2022 2:59 AM, Chuck Zmudzinski wrote:
>>>>>> On 5/20/2022 2:05 AM, Jan Beulich wrote:
>>>>>>> On 20.05.2022 06:43, Chuck Zmudzinski wrote:
>>>>>>>> On 5/4/22 5:14 AM, Juergen Gross wrote:
>>>>>>>>> On 04.05.22 10:31, Jan Beulich wrote:
>>>>>>>>>> On 03.05.2022 15:22, Juergen Gross wrote:
>>>>>>>>>>
>>>>>>>>>> ... these uses there are several more. You say nothing on why
>>>>>>>>>> those want leaving unaltered. When preparing my earlier patch I
>>>>>>>>>> did inspect them and came to the conclusion that these all would
>>>>>>>>>> also better observe the adjusted behavior (or else I couldn't
>>>>>>>>>> have left pat_enabled() as the only predicate). In fact, as said
>>>>>>>>>> in the description of my earlier patch, in my debugging I did
>>>>>>>>>> find the use in i915_gem_object_pin_map() to be the problematic
>>>>>>>>>> one, which you leave alone.
>>>>>>>>> Oh, I missed that one, sorry.
>>>>>>>> That is why your patch would not fix my Haswell unless
>>>>>>>> it also touches i915_gem_object_pin_map() in
>>>>>>>> drivers/gpu/drm/i915/gem/i915_gem_pages.c
>>>>>>>>
>>>>>>>>> I wanted to be rather defensive in my changes, but I agree at
>>>>>>>>> least the case in arch_phys_wc_add() might want to be changed, too.
>>>>>>>> I think your approach needs to be more aggressive so it will fix
>>>>>>>> all the known false negatives introduced by bdd8b6c98239
>>>>>>>> such as the one in i915_gem_object_pin_map().
>>>>>>>>
>>>>>>>> I looked at Jan's approach and I think it would fix the issue
>>>>>>>> with my Haswell as long as I don't use the nopat option. I
>>>>>>>> really don't have a strong opinion on that question, but I
>>>>>>>> think the nopat option as a Linux kernel option, as opposed
>>>>>>>> to a hypervisor option, should only affect the kernel, and
>>>>>>>> if the hypervisor provides the pat feature, then the kernel
>>>>>>>> should not override that,
>>>>>>> Hmm, why would the kernel not be allowed to override that? Such
>>>>>>> an override would affect only the single domain where the
>>>>>>> kernel runs; other domains could take their own decisions.
>>>>>>>
>>>>>>> Also, for the sake of completeness: "nopat" used when running on
>>>>>>> bare metal has the same bad effect on system boot, so there
>>>>>>> pretty clearly is an error cleanup issue in the i915 driver. But
>>>>>>> that's orthogonal, and I expect the maintainers may not even care
>>>>>>> (but tell us "don't do that then").
>>>>> Actually I just did a test with the last official Debian kernel
>>>>> build of Linux 5.16, that is, a kernel before bdd8b6c98239 was
>>>>> applied. In fact, the nopat option does *not* break the i915 driver
>>>>> in 5.16. That is, with the nopat option, the i915 driver loads
>>>>> normally on both the bare metal and on the Xen hypervisor.
>>>>> That means your presumption (and the presumption of
>>>>> the author of bdd8b6c98239) that the "nopat" option was
>>>>> being observed by the i915 driver is incorrect. Setting "nopat"
>>>>> had no effect on my system with Linux 5.16. So after doing these
>>>>> tests, I am against the aggressive approach of breaking the i915
>>>>> driver with the "nopat" option because prior to bdd8b6c98239,
>>>>> nopat did not break the i915 driver. Why break it now?
>>>> Because that, in my understanding, is the purpose of "nopat"
>>>> (not breaking the driver of course - that's a driver bug -, but
>>>> having an effect on the driver).
>>> I wouldn't call it a driver bug, but an incorrect configuration of the
>>> kernel by the user.  I presume X86_FEATURE_PAT is required by the
>>> i915 driver
>> The driver ought to work fine without PAT (and hence without being
>> able to make WC mappings). It would use UC instead and be slow, but
>> it ought to work.
>
> I am not an expert, but I think the reason it failed on my box was
> because of the requirements of CI. Maybe the driver would fall back
> to UC if the add_taint_for_CI function did not halt the entire system
> in response to the failed test for PAT when trying to use WC mappings.
>
>>> and therefore the driver should refuse to disable
>>> it if the user requests to disable it and instead warn the user that
>>> the driver did not disable the feature, contrary to what the user
>>> requested with the nopat option.
>>>
>>> In any case, my test did not verify that when nopat is set in Linux
>>> 5.16, the thread takes the same code path as when nopat is not set,
>>> so I am not totally sure that the reason nopat does not break the
>>> i915 driver in 5.16 is that static_cpu_has(X86_FEATURE_PAT)
>>> returns true even when nopat is set. I could test it with a custom
>>> log message in 5.16 if that is necessary.
>>>
>>> Are you saying it was wrong for
>>> to return true in 5.16 when the user requests nopat?
>> No, I'm not saying that. It was wrong for this construct to be used
>> in the driver, which was fixed for 5.17 (and which had caused the
>> regression I did observe, leading to the patch as a hopefully least
>> bad option).
>
> Hmm, the patch I used to fix my box with 5.17.6 used
> static_cpu_has(X86_FEATURE_PAT) so the driver could
> continue to configure the hardware using WC. This is the
> relevant part of the patch I used to fix my box, which includes
> extra error logs, (against Debian's official build of 5.17.6):
>
> --- a/drivers/gpu/drm/i915/gem/i915_gem_pages.c    2022-05-09 03:16:33.000000000 -0400
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_pages.c    2022-05-19 15:55:40.339778818 -0400
> ...
> @@ -430,17 +434,23 @@
>          err = i915_gem_object_wait_moving_fence(obj, true);
>          if (err) {
>              ptr = ERR_PTR(err);
> +            DRM_ERROR("i915_gem_object_wait_moving_fence error, err = %d\n", err);
>              goto err_unpin;
>          }
>
> -        if (GEM_WARN_ON(type == I915_MAP_WC && !pat_enabled()))
> +        if (GEM_WARN_ON(type == I915_MAP_WC &&
> +                !pat_enabled() && !static_cpu_has(X86_FEATURE_PAT))) {
> +            DRM_ERROR("type == I915_MAP_WC && !pat_enabled(), err = %d\n", -ENODEV);
>              ptr = ERR_PTR(-ENODEV);
> +        }
>          else if (i915_gem_object_has_struct_page(obj))
>              ptr = i915_gem_object_map_page(obj, type);
>          else
>              ptr = i915_gem_object_map_pfn(obj, type);
> -        if (IS_ERR(ptr))
> +        if (IS_ERR(ptr)) {
> +            DRM_ERROR("IS_ERR(PTR) is true, returning a (ptr) error\n");
>              goto err_unpin;
> +        }
>
>          obj->mm.mapping = page_pack_bits(ptr, type);
>      }
>
> As you can see, adding the static_cpu_has(X86_FEATURE_PAT)
> function to the test for PAT restored the behavior of 5.16 on the
> Xen hypervisor to 5.17, and that is how I discovered the solution
> to this problem on 5.17 on my box.
>
>>> I think that is
>>> just permitting a bad configuration to break the driver that a
>>> well-written operating system should not allow. The i915 driver
>>> was, in my opinion, correctly ignoring the nopat option in 5.16
>>> because that option is not compatible with the hardware the
>>> i915 driver is trying to initialize and setup at boot time. At least
>>> that is my understanding now, but I will need to test it on 5.16
>>> to be sure I understand it correctly.
>>>
>>> Also, AFAICT, your patch would break the driver when the nopat
>>> option is set and only fix the regression introduced by bdd8b6c98239
>>> when nopat is not set on my box, so your patch would
>>> introduce a regression relative to Linux 5.16 and earlier for the
>>> case when nopat is set on my box. I think your point would
>>> be that it is not a regression if it is an incorrect user 
>>> configuration.
>> Again no - my view is that there's a separate, pre-existing issue
>> in the driver which was uncovered by the change. This may be a
>> perceived regression, but is imo different from a real one.
>
> Maybe it is only a perceived regression if nopat is set, but
> imo bdd8b6c98239 introduced a real regression in 5.17
> relative to 5.16 for the correctly and identically configured
> case when the nopat option is not set. That is why I still think
> it should be reverted and the fix backported to 5.17 until the
> regression for the case when nopat is not set is fixed. As I
> said before, the i915 driver relies on the memory subsystem
> to provide it with an accurate test for the x86 pat feature.
> The test the driver used in bdd8b6c98239 gives the i915 driver
> a false negative, and that caused a real regression when nopat
> is not set. bdd8b6c98239 can be re-applied if we apply your
> patch which corrects the false negative that pat_enabled() is
> currently providing the i915 driver with. That false negative
> from pat_enabled() is not an i915 bug, it is a bug in x86/pat.
>
> Chuck
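
For reference, the changed condition in the diff quoted above can be
modelled outside the kernel. This is a rough sketch under stated
assumptions: enum map_type, wc_check_5_17() and wc_check_workaround()
are hypothetical names, and the two booleans stand in for the results of
pat_enabled() and static_cpu_has(X86_FEATURE_PAT).

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the driver's mapping types. */
enum map_type { MAP_WB, MAP_WC };

/* Check as in unpatched 5.17: refuse WC whenever pat_enabled() is false. */
static int wc_check_5_17(enum map_type type, bool pat_on, bool cpu_pat)
{
	(void)cpu_pat;	/* not consulted by the original check */
	return (type == MAP_WC && !pat_on) ? -ENODEV : 0;
}

/* Check as in the quoted diff: refuse WC only if both tests fail. */
static int wc_check_workaround(enum map_type type, bool pat_on, bool cpu_pat)
{
	return (type == MAP_WC && !pat_on && !cpu_pat) ? -ENODEV : 0;
}
```

With pat_on false and cpu_pat true (the false-negative case reported on
Xen), the first helper returns -ENODEV and the second returns 0, which
matches the behavior change described for the workaround.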


^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [REGRESSION} Re: [PATCH 2/2] x86/pat: add functions to query specific cache mode availability
  2022-05-20 17:13                         ` Chuck Zmudzinski
@ 2022-05-20 17:17                           ` Chuck Zmudzinski
  -1 siblings, 0 replies; 80+ messages in thread
From: Chuck Zmudzinski @ 2022-05-20 17:17 UTC (permalink / raw)
  To: Jan Beulich, regressions, stable
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	H. Peter Anvin, Andy Lutomirski, Peter Zijlstra, Jani Nikula,
	Joonas Lahtinen, Rodrigo Vivi, Tvrtko Ursulin, David Airlie,
	Daniel Vetter, xen-devel, x86, linux-kernel, intel-gfx,
	dri-devel, Juergen Gross

On 5/20/2022 1:13 PM, Chuck Zmudzinski wrote:
> I think this summary of the regression is appropriate for a top-post. 
> Details follow below.
>
> Commit bdd8b6c98239 introduced what I call a real regression, which
> persists in 5.17.x.
>
> Jan's proposed patch:
> https://lore.kernel.org/lkml/9385fa60-fa5d-f559-a137-6608408f88b0@suse.com/
>
> Jan's patch would fix the real regression introduced by bdd8b6c98239 when
> the nopat option is not enabled, but when the nopat option is enabled, this
> patch would introduce what Jan calls a "perceived regression" that is really
> caused by the failure of the i915 driver to properly handle the nopat option
> being provided on the command line.
>
> What I request: commit Jan's proposed patch, and backport it to 5.17. That
> would fix the real regression and only cause a perceived regression for the
> case when nopat is enabled. In that case, patches to the i915 driver
> would be helpful but necessary to fix a regression.

Sorry again, I mean patches to i915 would be helpful but *not* necessary
to fix a regression.

Regards,

Chuck Zmudzinski

>
> On 5/20/2022 11:46 AM, Chuck Zmudzinski wrote:
>> On 5/20/2022 10:06 AM, Jan Beulich wrote:
>>> On 20.05.2022 15:33, Chuck Zmudzinski wrote:
>>>> On 5/20/2022 5:41 AM, Jan Beulich wrote:
>>>>> On 20.05.2022 10:30, Chuck Zmudzinski wrote:
>>>>>> On 5/20/2022 2:59 AM, Chuck Zmudzinski wrote:
>>>>>>> On 5/20/2022 2:05 AM, Jan Beulich wrote:
>>>>>>>> On 20.05.2022 06:43, Chuck Zmudzinski wrote:
>>>>>>>>> On 5/4/22 5:14 AM, Juergen Gross wrote:
>>>>>>>>>> On 04.05.22 10:31, Jan Beulich wrote:
>>>>>>>>>>> On 03.05.2022 15:22, Juergen Gross wrote:
>>>>>>>>>>>
>>>>>>>>>>> ... these uses there are several more. You say nothing on why
>>>>>>>>>>> those want leaving unaltered. When preparing my earlier patch I
>>>>>>>>>>> did inspect them and came to the conclusion that these all would
>>>>>>>>>>> also better observe the adjusted behavior (or else I couldn't
>>>>>>>>>>> have left pat_enabled() as the only predicate). In fact, as said
>>>>>>>>>>> in the description of my earlier patch, in my debugging I did
>>>>>>>>>>> find the use in i915_gem_object_pin_map() to be the problematic
>>>>>>>>>>> one, which you leave alone.
>>>>>>>>>> Oh, I missed that one, sorry.
>>>>>>>>> That is why your patch would not fix my Haswell unless
>>>>>>>>> it also touches i915_gem_object_pin_map() in
>>>>>>>>> drivers/gpu/drm/i915/gem/i915_gem_pages.c
>>>>>>>>>
>>>>>>>>>> I wanted to be rather defensive in my changes, but I agree at
>>>>>>>>>> least the case in arch_phys_wc_add() might want to be changed, too.
>>>>>>>>> I think your approach needs to be more aggressive so it will fix
>>>>>>>>> all the known false negatives introduced by bdd8b6c98239
>>>>>>>>> such as the one in i915_gem_object_pin_map().
>>>>>>>>>
>>>>>>>>> I looked at Jan's approach and I think it would fix the issue
>>>>>>>>> with my Haswell as long as I don't use the nopat option. I
>>>>>>>>> really don't have a strong opinion on that question, but I
>>>>>>>>> think the nopat option as a Linux kernel option, as opposed
>>>>>>>>> to a hypervisor option, should only affect the kernel, and
>>>>>>>>> if the hypervisor provides the pat feature, then the kernel
>>>>>>>>> should not override that,
>>>>>>>> Hmm, why would the kernel not be allowed to override that? Such
>>>>>>>> an override would affect only the single domain where the
>>>>>>>> kernel runs; other domains could take their own decisions.
>>>>>>>>
>>>>>>>> Also, for the sake of completeness: "nopat" used when running on
>>>>>>>> bare metal has the same bad effect on system boot, so there
>>>>>>>> pretty clearly is an error cleanup issue in the i915 driver. But
>>>>>>>> that's orthogonal, and I expect the maintainers may not even care
>>>>>>>> (but tell us "don't do that then").
>>>>>> Actually I just did a test with the last official Debian kernel
>>>>>> build of Linux 5.16, that is, a kernel before bdd8b6c98239 was
>>>>>> applied. In fact, the nopat option does *not* break the i915 driver
>>>>>> in 5.16. That is, with the nopat option, the i915 driver loads
>>>>>> normally on both the bare metal and on the Xen hypervisor.
>>>>>> That means your presumption (and the presumption of
>>>>>> the author of bdd8b6c98239) that the "nopat" option was
>>>>>> being observed by the i915 driver is incorrect. Setting "nopat"
>>>>>> had no effect on my system with Linux 5.16. So after doing these
>>>>>> tests, I am against the aggressive approach of breaking the i915
>>>>>> driver with the "nopat" option because prior to bdd8b6c98239,
>>>>>> nopat did not break the i915 driver. Why break it now?
>>>>> Because that, in my understanding, is the purpose of "nopat"
>>>>> (not breaking the driver of course - that's a driver bug -, but
>>>>> having an effect on the driver).
>>>> I wouldn't call it a driver bug, but an incorrect configuration of the
>>>> kernel by the user.  I presume X86_FEATURE_PAT is required by the
>>>> i915 driver
>>> The driver ought to work fine without PAT (and hence without being
>>> able to make WC mappings). It would use UC instead and be slow, but
>>> it ought to work.
>>
>> I am not an expert, but I think the reason it failed on my box was
>> because of the requirements of CI. Maybe the driver would fall back
>> to UC if the add_taint_for_CI function did not halt the entire system
>> in response to the failed test for PAT when trying to use WC mappings.
>>
>>>> and therefore the driver should refuse to disable
>>>> it if the user requests to disable it and instead warn the user that
>>>> the driver did not disable the feature, contrary to what the user
>>>> requested with the nopat option.
>>>>
>>>> In any case, my test did not verify that when nopat is set in Linux
>>>> 5.16, the thread takes the same code path as when nopat is not set,
>>>> so I am not totally sure that the reason nopat does not break the
>>>> i915 driver in 5.16 is that static_cpu_has(X86_FEATURE_PAT)
>>>> returns true even when nopat is set. I could test it with a custom
>>>> log message in 5.16 if that is necessary.
>>>>
>>>> Are you saying it was wrong for
>>>> to return true in 5.16 when the user requests nopat?
>>> No, I'm not saying that. It was wrong for this construct to be used
>>> in the driver, which was fixed for 5.17 (and which had caused the
>>> regression I did observe, leading to the patch as a hopefully least
>>> bad option).
>>
>> Hmm, the patch I used to fix my box with 5.17.6 used
>> static_cpu_has(X86_FEATURE_PAT) so the driver could
>> continue to configure the hardware using WC. This is the
>> relevant part of the patch I used to fix my box, which includes
>> extra error logs, (against Debian's official build of 5.17.6):
>>
>> --- a/drivers/gpu/drm/i915/gem/i915_gem_pages.c    2022-05-09 03:16:33.000000000 -0400
>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_pages.c    2022-05-19 15:55:40.339778818 -0400
>> ...
>> @@ -430,17 +434,23 @@
>>          err = i915_gem_object_wait_moving_fence(obj, true);
>>          if (err) {
>>              ptr = ERR_PTR(err);
>> +            DRM_ERROR("i915_gem_object_wait_moving_fence error, err = %d\n", err);
>>              goto err_unpin;
>>          }
>>
>> -        if (GEM_WARN_ON(type == I915_MAP_WC && !pat_enabled()))
>> +        if (GEM_WARN_ON(type == I915_MAP_WC &&
>> +                !pat_enabled() && !static_cpu_has(X86_FEATURE_PAT))) {
>> +            DRM_ERROR("type == I915_MAP_WC && !pat_enabled(), err = %d\n", -ENODEV);
>>              ptr = ERR_PTR(-ENODEV);
>> +        }
>>          else if (i915_gem_object_has_struct_page(obj))
>>              ptr = i915_gem_object_map_page(obj, type);
>>          else
>>              ptr = i915_gem_object_map_pfn(obj, type);
>> -        if (IS_ERR(ptr))
>> +        if (IS_ERR(ptr)) {
>> +            DRM_ERROR("IS_ERR(PTR) is true, returning a (ptr) error\n");
>>              goto err_unpin;
>> +        }
>>
>>          obj->mm.mapping = page_pack_bits(ptr, type);
>>      }
>>
>> As you can see, adding the static_cpu_has(X86_FEATURE_PAT)
>> function to the test for PAT restored the behavior of 5.16 on the
>> Xen hypervisor to 5.17, and that is how I discovered the solution
>> to this problem on 5.17 on my box.
>>
>>>> I think that is
>>>> just permitting a bad configuration to break the driver that a
>>>> well-written operating system should not allow. The i915 driver
>>>> was, in my opinion, correctly ignoring the nopat option in 5.16
>>>> because that option is not compatible with the hardware the
>>>> i915 driver is trying to initialize and setup at boot time. At least
>>>> that is my understanding now, but I will need to test it on 5.16
>>>> to be sure I understand it correctly.
>>>>
>>>> Also, AFAICT, your patch would break the driver when the nopat
>>>> option is set and only fix the regression introduced by bdd8b6c98239
>>>> when nopat is not set on my box, so your patch would
>>>> introduce a regression relative to Linux 5.16 and earlier for the
>>>> case when nopat is set on my box. I think your point would
>>>> be that it is not a regression if it is an incorrect user 
>>>> configuration.
>>> Again no - my view is that there's a separate, pre-existing issue
>>> in the driver which was uncovered by the change. This may be a
>>> perceived regression, but is imo different from a real one.
>>
>> Maybe it is only a perceived regression if nopat is set, but
>> imo bdd8b6c98239 introduced a real regression in 5.17
>> relative to 5.16 for the correctly and identically configured
>> case when the nopat option is not set. That is why I still think
>> it should be reverted and the fix backported to 5.17 until the
>> regression for the case when nopat is not set is fixed. As I
>> said before, the i915 driver relies on the memory subsystem
>> to provide it with an accurate test for the x86 pat feature.
>> The test the driver used in bdd8b6c98239 gives the i915 driver
>> a false negative, and that caused a real regression when nopat
>> is not set. bdd8b6c98239 can be re-applied if we apply your
>> patch which corrects the false negative that pat_enabled() is
>> currently providing the i915 driver with. That false negative
>> from pat_enabled() is not an i915 bug, it is a bug in x86/pat.
>>
>> Chuck
>


^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [REGRESSION} Re: [PATCH 2/2] x86/pat: add functions to query specific cache mode availability
@ 2022-05-20 17:17                           ` Chuck Zmudzinski
  0 siblings, 0 replies; 80+ messages in thread
From: Chuck Zmudzinski @ 2022-05-20 17:17 UTC (permalink / raw)
  To: Jan Beulich, regressions, stable
  Cc: Juergen Gross, Tvrtko Ursulin, Peter Zijlstra, intel-gfx,
	Dave Hansen, x86, linux-kernel, David Airlie, Rodrigo Vivi,
	Ingo Molnar, Borislav Petkov, dri-devel, Andy Lutomirski,
	H. Peter Anvin, xen-devel, Thomas Gleixner

On 5/20/2022 1:13 PM, Chuck Zmudzinski wrote:
> I think this summary of the regression is appropriate for a top-post. 
> Details follow below.
>
> Commit bdd8b6c98239 introduced what I call a real regression, which
> persists in 5.17.x.
>
> Jan's proposed patch:
> https://lore.kernel.org/lkml/9385fa60-fa5d-f559-a137-6608408f88b0@suse.com/
>
> Jan's patch would fix the real regression introduced by bdd8b6c98239 when
> the nopat option is not enabled, but when the nopat option is enabled, this
> patch would introduce what Jan calls a "perceived regression" that is really
> caused by the failure of the i915 driver to properly handle the nopat option
> being provided on the command line.
>
> What I request: commit Jan's proposed patch, and backport it to 5.17. That
> would fix the real regression and only cause a perceived regression for the
> case when nopat is enabled. In that case, patches to the i915 driver
> would be helpful but necessary to fix a regression.

Sorry again, I mean patches to i915 would be helpful but *not* necessary
to fix a regression.

Regards,

Chuck Zmudzinski

>
> On 5/20/2022 11:46 AM, Chuck Zmudzinski wrote:
>> On 5/20/2022 10:06 AM, Jan Beulich wrote:
>>> On 20.05.2022 15:33, Chuck Zmudzinski wrote:
>>>> On 5/20/2022 5:41 AM, Jan Beulich wrote:
>>>>> On 20.05.2022 10:30, Chuck Zmudzinski wrote:
>>>>>> On 5/20/2022 2:59 AM, Chuck Zmudzinski wrote:
>>>>>>> On 5/20/2022 2:05 AM, Jan Beulich wrote:
>>>>>>>> On 20.05.2022 06:43, Chuck Zmudzinski wrote:
>>>>>>>>> On 5/4/22 5:14 AM, Juergen Gross wrote:
>>>>>>>>>> On 04.05.22 10:31, Jan Beulich wrote:
>>>>>>>>>>> On 03.05.2022 15:22, Juergen Gross wrote:
>>>>>>>>>>>
>>>>>>>>>>> ... these uses there are several more. You say nothing on why
>>>>>>>>>>> those want leaving unaltered. When preparing my earlier patch I did
>>>>>>>>>>> inspect them and came to the conclusion that these all would also
>>>>>>>>>>> better observe the adjusted behavior (or else I couldn't have left
>>>>>>>>>>> pat_enabled() as the only predicate). In fact, as said in the
>>>>>>>>>>> description of my earlier patch, in my debugging I did find the use
>>>>>>>>>>> in i915_gem_object_pin_map() to be the problematic one, which you
>>>>>>>>>>> leave alone.
>>>>>>>>>> Oh, I missed that one, sorry.
>>>>>>>>> That is why your patch would not fix my Haswell unless
>>>>>>>>> it also touches i915_gem_object_pin_map() in
>>>>>>>>> drivers/gpu/drm/i915/gem/i915_gem_pages.c
>>>>>>>>>
>>>>>>>>>> I wanted to be rather defensive in my changes, but I agree at least
>>>>>>>>>> the case in arch_phys_wc_add() might want to be changed, too.
>>>>>>>>> I think your approach needs to be more aggressive so it will fix
>>>>>>>>> all the known false negatives introduced by bdd8b6c98239
>>>>>>>>> such as the one in i915_gem_object_pin_map().
>>>>>>>>>
>>>>>>>>> I looked at Jan's approach and I think it would fix the issue
>>>>>>>>> with my Haswell as long as I don't use the nopat option. I
>>>>>>>>> really don't have a strong opinion on that question, but I
>>>>>>>>> think the nopat option as a Linux kernel option, as opposed
>>>>>>>>> to a hypervisor option, should only affect the kernel, and
>>>>>>>>> if the hypervisor provides the pat feature, then the kernel
>>>>>>>>> should not override that,
>>>>>>>> Hmm, why would the kernel not be allowed to override that? Such
>>>>>>>> an override would affect only the single domain where the
>>>>>>>> kernel runs; other domains could take their own decisions.
>>>>>>>>
>>>>>>>> Also, for the sake of completeness: "nopat" used when running on
>>>>>>>> bare metal has the same bad effect on system boot, so there
>>>>>>>> pretty clearly is an error cleanup issue in the i915 driver. But
>>>>>>>> that's orthogonal, and I expect the maintainers may not even care
>>>>>>>> (but tell us "don't do that then").
>>>>>> Actually I just did a test with the last official Debian kernel
>>>>>> build of Linux 5.16, that is, a kernel before bdd8b6c98239 was
>>>>>> applied. In fact, the nopat option does *not* break the i915 driver
>>>>>> in 5.16. That is, with the nopat option, the i915 driver loads
>>>>>> normally on both the bare metal and on the Xen hypervisor.
>>>>>> That means your presumption (and the presumption of
>>>>>> the author of bdd8b6c98239) that the "nopat" option was
>>>>>> being observed by the i915 driver is incorrect. Setting "nopat"
>>>>>> had no effect on my system with Linux 5.16. So after doing these
>>>>>> tests, I am against the aggressive approach of breaking the i915
>>>>>> driver with the "nopat" option because prior to bdd8b6c98239,
>>>>>> nopat did not break the i915 driver. Why break it now?
>>>>> Because that, in my understanding, is the purpose of "nopat"
>>>>> (not breaking the driver of course - that's a driver bug -, but
>>>>> having an effect on the driver).
>>>> I wouldn't call it a driver bug, but an incorrect configuration of the
>>>> kernel by the user.  I presume X86_FEATURE_PAT is required by the
>>>> i915 driver
>>> The driver ought to work fine without PAT (and hence without being
>>> able to make WC mappings). It would use UC instead and be slow, but
>>> it ought to work.
>>
>> I am not an expert, but I think the reason it failed on my box was
>> because of the requirements of CI. Maybe the driver would fall back
>> to UC if the add_taint_for_CI function did not halt the entire system
>> in response to the failed test for PAT when trying to use WC mappings.
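
The fallback Jan describes above could look roughly like the following. This is only a sketch of the idea, not what i915 does: the enum, pick_map_type() and the wc_available flag are hypothetical names made up for illustration. According to the hunk quoted further down in this mail, the 5.17 driver returns -ENODEV for a WC request when pat_enabled() is false rather than downgrading the request.

```c
#include <stdbool.h>

/* Hypothetical mapping types, loosely modelled on the I915_MAP_* idea. */
enum map_type { MAP_WB, MAP_WC, MAP_UC };

/*
 * Downgrade a write-combining request to uncached when WC mappings are
 * not available; every other request passes through unchanged.
 * wc_available is an assumed input, standing in for whatever test the
 * memory subsystem provides.
 */
enum map_type pick_map_type(enum map_type requested, bool wc_available)
{
	if (requested == MAP_WC && !wc_available)
		return MAP_UC;	/* slower, but still a usable mapping */
	return requested;
}
```

Under this sketch a caller asking for MAP_WC without WC support gets MAP_UC back instead of an error, which is the "slow, but it ought to work" behavior.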
>>
>>>> and therefore the driver should refuse to disable
>>>> it if the user requests to disable it and instead warn the user that
>>>> the driver did not disable the feature, contrary to what the user
>>>> requested with the nopat option.
>>>>
>>>> In any case, my test did not verify that when nopat is set in Linux 5.16,
>>>> the thread takes the same code path as when nopat is not set,
>>>> so I am not totally sure that the reason nopat does not break the
>>>> i915 driver in 5.16 is that static_cpu_has(X86_FEATURE_PAT)
>>>> returns true even when nopat is set. I could test it with a custom
>>>> log message in 5.16 if that is necessary.
>>>>
>>>> Are you saying it was wrong for static_cpu_has(X86_FEATURE_PAT)
>>>> to return true in 5.16 when the user requests nopat?
>>> No, I'm not saying that. It was wrong for this construct to be used
>>> in the driver, which was fixed for 5.17 (and which had caused the
>>> regression I did observe, leading to the patch as a hopefully least
>>> bad option).
>>
>> Hmm, the patch I used to fix my box with 5.17.6 used
>> static_cpu_has(X86_FEATURE_PAT) so the driver could
>> continue to configure the hardware using WC. This is the
>> relevant part of the patch I used to fix my box, which includes
>> extra error logs, (against Debian's official build of 5.17.6):
>>
>> --- a/drivers/gpu/drm/i915/gem/i915_gem_pages.c    2022-05-09 03:16:33.000000000 -0400
>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_pages.c    2022-05-19 15:55:40.339778818 -0400
>> ...
>> @@ -430,17 +434,23 @@
>>          err = i915_gem_object_wait_moving_fence(obj, true);
>>          if (err) {
>>              ptr = ERR_PTR(err);
>> +            DRM_ERROR("i915_gem_object_wait_moving_fence error, err = %d\n", err);
>>              goto err_unpin;
>>          }
>>
>> -        if (GEM_WARN_ON(type == I915_MAP_WC && !pat_enabled()))
>> +        if (GEM_WARN_ON(type == I915_MAP_WC &&
>> +                !pat_enabled() && !static_cpu_has(X86_FEATURE_PAT))) {
>> +            DRM_ERROR("type == I915_MAP_WC && !pat_enabled(), err = %d\n", -ENODEV);
>>              ptr = ERR_PTR(-ENODEV);
>> +        }
>>          else if (i915_gem_object_has_struct_page(obj))
>>              ptr = i915_gem_object_map_page(obj, type);
>>          else
>>              ptr = i915_gem_object_map_pfn(obj, type);
>> -        if (IS_ERR(ptr))
>> +        if (IS_ERR(ptr)) {
>> +            DRM_ERROR("IS_ERR(PTR) is true, returning a (ptr) error\n");
>>              goto err_unpin;
>> +        }
>>
>>          obj->mm.mapping = page_pack_bits(ptr, type);
>>      }
>>
>> As you can see, adding the static_cpu_has(X86_FEATURE_PAT)
>> function to the test for PAT restored the behavior of 5.16 on the
>> Xen hypervisor to 5.17, and that is how I discovered the solution
>> to this problem on 5.17 on my box.
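
To make the difference between the two checks in the hunk above concrete, here is a minimal user-space sketch (not kernel code). The two booleans stand in for the results of pat_enabled() and static_cpu_has(X86_FEATURE_PAT); the helper names wc_rejected_5_17() and wc_rejected_patched() are made up for illustration. Which values the real predicates return under Xen or with nopat is exactly what is being discussed in this thread, so they are treated as plain inputs here.

```c
#include <stdbool.h>

/*
 * Check as in the removed line of the hunk: a WC mapping is refused
 * whenever the pat_enabled() stand-in is false. The CPU feature bit is
 * accepted as a parameter only to keep both helpers comparable.
 */
bool wc_rejected_5_17(bool pat_enabled_val, bool cpu_has_pat)
{
	(void)cpu_has_pat;	/* not consulted by this check */
	return !pat_enabled_val;
}

/*
 * Check as in the added lines of the hunk: a WC mapping is refused only
 * when both tests fail.
 */
bool wc_rejected_patched(bool pat_enabled_val, bool cpu_has_pat)
{
	return !pat_enabled_val && !cpu_has_pat;
}
```

The two helpers differ for exactly one input combination: pat_enabled() false with the feature bit set, which is the false-negative case reported in this thread. There the first check refuses WC and the second allows it.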
>>
>>>> I think that is
>>>> just permitting a bad configuration to break the driver that a
>>>> well-written operating system should not allow. The i915 driver
>>>> was, in my opinion, correctly ignoring the nopat option in 5.16
>>>> because that option is not compatible with the hardware the
>>>> i915 driver is trying to initialize and set up at boot time. At least
>>>> that is my understanding now, but I will need to test it on 5.16
>>>> to be sure I understand it correctly.
>>>>
>>>> Also, AFAICT, your patch would break the driver when the nopat
>>>> option is set and only fix the regression introduced by bdd8b6c98239
>>>> when nopat is not set on my box, so your patch would
>>>> introduce a regression relative to Linux 5.16 and earlier for the
>>>> case when nopat is set on my box. I think your point would
>>>> be that it is not a regression if it is an incorrect user configuration.
>>> Again no - my view is that there's a separate, pre-existing issue
>>> in the driver which was uncovered by the change. This may be a
>>> perceived regression, but is imo different from a real one.
>>
>> Maybe it is only a perceived regression if nopat is set, but
>> imo bdd8b6c98239 introduced a real regression in 5.17
>> relative to 5.16 for the correctly and identically configured
>> case when the nopat option is not set. That is why I still think
>> it should be reverted and the fix backported to 5.17 until the
>> regression for the case when nopat is not set is fixed. As I
>> said before, the i915 driver relies on the memory subsystem
>> to provide it with an accurate test for the x86 pat feature.
>> The test the driver used in bdd8b6c98239 gives the i915 driver
>> a false negative, and that caused a real regression when nopat
>> is not set. bdd8b6c98239 can be re-applied if we apply your
>> patch which corrects the false negative that pat_enabled() is
>> currently providing the i915 driver with. That false negative
>> from pat_enabled() is not an i915 bug, it is a bug in x86/pat.
>>
>> Chuck
>


^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH 2/2] x86/pat: add functions to query specific cache mode availability
  2022-05-20 14:48                       ` Chuck Zmudzinski
  (?)
@ 2022-05-21 10:47                         ` Thorsten Leemhuis
  -1 siblings, 0 replies; 80+ messages in thread
From: Thorsten Leemhuis @ 2022-05-21 10:47 UTC (permalink / raw)
  To: Chuck Zmudzinski, Jan Beulich, regressions, stable
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	H. Peter Anvin, Andy Lutomirski, Peter Zijlstra, Jani Nikula,
	Joonas Lahtinen, Rodrigo Vivi, Tvrtko Ursulin, David Airlie,
	Daniel Vetter, xen-devel, x86, linux-kernel, intel-gfx,
	dri-devel, Juergen Gross

On 20.05.22 16:48, Chuck Zmudzinski wrote:
> On 5/20/2022 10:06 AM, Jan Beulich wrote:
>> On 20.05.2022 15:33, Chuck Zmudzinski wrote:
>>> On 5/20/2022 5:41 AM, Jan Beulich wrote:
>>>> On 20.05.2022 10:30, Chuck Zmudzinski wrote:
>>>>> On 5/20/2022 2:59 AM, Chuck Zmudzinski wrote:
>>>>>> On 5/20/2022 2:05 AM, Jan Beulich wrote:
>>>>>>> On 20.05.2022 06:43, Chuck Zmudzinski wrote:
>>>>>>>> On 5/4/22 5:14 AM, Juergen Gross wrote:
>>>>>>>>> On 04.05.22 10:31, Jan Beulich wrote:
>>>>>>>>>> On 03.05.2022 15:22, Juergen Gross wrote:
>>>>>>>>>>
>>>>>>>>>> ... these uses there are several more. You say nothing on why
>>>>>>>>>> those want leaving unaltered. When preparing my earlier patch I did
>>>>>>>>>> inspect them and came to the conclusion that these all would also
>>>>>>>>>> better observe the adjusted behavior (or else I couldn't have left
>>>>>>>>>> pat_enabled() as the only predicate). In fact, as said in the
>>>>>>>>>> description of my earlier patch, in my debugging I did find the use
>>>>>>>>>> in i915_gem_object_pin_map() to be the problematic one, which you
>>>>>>>>>> leave alone.
>>>>>>>>> Oh, I missed that one, sorry.
>>>>>>>> That is why your patch would not fix my Haswell unless
>>>>>>>> it also touches i915_gem_object_pin_map() in
>>>>>>>> drivers/gpu/drm/i915/gem/i915_gem_pages.c
>>>>>>>>
>>>>>>>>> I wanted to be rather defensive in my changes, but I agree at least
>>>>>>>>> the case in arch_phys_wc_add() might want to be changed, too.
>>>>>>>> I think your approach needs to be more aggressive so it will fix
>>>>>>>> all the known false negatives introduced by bdd8b6c98239
>>>>>>>> such as the one in i915_gem_object_pin_map().
>>>>>>>>
>>>>>>>> I looked at Jan's approach and I think it would fix the issue
>>>>>>>> with my Haswell as long as I don't use the nopat option. I
>>>>>>>> really don't have a strong opinion on that question, but I
>>>>>>>> think the nopat option as a Linux kernel option, as opposed
>>>>>>>> to a hypervisor option, should only affect the kernel, and
>>>>>>>> if the hypervisor provides the pat feature, then the kernel
>>>>>>>> should not override that,
>>>>>>> Hmm, why would the kernel not be allowed to override that? Such
>>>>>>> an override would affect only the single domain where the
>>>>>>> kernel runs; other domains could take their own decisions.
>>>>>>>
>>>>>>> Also, for the sake of completeness: "nopat" used when running on
>>>>>>> bare metal has the same bad effect on system boot, so there
>>>>>>> pretty clearly is an error cleanup issue in the i915 driver. But
>>>>>>> that's orthogonal, and I expect the maintainers may not even care
>>>>>>> (but tell us "don't do that then").
>>>>> Actually I just did a test with the last official Debian kernel
>>>>> build of Linux 5.16, that is, a kernel before bdd8b6c98239 was
>>>>> applied. In fact, the nopat option does *not* break the i915 driver
>>>>> in 5.16. That is, with the nopat option, the i915 driver loads
>>>>> normally on both the bare metal and on the Xen hypervisor.
>>>>> That means your presumption (and the presumption of
>>>>> the author of bdd8b6c98239) that the "nopat" option was
>>>>> being observed by the i915 driver is incorrect. Setting "nopat"
>>>>> had no effect on my system with Linux 5.16. So after doing these
>>>>> tests, I am against the aggressive approach of breaking the i915
>>>>> driver with the "nopat" option because prior to bdd8b6c98239,
>>>>> nopat did not break the i915 driver. Why break it now?
>>>> Because that, in my understanding, is the purpose of "nopat"
>>>> (not breaking the driver of course - that's a driver bug -, but
>>>> having an effect on the driver).
>>> I wouldn't call it a driver bug, but an incorrect configuration of the
>>> kernel by the user.  I presume X86_FEATURE_PAT is required by the
>>> i915 driver
>> The driver ought to work fine without PAT (and hence without being
>> able to make WC mappings). It would use UC instead and be slow, but
>> it ought to work.
>>
>>> and therefore the driver should refuse to disable
>>> it if the user requests to disable it and instead warn the user that
>>> the driver did not disable the feature, contrary to what the user
>>> requested with the nopat option.
>>>
>>> In any case, my test did not verify that when nopat is set in Linux 5.16,
>>> the thread takes the same code path as when nopat is not set,
>>> so I am not totally sure that the reason nopat does not break the
>>> i915 driver in 5.16 is that static_cpu_has(X86_FEATURE_PAT)
>>> returns true even when nopat is set. I could test it with a custom
>>> log message in 5.16 if that is necessary.
>>>
>>> Are you saying it was wrong for static_cpu_has(X86_FEATURE_PAT)
>>> to return true in 5.16 when the user requests nopat?
>> No, I'm not saying that. It was wrong for this construct to be used
>> in the driver, which was fixed for 5.17 (and which had caused the
>> regression I did observe, leading to the patch as a hopefully least
>> bad option).
>>
>>> I think that is
>>> just permitting a bad configuration to break the driver that a
>>> well-written operating system should not allow. The i915 driver
>>> was, in my opinion, correctly ignoring the nopat option in 5.16
>>> because that option is not compatible with the hardware the
>>> i915 driver is trying to initialize and set up at boot time. At least
>>> that is my understanding now, but I will need to test it on 5.16
>>> to be sure I understand it correctly.
>>>
>>> Also, AFAICT, your patch would break the driver when the nopat
>>> option is set and only fix the regression introduced by bdd8b6c98239
>>> when nopat is not set on my box, so your patch would
>>> introduce a regression relative to Linux 5.16 and earlier for the
>>> case when nopat is set on my box. I think your point would
>>> be that it is not a regression if it is an incorrect user configuration.
>> Again no - my view is that there's a separate, pre-existing issue
>> in the driver which was uncovered by the change. This may be a
>> perceived regression, but is imo different from a real one.

Sorry, for you maybe, but I'm pretty sure for Linus it's not when it
comes to the "no regressions rule". Just took a quick look at quotes
from Linus
https://www.kernel.org/doc/html/latest/process/handling-regressions.html
and found this statement from Linus to back this up:

```
One _particularly_ last-minute revert is the top-most commit (ignoring
the version change itself) done just before the release, and while
it's very annoying, it's perhaps also instructive.

What's instructive about it is that I reverted a commit that wasn't
actually buggy. In fact, it was doing exactly what it set out to do,
and did it very well. In fact it did it _so_ well that the much
improved IO patterns it caused then ended up revealing a user-visible
regression due to a real bug in a completely unrelated area.
```

He said that here:
https://www.kernel.org/doc/html/latest/process/handling-regressions.html

The situation is of course different here, but similar enough.

> Since it is a regression, I think for now bdd8b6c98239 should
> be reverted and the fix backported to Linux 5.17 stable until
> the underlying memory subsystem can provide the i915 driver
> with an updated test for the PAT feature that also meets the
> requirements of the author of bdd8b6c98239 without breaking
> the i915 driver.

I'm not a developer and I don't know the details of this thread and
the backstory of the regression, but it sounds like that's the approach
that is needed here until someone comes up with a fix for the regression
exposed by bdd8b6c98239.

But if I'm wrong, please tell me.

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)

P.S.: As the Linux kernel's regression tracker I deal with a lot of
reports and sometimes miss something important when writing mails like
this. If that's the case here, don't hesitate to tell me in a public
reply, it's in everyone's interest to set the public record straight.

> The i915 driver relies on the memory subsystem
> to provide it with an accurate test for the existence of
> X86_FEATURE_PAT. I think your patch provides that more accurate
> test so that bdd8b6c98239 could be re-applied when your patch is
> committed. Juergen's patch would have to touch bdd8b6c98239
> with new functions that probably have unknown and unintended
> consequences, so I think your approach is also better in that regard.
> As regards your patch, there is just a disagreement about how the
> i915 driver should behave if nopat is set. I agree the i915 driver
> could do a better job handling that case, at least with better error
> logs.
> 
> Chuck
> 
>>
>>> I respond by saying a well-written driver should refuse to honor
>>> the incorrect configuration requested by the user and instead
>>> warn the user that it did not honor the incorrect kernel option.
>>>
>>> I am only presuming what your patch would do on my box based
>>> on what I learned about this problem from my debugging. I can
>>> also test your patch on my box to verify that my understanding of
>>> it is correct.
>>>
>>> I also have not yet verified Juergen's patch will not fix it, but
>>> I am almost certain it will not unless it is expanded so it also
>>> touches i915_gem_object_pin_map() with the fix. I plan to test
>>> his patch, but expanded so it touches that function also.
>>>
>>> I also plan to test your patch with and without nopat and report the
>>> results in the thread where you posted your patch. Hopefully
>>> by tomorrow I will have the results.
>>>
>>> Chuck

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [Intel-gfx] [PATCH 2/2] x86/pat: add functions to query specific cache mode availability
@ 2022-05-21 10:47                         ` Thorsten Leemhuis
  0 siblings, 0 replies; 80+ messages in thread
From: Thorsten Leemhuis @ 2022-05-21 10:47 UTC (permalink / raw)
  To: Chuck Zmudzinski, Jan Beulich, regressions, stable
  Cc: Juergen Gross, Peter Zijlstra, intel-gfx, Dave Hansen, x86,
	linux-kernel, David Airlie, Rodrigo Vivi, Ingo Molnar,
	Borislav Petkov, dri-devel, Andy Lutomirski, H. Peter Anvin,
	xen-devel, Thomas Gleixner

On 20.05.22 16:48, Chuck Zmudzinski wrote:
> On 5/20/2022 10:06 AM, Jan Beulich wrote:
>> On 20.05.2022 15:33, Chuck Zmudzinski wrote:
>>> On 5/20/2022 5:41 AM, Jan Beulich wrote:
>>>> On 20.05.2022 10:30, Chuck Zmudzinski wrote:
>>>>> On 5/20/2022 2:59 AM, Chuck Zmudzinski wrote:
>>>>>> On 5/20/2022 2:05 AM, Jan Beulich wrote:
>>>>>>> On 20.05.2022 06:43, Chuck Zmudzinski wrote:
>>>>>>>> On 5/4/22 5:14 AM, Juergen Gross wrote:
>>>>>>>>> On 04.05.22 10:31, Jan Beulich wrote:
>>>>>>>>>> On 03.05.2022 15:22, Juergen Gross wrote:
>>>>>>>>>>
>>>>>>>>>> ... these uses there are several more. You say nothing on why
>>>>>>>>>> those want leaving unaltered. When preparing my earlier patch I did
>>>>>>>>>> inspect them and came to the conclusion that these all would also
>>>>>>>>>> better observe the adjusted behavior (or else I couldn't have left
>>>>>>>>>> pat_enabled() as the only predicate). In fact, as said in the
>>>>>>>>>> description of my earlier patch, in my debugging I did find the use
>>>>>>>>>> in i915_gem_object_pin_map() to be the problematic one, which you
>>>>>>>>>> leave alone.
>>>>>>>>> Oh, I missed that one, sorry.
>>>>>>>> That is why your patch would not fix my Haswell unless
>>>>>>>> it also touches i915_gem_object_pin_map() in
>>>>>>>> drivers/gpu/drm/i915/gem/i915_gem_pages.c
>>>>>>>>
>>>>>>>>> I wanted to be rather defensive in my changes, but I agree at least
>>>>>>>>> the case in arch_phys_wc_add() might want to be changed, too.
>>>>>>>> I think your approach needs to be more aggressive so it will fix
>>>>>>>> all the known false negatives introduced by bdd8b6c98239
>>>>>>>> such as the one in i915_gem_object_pin_map().
>>>>>>>>
>>>>>>>> I looked at Jan's approach and I think it would fix the issue
>>>>>>>> with my Haswell as long as I don't use the nopat option. I
>>>>>>>> really don't have a strong opinion on that question, but I
>>>>>>>> think the nopat option as a Linux kernel option, as opposed
>>>>>>>> to a hypervisor option, should only affect the kernel, and
>>>>>>>> if the hypervisor provides the pat feature, then the kernel
>>>>>>>> should not override that,
>>>>>>> Hmm, why would the kernel not be allowed to override that? Such
>>>>>>> an override would affect only the single domain where the
>>>>>>> kernel runs; other domains could take their own decisions.
>>>>>>>
>>>>>>> Also, for the sake of completeness: "nopat" used when running on
>>>>>>> bare metal has the same bad effect on system boot, so there
>>>>>>> pretty clearly is an error cleanup issue in the i915 driver. But
>>>>>>> that's orthogonal, and I expect the maintainers may not even care
>>>>>>> (but tell us "don't do that then").
>>>>> Actually I just did a test with the last official Debian kernel
>>>>> build of Linux 5.16, that is, a kernel before bdd8b6c98239 was
>>>>> applied. In fact, the nopat option does *not* break the i915 driver
>>>>> in 5.16. That is, with the nopat option, the i915 driver loads
>>>>> normally on both the bare metal and on the Xen hypervisor.
>>>>> That means your presumption (and the presumption of
>>>>> the author of bdd8b6c98239) that the "nopat" option was
>>>>> being observed by the i915 driver is incorrect. Setting "nopat"
>>>>> had no effect on my system with Linux 5.16. So after doing these
>>>>> tests, I am against the aggressive approach of breaking the i915
>>>>> driver with the "nopat" option because prior to bdd8b6c98239,
>>>>> nopat did not break the i915 driver. Why break it now?
>>>> Because that, in my understanding, is the purpose of "nopat"
>>>> (not breaking the driver of course - that's a driver bug -, but
>>>> having an effect on the driver).
>>> I wouldn't call it a driver bug, but an incorrect configuration of the
>>> kernel by the user.  I presume X86_FEATURE_PAT is required by the
>>> i915 driver
>> The driver ought to work fine without PAT (and hence without being
>> able to make WC mappings). It would use UC instead and be slow, but
>> it ought to work.
>>
>>> and therefore the driver should refuse to disable
>>> it if the user requests to disable it and instead warn the user that
>>> the driver did not disable the feature, contrary to what the user
>>> requested with the nopat option.
>>>
>>> In any case, my test did not verify that when nopat is set in Linux
>>> 5.16,
>>> the thread takes the same code path as when nopat is not set,
>>> so I am not totally sure that the reason nopat does not break the
>>> i915 driver in 5.16 is that static_cpu_has(X86_FEATURE_PAT)
>>> returns true even when nopat is set. I could test it with a custom
>>> log message in 5.16 if that is necessary.
>>>
>>> Are you saying it was wrong for static_cpu_has(X86_FEATURE_PAT)
>>> to return true in 5.16 when the user requests nopat?
>> No, I'm not saying that. It was wrong for this construct to be used
>> in the driver, which was fixed for 5.17 (and which had caused the
>> regression I did observe, leading to the patch as a hopefully least
>> bad option).
>>
>>> I think that is
>>> just permitting a bad configuration to break the driver that a
>>> well-written operating system should not allow. The i915 driver
>>> was, in my opinion, correctly ignoring the nopat option in 5.16
>>> because that option is not compatible with the hardware the
>>> i915 driver is trying to initialize and setup at boot time. At least
>>> that is my understanding now, but I will need to test it on 5.16
>>> to be sure I understand it correctly.
>>>
>>> Also, AFAICT, your patch would break the driver when the nopat
>>> option is set and only fix the regression introduced by bdd8b6c98239
>>> when nopat is not set on my box, so your patch would
>>> introduce a regression relative to Linux 5.16 and earlier for the
>>> case when nopat is set on my box. I think your point would
>>> be that it is not a regression if it is an incorrect user configuration.
>> Again no - my view is that there's a separate, pre-existing issue
>> in the driver which was uncovered by the change. This may be a
>> perceived regression, but is imo different from a real one.

Sorry, for you maybe, but I'm pretty sure for Linus it's not when it
comes to the "no regressions rule". Just took a quick look at quotes
from Linus
https://www.kernel.org/doc/html/latest/process/handling-regressions.html
and found this statement from Linus to back this up:

```
One _particularly_ last-minute revert is the top-most commit (ignoring
the version change itself) done just before the release, and while
it's very annoying, it's perhaps also instructive.

What's instructive about it is that I reverted a commit that wasn't
actually buggy. In fact, it was doing exactly what it set out to do,
and did it very well. In fact it did it _so_ well that the much
improved IO patterns it caused then ended up revealing a user-visible
regression due to a real bug in a completely unrelated area.
```

He said that here:
https://www.kernel.org/doc/html/latest/process/handling-regressions.html

The situation is of course different here, but similar enough.

> Since it is a regression, I think for now bdd8b6c98239 should
> be reverted and the fix backported to Linux 5.17 stable until
> the underlying memory subsystem can provide the i915 driver
> with an updated test for the PAT feature that also meets the
> requirements of the author of bdd8b6c98239 without breaking
> the i915 driver.

I'm not a developer and I don't know the details of this thread and
the backstory of the regression, but it sounds like that's the approach
that is needed here until someone comes up with a fix for the regression
exposed by bdd8b6c98239.

But if I'm wrong, please tell me.

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)

P.S.: As the Linux kernel's regression tracker I deal with a lot of
reports and sometimes miss something important when writing mails like
this. If that's the case here, don't hesitate to tell me in a public
reply, it's in everyone's interest to set the public record straight.

> The i915 driver relies on the memory subsystem
> to provide it with an accurate test for the existence of
> X86_FEATURE_PAT. I think your patch provides that more accurate
> test so that bdd8b6c98239 could be re-applied when your patch is
> committed. Juergen's patch would have to touch bdd8b6c98239
> with new functions that probably have unknown and unintended
> consequences, so I think your approach is also better in that regard.
> As regards your patch, there is just a disagreement about how the
> i915 driver should behave if nopat is set. I agree the i915 driver
> could do a better job handling that case, at least with better error
> logs.
> 
> Chuck
> 
>>
>>> I respond by saying a well-written driver should refuse to honor
>>> the incorrect configuration requested by the user and instead
>>> warn the user that it did not honor the incorrect kernel option.
>>>
>>> I am only presuming what your patch would do on my box based
>>> on what I learned about this problem from my debugging. I can
>>> also test your patch on my box to verify that my understanding of
>>> it is correct.
>>>
>>> I also have not yet verified Juergen's patch will not fix it, but
>>> I am almost certain it will not unless it is expanded so it also
>>> touches i915_gem_object_pin_map() with the fix. I plan to test
>>> his patch, but expanded so it touches that function also.
>>>
>>> I also plan to test your patch with and without nopat and report the
>>> results in the thread where you posted your patch. Hopefully
>>> by tomorrow I will have the results.
>>>
>>> Chuck

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH 2/2] x86/pat: add functions to query specific cache mode availability
  2022-05-03 13:22   ` Juergen Gross
@ 2022-05-21 13:24     ` Chuck Zmudzinski
  -1 siblings, 0 replies; 80+ messages in thread
From: Chuck Zmudzinski @ 2022-05-21 13:24 UTC (permalink / raw)
  To: Juergen Gross, xen-devel, x86, linux-kernel, intel-gfx, dri-devel
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	H. Peter Anvin, Andy Lutomirski, Peter Zijlstra, Jani Nikula,
	Joonas Lahtinen, Rodrigo Vivi, Tvrtko Ursulin, David Airlie,
	Daniel Vetter, Jan Beulich

On 5/3/22 9:22 AM, Juergen Gross wrote:
> Some drivers are using pat_enabled() in order to test availability of
> special caching modes (WC and UC-). This will lead to false negatives
> in case the system was booted e.g. with the "nopat" variant and the
> BIOS did setup the PAT MSR supporting the queried mode, or if the
> system is running as a Xen PV guest.
>
> Add test functions for those caching modes instead and use them at the
> appropriate places.
>
> For symmetry reasons export the already existing x86_has_pat_wp() for
> modules, too.
>
> Fixes: bdd8b6c98239 ("drm/i915: replace X86_FEATURE_PAT with pat_enabled()")
> Fixes: ae749c7ab475 ("PCI: Add arch_can_pci_mmap_wc() macro")
> Signed-off-by: Juergen Gross <jgross@suse.com>
> ---
>   arch/x86/include/asm/memtype.h           |  2 ++
>   arch/x86/include/asm/pci.h               |  2 +-
>   arch/x86/mm/init.c                       | 25 +++++++++++++++++++++---
>   drivers/gpu/drm/i915/gem/i915_gem_mman.c |  8 ++++----
>   4 files changed, 29 insertions(+), 8 deletions(-)
>
> diff --git a/arch/x86/include/asm/memtype.h b/arch/x86/include/asm/memtype.h
> index 9ca760e430b9..d00e0be854d4 100644
> --- a/arch/x86/include/asm/memtype.h
> +++ b/arch/x86/include/asm/memtype.h
> @@ -25,6 +25,8 @@ extern void memtype_free_io(resource_size_t start, resource_size_t end);
>   extern bool pat_pfn_immune_to_uc_mtrr(unsigned long pfn);
>   
>   bool x86_has_pat_wp(void);
> +bool x86_has_pat_wc(void);
> +bool x86_has_pat_uc_minus(void);
>   enum page_cache_mode pgprot2cachemode(pgprot_t pgprot);
>   
>   #endif /* _ASM_X86_MEMTYPE_H */
> diff --git a/arch/x86/include/asm/pci.h b/arch/x86/include/asm/pci.h
> index f3fd5928bcbb..a5742268dec1 100644
> --- a/arch/x86/include/asm/pci.h
> +++ b/arch/x86/include/asm/pci.h
> @@ -94,7 +94,7 @@ int pcibios_set_irq_routing(struct pci_dev *dev, int pin, int irq);
>   
>   
>   #define HAVE_PCI_MMAP
> -#define arch_can_pci_mmap_wc()	pat_enabled()
> +#define arch_can_pci_mmap_wc()	x86_has_pat_wc()
>   #define ARCH_GENERIC_PCI_MMAP_RESOURCE
>   
>   #ifdef CONFIG_PCI
> diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
> index 71e182ebced3..b6431f714dc2 100644
> --- a/arch/x86/mm/init.c
> +++ b/arch/x86/mm/init.c
> @@ -77,12 +77,31 @@ static uint8_t __pte2cachemode_tbl[8] = {
>   	[__pte2cm_idx(_PAGE_PWT | _PAGE_PCD | _PAGE_PAT)] = _PAGE_CACHE_MODE_UC,
>   };
>   
> -/* Check that the write-protect PAT entry is set for write-protect */
> +static bool x86_has_pat_mode(unsigned int mode)
> +{
> +	return __pte2cachemode_tbl[__cachemode2pte_tbl[mode]] == mode;
> +}
> +
> +/* Check that PAT supports write-protect */
>   bool x86_has_pat_wp(void)
>   {
> -	return __pte2cachemode_tbl[__cachemode2pte_tbl[_PAGE_CACHE_MODE_WP]] ==
> -	       _PAGE_CACHE_MODE_WP;
> +	return x86_has_pat_mode(_PAGE_CACHE_MODE_WP);
> +}
> +EXPORT_SYMBOL_GPL(x86_has_pat_wp);
> +
> +/* Check that PAT supports WC */
> +bool x86_has_pat_wc(void)
> +{
> +	return x86_has_pat_mode(_PAGE_CACHE_MODE_WC);
> +}
> +EXPORT_SYMBOL_GPL(x86_has_pat_wc);
> +
> +/* Check that PAT supports UC- */
> +bool x86_has_pat_uc_minus(void)
> +{
> +	return x86_has_pat_mode(_PAGE_CACHE_MODE_UC_MINUS);
>   }
> +EXPORT_SYMBOL_GPL(x86_has_pat_uc_minus);
>   
>   enum page_cache_mode pgprot2cachemode(pgprot_t pgprot)
>   {
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_mman.c b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
> index 0c5c43852e24..f43ecf3f63eb 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_mman.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
> @@ -76,7 +76,7 @@ i915_gem_mmap_ioctl(struct drm_device *dev, void *data,
>   	if (args->flags & ~(I915_MMAP_WC))
>   		return -EINVAL;
>   
> -	if (args->flags & I915_MMAP_WC && !pat_enabled())
> +	if (args->flags & I915_MMAP_WC && !x86_has_pat_wc())
>   		return -ENODEV;
>   
>   	obj = i915_gem_object_lookup(file, args->handle);
> @@ -757,7 +757,7 @@ i915_gem_dumb_mmap_offset(struct drm_file *file,
>   
>   	if (HAS_LMEM(to_i915(dev)))
>   		mmap_type = I915_MMAP_TYPE_FIXED;
> -	else if (pat_enabled())
> +	else if (x86_has_pat_wc())
>   		mmap_type = I915_MMAP_TYPE_WC;
>   	else if (!i915_ggtt_has_aperture(to_gt(i915)->ggtt))
>   		return -ENODEV;
> @@ -813,7 +813,7 @@ i915_gem_mmap_offset_ioctl(struct drm_device *dev, void *data,
>   		break;
>   
>   	case I915_MMAP_OFFSET_WC:
> -		if (!pat_enabled())
> +		if (!x86_has_pat_wc())
>   			return -ENODEV;
>   		type = I915_MMAP_TYPE_WC;
>   		break;
> @@ -823,7 +823,7 @@ i915_gem_mmap_offset_ioctl(struct drm_device *dev, void *data,
>   		break;
>   
>   	case I915_MMAP_OFFSET_UC:
> -		if (!pat_enabled())
> +		if (!x86_has_pat_uc_minus())
>   			return -ENODEV;
>   		type = I915_MMAP_TYPE_UC;
>   		break;

This patch is advertised as a fix for
bdd8b6c98239 ("drm/i915: replace X86_FEATURE_PAT with pat_enabled()")

bdd8b6c98239 causes a serious regression on my system when
running Linux as a Dom0 on Xen.

The regression is that, on my system, the error resulting from this
issue makes the i915 driver call its add_taint_for_CI function, which
in turn totally halts the system during early boot. So this makes
it impossible for either 5.17.y or the 5.18-rc versions to run
as a Dom0 on my system. I cannot upgrade my system to the 5.17.y
or to 5.18-rc versions without a proper fix for bdd8b6c98239.

I did some testing with this patch on my system (my tests included
the first patch of this 2-patch series), and here are the results:

This patch does *not* fix it. I expected that, because the patch as
posted does not touch i915_gem_object_pin_map() in i915_gem_pages.c.
But I also expected that a simple follow-up change, using the new
x86_has_pat_wc() function from this patch in that function, would
fix it.

However, even with the following simple change to
i915_gem_object_pin_map() added on top, the patch series still does
*not* fix the regression on my system:

--- a/drivers/gpu/drm/i915/gem/i915_gem_pages.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_pages.c
@@ -428,7 +428,7 @@
              goto err_unpin;
          }

-        if (GEM_WARN_ON(type == I915_MAP_WC && !pat_enabled()))
+        if (GEM_WARN_ON(type == I915_MAP_WC && !x86_has_pat_wc()))
              ptr = ERR_PTR(-ENODEV);
          else if (i915_gem_object_has_struct_page(obj))
              ptr = i915_gem_object_map_page(obj, type);

I verified that this is the function where pat_enabled() is returning
a false negative on my system.

This means x86_has_pat_wc() is still giving me a false negative, even
when running as a Xen Dom0. I am not sure you understand what is
really causing the problem Jan is trying to fix here with false
negatives from pat_enabled(). I also tested Jan's patch that
you are trying to replace with this patch, and his patch *does* fix
the problem on my system. Jan's patch is very simple and solves the
problem by editing pat_enabled() so that it returns true if
boot_cpu_has(X86_FEATURE_HYPERVISOR) is true after the
other checks for the x86 PAT feature have failed.
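
As a rough illustration of the behavior described above (and only of
that description, not of Jan's actual patch), the decision can be
modeled in standalone C. The struct, its fields and the function name
are hypothetical stand-ins, not kernel API:

```c
#include <stdbool.h>

/* Hypothetical stand-ins for kernel state; none of these are kernel API. */
struct pat_state {
	bool pat_initialized;	/* the kernel set up PAT itself */
	bool on_hypervisor;	/* models boot_cpu_has(X86_FEATURE_HYPERVISOR) */
};

/* Model of pat_enabled() as this mail describes it after the change. */
static bool model_pat_enabled(const struct pat_state *s)
{
	/* The regular checks come first. */
	if (s->pat_initialized)
		return true;

	/* They failed: fall back to "a hypervisor provides PAT for us". */
	return s->on_hypervisor;
}
```

In this model, a Dom0 where the kernel did not set up PAT itself still
reports PAT as enabled, while bare metal in the same state does not.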

I expect you do not have a system that actually has the problem
that Jan and I are trying to fix because the problem only exists on
systems with specific hardware, and in my case it is an Intel Haswell
CPU with integrated GPU. You might be able to test your patch,
though, if you boot the patched kernel with the nopat option and
check if your new functions return false when running on the bare
metal and true when running in a Dom0 on the Xen hypervisor. That is
what the new functions should do. I think you were expecting your
new x86_has_pat_wc() function to return true when Linux is running
as a Dom0 on the Xen hypervisor even when pat_enabled() returns
false. But it does not seem to be working.
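
For reference, the check behind the new functions is a round trip
through two translation tables: cache mode to PTE bits, then PTE bits
back to cache mode. The following standalone model is only a sketch.
The enum, the struct and both table layouts are made up for
illustration and are not the kernel's tables:

```c
#include <stdbool.h>

enum cache_mode { MODE_WB, MODE_WC, MODE_UC_MINUS, MODE_UC, MODE_NUM };

/* Illustrative translation tables; the slot numbers are made up. */
struct pat_tables {
	unsigned char mode_to_slot[MODE_NUM];	/* cache mode -> PTE slot */
	enum cache_mode slot_to_mode[4];	/* PTE slot -> effective mode */
};

/* A mode counts as available only if it survives the round trip. */
static bool has_mode(const struct pat_tables *t, enum cache_mode mode)
{
	return t->slot_to_mode[t->mode_to_slot[mode]] == mode;
}

/* Assumed layout without PAT: WC has no slot of its own, it shares UC-'s. */
static const struct pat_tables tables_no_pat = {
	.mode_to_slot = { [MODE_WB] = 0, [MODE_WC] = 2,
			  [MODE_UC_MINUS] = 2, [MODE_UC] = 3 },
	.slot_to_mode = { MODE_WB, MODE_UC_MINUS, MODE_UC_MINUS, MODE_UC },
};

/* Assumed layout with PAT: slot 1 is dedicated to WC. */
static const struct pat_tables tables_pat = {
	.mode_to_slot = { [MODE_WB] = 0, [MODE_WC] = 1,
			  [MODE_UC_MINUS] = 2, [MODE_UC] = 3 },
	.slot_to_mode = { MODE_WB, MODE_WC, MODE_UC_MINUS, MODE_UC },
};
```

In this model the answer for WC depends entirely on what the tables
contain when the check runs: has_mode(&tables_no_pat, MODE_WC) is
false and has_mode(&tables_pat, MODE_WC) is true. Whether stale or
default tables are what produces the false negative reported here is
an assumption; the tests above do not establish it.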

In any case, after testing this patch, I cannot confirm that it
fixes bdd8b6c98239.

Best regards,

Chuck

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH 2/2] x86/pat: add functions to query specific cache mode availability
  2022-05-21 10:47                         ` Thorsten Leemhuis
  (?)
@ 2022-05-24 18:32                           ` Chuck Zmudzinski
  -1 siblings, 0 replies; 80+ messages in thread
From: Chuck Zmudzinski @ 2022-05-24 18:32 UTC (permalink / raw)
  To: Thorsten Leemhuis, Jan Beulich, regressions, stable
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	H. Peter Anvin, Andy Lutomirski, Peter Zijlstra, Jani Nikula,
	Joonas Lahtinen, Rodrigo Vivi, Tvrtko Ursulin, David Airlie,
	Daniel Vetter, xen-devel, x86, linux-kernel, intel-gfx,
	dri-devel, Juergen Gross

On 5/21/22 6:47 AM, Thorsten Leemhuis wrote:
> On 20.05.22 16:48, Chuck Zmudzinski wrote:
>> On 5/20/2022 10:06 AM, Jan Beulich wrote:
>>> On 20.05.2022 15:33, Chuck Zmudzinski wrote:
>>>> On 5/20/2022 5:41 AM, Jan Beulich wrote:
>>>>> On 20.05.2022 10:30, Chuck Zmudzinski wrote:
>>>>>> On 5/20/2022 2:59 AM, Chuck Zmudzinski wrote:
>>>>>>> On 5/20/2022 2:05 AM, Jan Beulich wrote:
>>>>>>>> On 20.05.2022 06:43, Chuck Zmudzinski wrote:
>>>>>>>>> On 5/4/22 5:14 AM, Juergen Gross wrote:
>>>>>>>>>> On 04.05.22 10:31, Jan Beulich wrote:
>>>>>>>>>>> On 03.05.2022 15:22, Juergen Gross wrote:
>>>>>>>>>>>
>>>>>>>>>>> ... these uses there are several more. You say nothing on why
>>>>>>>>>>> those want
>>>>>>>>>>> leaving unaltered. When preparing my earlier patch I did
>>>>>>>>>>> inspect them
>>>>>>>>>>> and came to the conclusion that these all would also better
>>>>>>>>>>> observe the
>>>>>>>>>>> adjusted behavior (or else I couldn't have left pat_enabled()
>>>>>>>>>>> as the
>>>>>>>>>>> only predicate). In fact, as said in the description of my
>>>>>>>>>>> earlier
>>>>>>>>>>> patch, in
>>>>>>>>>>> my debugging I did find the use in i915_gem_object_pin_map()
>>>>>>>>>>> to be
>>>>>>>>>>> the
>>>>>>>>>>> problematic one, which you leave alone.
>>>>>>>>>> Oh, I missed that one, sorry.
>>>>>>>>> That is why your patch would not fix my Haswell unless
>>>>>>>>> it also touches i915_gem_object_pin_map() in
>>>>>>>>> drivers/gpu/drm/i915/gem/i915_gem_pages.c
>>>>>>>>>
>>>>>>>>>> I wanted to be rather defensive in my changes, but I agree at
>>>>>>>>>> least
>>>>>>>>>> the
>>>>>>>>>> case in arch_phys_wc_add() might want to be changed, too.
>>>>>>>>> I think your approach needs to be more aggressive so it will fix
>>>>>>>>> all the known false negatives introduced by bdd8b6c98239
>>>>>>>>> such as the one in i915_gem_object_pin_map().
>>>>>>>>>
>>>>>>>>> I looked at Jan's approach and I think it would fix the issue
>>>>>>>>> with my Haswell as long as I don't use the nopat option. I
>>>>>>>>> really don't have a strong opinion on that question, but I
>>>>>>>>> think the nopat option as a Linux kernel option, as opposed
>>>>>>>>> to a hypervisor option, should only affect the kernel, and
>>>>>>>>> if the hypervisor provides the pat feature, then the kernel
>>>>>>>>> should not override that,
>>>>>>>> Hmm, why would the kernel not be allowed to override that? Such
>>>>>>>> an override would affect only the single domain where the
>>>>>>>> kernel runs; other domains could take their own decisions.
>>>>>>>>
>>>>>>>> Also, for the sake of completeness: "nopat" used when running on
>>>>>>>> bare metal has the same bad effect on system boot, so there
>>>>>>>> pretty clearly is an error cleanup issue in the i915 driver. But
>>>>>>>> that's orthogonal, and I expect the maintainers may not even care
>>>>>>>> (but tell us "don't do that then").
>>>>>> Actually I just did a test with the last official Debian kernel
>>>>>> build of Linux 5.16, that is, a kernel before bdd8b6c98239 was
>>>>>> applied. In fact, the nopat option does *not* break the i915 driver
>>>>>> in 5.16. That is, with the nopat option, the i915 driver loads
>>>>>> normally on both the bare metal and on the Xen hypervisor.
>>>>>> That means your presumption (and the presumption of
>>>>>> the author of bdd8b6c98239) that the "nopat" option was
>>>>>> being observed by the i915 driver is incorrect. Setting "nopat"
>>>>>> had no effect on my system with Linux 5.16. So after doing these
>>>>>> tests, I am against the aggressive approach of breaking the i915
>>>>>> driver with the "nopat" option because prior to bdd8b6c98239,
>>>>>> nopat did not break the i915 driver. Why break it now?
>>>>> Because that, in my understanding, is the purpose of "nopat"
>>>>> (not breaking the driver of course - that's a driver bug -, but
>>>>> having an effect on the driver).
>>>> I wouldn't call it a driver bug, but an incorrect configuration of the
>>>> kernel by the user.  I presume X86_FEATURE_PAT is required by the
>>>> i915 driver
>>> The driver ought to work fine without PAT (and hence without being
>>> able to make WC mappings). It would use UC instead and be slow, but
>>> it ought to work.
>>>
>>>> and therefore the driver should refuse to disable
>>>> it if the user requests to disable it and instead warn the user that
>>>> the driver did not disable the feature, contrary to what the user
>>>> requested with the nopat option.
>>>>
>>>> In any case, my test did not verify that when nopat is set in Linux
>>>> 5.16,
>>>> the thread takes the same code path as when nopat is not set,
>>>> so I am not totally sure that the reason nopat does not break the
>>>> i915 driver in 5.16 is that static_cpu_has(X86_FEATURE_PAT)
>>>> returns true even when nopat is set. I could test it with a custom
>>>> log message in 5.16 if that is necessary.
>>>>
>>>> Are you saying it was wrong for static_cpu_has(X86_FEATURE_PAT)
>>>> to return true in 5.16 when the user requests nopat?
>>> No, I'm not saying that. It was wrong for this construct to be used
>>> in the driver, which was fixed for 5.17 (and which had caused the
>>> regression I did observe, leading to the patch as a hopefully least
>>> bad option).
>>>
>>>> I think that is
>>>> just permitting a bad configuration to break the driver that a
>>>> well-written operating system should not allow. The i915 driver
>>>> was, in my opinion, correctly ignoring the nopat option in 5.16
>>>> because that option is not compatible with the hardware the
>>>> i915 driver is trying to initialize and setup at boot time. At least
>>>> that is my understanding now, but I will need to test it on 5.16
>>>> to be sure I understand it correctly.
>>>>
>>>> Also, AFAICT, your patch would break the driver when the nopat
>>>> option is set and only fix the regression introduced by bdd8b6c98239
>>>> when nopat is not set on my box, so your patch would
>>>> introduce a regression relative to Linux 5.16 and earlier for the
>>>> case when nopat is set on my box. I think your point would
>>>> be that it is not a regression if it is an incorrect user configuration.
>>> Again no - my view is that there's a separate, pre-existing issue
>>> in the driver which was uncovered by the change. This may be a
>>> perceived regression, but is imo different from a real one.
> Sorry, for you maybe, but I'm pretty sure for Linus it's not when it
> comes to the "no regressions rule". Just took a quick look at quotes
> from Linus
> https://www.kernel.org/doc/html/latest/process/handling-regressions.html
> and found this statement from Linus to back this up:
>
> ```
> One _particularly_ last-minute revert is the top-most commit (ignoring
> the version change itself) done just before the release, and while
> it's very annoying, it's perhaps also instructive.
>
> What's instructive about it is that I reverted a commit that wasn't
> actually buggy. In fact, it was doing exactly what it set out to do,
> and did it very well. In fact it did it _so_ well that the much
> improved IO patterns it caused then ended up revealing a user-visible
> regression due to a real bug in a completely unrelated area.
> ```
>
> He said that here:
> https://www.kernel.org/doc/html/latest/process/handling-regressions.html
>
> The situation is of course different here, but similar enough.
>
>> Since it is a regression, I think for now bdd8b6c98239 should
>> be reverted and the fix backported to Linux 5.17 stable until
>> the underlying memory subsystem can provide the i915 driver
>> with an updated test for the PAT feature that also meets the
>> requirements of the author of bdd8b6c98239 without breaking
>> the i915 driver.
> I'm not a developer and I don't know the details of this thread and
> the backstory of the regression, but it sounds like that's the approach
> that is needed here until someone comes up with a fix for the regression
> exposed by bdd8b6c98239.
>
> But if I'm wrong, please tell me.

You are mostly right, I think. Reverting bdd8b6c98239 fixes
it. There is another way to fix it, though. The patch proposed
by Jan Beulich also fixes the regression on my system, so as
the person reporting this as a regression, I would also be satisfied
with Jan's patch instead of reverting bdd8b6c98239 as a fix. Jan
posted his proposed patch here:

https://lore.kernel.org/lkml/9385fa60-fa5d-f559-a137-6608408f88b0@suse.com/

The only reservation I have about Jan's patch is that the commit
message does not clearly explain how the patch changes what
the nopat kernel boot option does. It doesn't affect me because
I don't use nopat, but it should probably be mentioned in the
commit message, as pointed out here:

https://lore.kernel.org/lkml/bd9ed2c2-1337-27bb-c9da-dfc7b31d492c@netscape.net/

Whatever fix is chosen for the regression exposed by bdd8b6c98239
also needs to be backported to the stable versions 5.17 and 5.18.

Regards,

Chuck Zmudzinski
>
> Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
>
> P.S.: As the Linux kernel's regression tracker I deal with a lot of
> reports and sometimes miss something important when writing mails like
> this. If that's the case here, don't hesitate to tell me in a public
> reply, it's in everyone's interest to set the public record straight.
>
>> The i915 driver relies on the memory subsystem
>> to provide it with an accurate test for the existence of
>> X86_FEATURE_PAT. I think your patch provides that more accurate
>> test so that bdd8b6c98239 could be re-applied when your patch is
>> committed. Juergen's patch would have to touch bdd8b6c98239
>> with new functions that probably have unknown and unintended
>> consequences, so I think your approach is also better in that regard.
>> As regards your patch, there is just a disagreement about how the
>> i915 driver should behave if nopat is set. I agree the i915 driver
>> could do a better job handling that case, at least with better error
>> logs.
>>
>> Chuck
>>
>>>> I respond by saying a well-written driver should refuse to honor
>>>> the incorrect configuration requested by the user and instead
>>>> warn the user that it did not honor the incorrect kernel option.
>>>>
>>>> I am only presuming what your patch would do on my box based
>>>> on what I learned about this problem from my debugging. I can
>>>> also test your patch on my box to verify that my understanding of
>>>> it is correct.
>>>>
>>>> I also have not yet verified Juergen's patch will not fix it, but
>>>> I am almost certain it will not unless it is expanded so it also
>>>> touches i915_gem_object_pin_map() with the fix. I plan to test
>>>> his patch, but expanded so it touches that function also.
>>>>
>>>> I also plan to test your patch with and without nopat and report the
>>>> results in the thread where you posted your patch. Hopefully
>>>> by tomorrow I will have the results.
>>>>
>>>> Chuck


^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH 2/2] x86/pat: add functions to query specific cache mode availability
@ 2022-05-24 18:32                           ` Chuck Zmudzinski
  0 siblings, 0 replies; 80+ messages in thread
From: Chuck Zmudzinski @ 2022-05-24 18:32 UTC (permalink / raw)
  To: Thorsten Leemhuis, Jan Beulich, regressions, stable
  Cc: Juergen Gross, Tvrtko Ursulin, Peter Zijlstra, intel-gfx,
	Dave Hansen, x86, linux-kernel, David Airlie, Rodrigo Vivi,
	Ingo Molnar, Borislav Petkov, dri-devel, Andy Lutomirski,
	H. Peter Anvin, xen-devel, Thomas Gleixner

On 5/21/22 6:47 AM, Thorsten Leemhuis wrote:
> On 20.05.22 16:48, Chuck Zmudzinski wrote:
>> On 5/20/2022 10:06 AM, Jan Beulich wrote:
>>> On 20.05.2022 15:33, Chuck Zmudzinski wrote:
>>>> On 5/20/2022 5:41 AM, Jan Beulich wrote:
>>>>> On 20.05.2022 10:30, Chuck Zmudzinski wrote:
>>>>>> On 5/20/2022 2:59 AM, Chuck Zmudzinski wrote:
>>>>>>> On 5/20/2022 2:05 AM, Jan Beulich wrote:
>>>>>>>> On 20.05.2022 06:43, Chuck Zmudzinski wrote:
>>>>>>>>> On 5/4/22 5:14 AM, Juergen Gross wrote:
>>>>>>>>>> On 04.05.22 10:31, Jan Beulich wrote:
>>>>>>>>>>> On 03.05.2022 15:22, Juergen Gross wrote:
>>>>>>>>>>>
>>>>>>>>>>> ... these uses there are several more. You say nothing on why
>>>>>>>>>>> those want leaving unaltered. When preparing my earlier patch I
>>>>>>>>>>> did inspect them and came to the conclusion that these all would
>>>>>>>>>>> also better observe the adjusted behavior (or else I couldn't
>>>>>>>>>>> have left pat_enabled() as the only predicate). In fact, as said
>>>>>>>>>>> in the description of my earlier patch, in my debugging I did
>>>>>>>>>>> find the use in i915_gem_object_pin_map() to be the problematic
>>>>>>>>>>> one, which you leave alone.
>>>>>>>>>> Oh, I missed that one, sorry.
>>>>>>>>> That is why your patch would not fix my Haswell unless
>>>>>>>>> it also touches i915_gem_object_pin_map() in
>>>>>>>>> drivers/gpu/drm/i915/gem/i915_gem_pages.c
>>>>>>>>>
>>>>>>>>>> I wanted to be rather defensive in my changes, but I agree at
>>>>>>>>>> least the case in arch_phys_wc_add() might want to be changed,
>>>>>>>>>> too.
>>>>>>>>> I think your approach needs to be more aggressive so it will fix
>>>>>>>>> all the known false negatives introduced by bdd8b6c98239
>>>>>>>>> such as the one in i915_gem_object_pin_map().
>>>>>>>>>
>>>>>>>>> I looked at Jan's approach and I think it would fix the issue
>>>>>>>>> with my Haswell as long as I don't use the nopat option. I
>>>>>>>>> really don't have a strong opinion on that question, but I
>>>>>>>>> think the nopat option as a Linux kernel option, as opposed
>>>>>>>>> to a hypervisor option, should only affect the kernel, and
>>>>>>>>> if the hypervisor provides the pat feature, then the kernel
>>>>>>>>> should not override that,
>>>>>>>> Hmm, why would the kernel not be allowed to override that? Such
>>>>>>>> an override would affect only the single domain where the
>>>>>>>> kernel runs; other domains could take their own decisions.
>>>>>>>>
>>>>>>>> Also, for the sake of completeness: "nopat" used when running on
>>>>>>>> bare metal has the same bad effect on system boot, so there
>>>>>>>> pretty clearly is an error cleanup issue in the i915 driver. But
>>>>>>>> that's orthogonal, and I expect the maintainers may not even care
>>>>>>>> (but tell us "don't do that then").
>>>>>> Actually I just did a test with the last official Debian kernel
>>>>>> build of Linux 5.16, that is, a kernel before bdd8b6c98239 was
>>>>>> applied. In fact, the nopat option does *not* break the i915 driver
>>>>>> in 5.16. That is, with the nopat option, the i915 driver loads
>>>>>> normally on both the bare metal and on the Xen hypervisor.
>>>>>> That means your presumption (and the presumption of
>>>>>> the author of bdd8b6c98239) that the "nopat" option was
>>>>>> being observed by the i915 driver is incorrect. Setting "nopat"
>>>>>> had no effect on my system with Linux 5.16. So after doing these
>>>>>> tests, I am against the aggressive approach of breaking the i915
>>>>>> driver with the "nopat" option because prior to bdd8b6c98239,
>>>>>> nopat did not break the i915 driver. Why break it now?
>>>>> Because that, in my understanding, is the purpose of "nopat"
>>>>> (not breaking the driver of course - that's a driver bug -, but
>>>>> having an effect on the driver).
>>>> I wouldn't call it a driver bug, but an incorrect configuration of the
>>>> kernel by the user.  I presume X86_FEATURE_PAT is required by the
>>>> i915 driver
>>> The driver ought to work fine without PAT (and hence without being
>>> able to make WC mappings). It would use UC instead and be slow, but
>>> it ought to work.
>>>
>>>> and therefore the driver should refuse to disable
>>>> it if the user requests to disable it and instead warn the user that
>>>> the driver did not disable the feature, contrary to what the user
>>>> requested with the nopat option.
>>>>
>>>> In any case, my test did not verify that when nopat is set in Linux
>>>> 5.16, the thread takes the same code path as when nopat is not set,
>>>> so I am not totally sure that the reason nopat does not break the
>>>> i915 driver in 5.16 is that static_cpu_has(X86_FEATURE_PAT)
>>>> returns true even when nopat is set. I could test it with a custom
>>>> log message in 5.16 if that is necessary.
>>>>
>>>> Are you saying it was wrong for static_cpu_has(X86_FEATURE_PAT)
>>>> to return true in 5.16 when the user requests nopat?
>>> No, I'm not saying that. It was wrong for this construct to be used
>>> in the driver, which was fixed for 5.17 (and which had caused the
>>> regression I did observe, leading to the patch as a hopefully least
>>> bad option).
>>>
>>>> I think that is
>>>> just permitting a bad configuration to break the driver that a
>>>> well-written operating system should not allow. The i915 driver
>>>> was, in my opinion, correctly ignoring the nopat option in 5.16
>>>> because that option is not compatible with the hardware the
>>>> i915 driver is trying to initialize and set up at boot time. At least
>>>> that is my understanding now, but I will need to test it on 5.16
>>>> to be sure I understand it correctly.
>>>>
>>>> Also, AFAICT, your patch would break the driver when the nopat
>>>> option is set and only fix the regression introduced by bdd8b6c98239
>>>> when nopat is not set on my box, so your patch would
>>>> introduce a regression relative to Linux 5.16 and earlier for the
>>>> case when nopat is set on my box. I think your point would
>>>> be that it is not a regression if it is an incorrect user configuration.
>>> Again no - my view is that there's a separate, pre-existing issue
>>> in the driver which was uncovered by the change. This may be a
>>> perceived regression, but is imo different from a real one.
> Sorry, for you maybe, but I'm pretty sure for Linus it's not when it
> comes to the "no regressions rule". Just took a quick look at quotes
> from Linus
> https://www.kernel.org/doc/html/latest/process/handling-regressions.html
> and found this statement from Linus to back this up:
>
> ```
> One _particularly_ last-minute revert is the top-most commit (ignoring
> the version change itself) done just before the release, and while
> it's very annoying, it's perhaps also instructive.
>
> What's instructive about it is that I reverted a commit that wasn't
> actually buggy. In fact, it was doing exactly what it set out to do,
> and did it very well. In fact it did it _so_ well that the much
> improved IO patterns it caused then ended up revealing a user-visible
> regression due to a real bug in a completely unrelated area.
> ```
>
> He said that here:
> https://www.kernel.org/doc/html/latest/process/handling-regressions.html
>
> The situation is of course different here, but similar enough.
>
>> Since it is a regression, I think for now bdd8b6c98239 should
>> be reverted and the fix backported to Linux 5.17 stable until
>> the underlying memory subsystem can provide the i915 driver
>> with an updated test for the PAT feature that also meets the
>> requirements of the author of bdd8b6c98239 without breaking
>> the i915 driver.
> I'm not a developer and I don't know the details of this thread and
> the backstory of the regression, but it sounds like that's the approach
> that is needed here until someone comes up with a fix for the regression
> exposed by bdd8b6c98239.
>
> But if I'm wrong, please tell me.

You are mostly right, I think. Reverting bdd8b6c98239 fixes
it. There is another way to fix it, though. The patch proposed
by Jan Beulich also fixes the regression on my system, so as
the person reporting this as a regression, I would also be satisfied
with Jan's patch instead of reverting bdd8b6c98239 as a fix. Jan
posted his proposed patch here:

https://lore.kernel.org/lkml/9385fa60-fa5d-f559-a137-6608408f88b0@suse.com/

The only reservation I have about Jan's patch is that the commit
message does not clearly explain how the patch changes what
the nopat kernel boot option does. It doesn't affect me because
I don't use nopat, but it should probably be mentioned in the
commit message, as pointed out here:

https://lore.kernel.org/lkml/bd9ed2c2-1337-27bb-c9da-dfc7b31d492c@netscape.net/

Whatever fix is chosen for the regression exposed by bdd8b6c98239
also needs to be backported to the stable versions 5.17 and 5.18.

Regards,

Chuck Zmudzinski
>
> Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
>
> P.S.: As the Linux kernel's regression tracker I deal with a lot of
> reports and sometimes miss something important when writing mails like
> this. If that's the case here, don't hesitate to tell me in a public
> reply, it's in everyone's interest to set the public record straight.
>
>> The i915 driver relies on the memory subsystem
>> to provide it with an accurate test for the existence of
>> X86_FEATURE_PAT. I think your patch provides that more accurate
>> test so that bdd8b6c98239 could be re-applied when your patch is
>> committed. Juergen's patch would have to touch bdd8b6c98239
>> with new functions that probably have unknown and unintended
>> consequences, so I think your approach is also better in that regard.
>> As regards your patch, there is just a disagreement about how the
>> i915 driver should behave if nopat is set. I agree the i915 driver
>> could do a better job handling that case, at least with better error
>> logs.
>>
>> Chuck
>>
>>>> I respond by saying a well-written driver should refuse to honor
>>>> the incorrect configuration requested by the user and instead
>>>> warn the user that it did not honor the incorrect kernel option.
>>>>
>>>> I am only presuming what your patch would do on my box based
>>>> on what I learned about this problem from my debugging. I can
>>>> also test your patch on my box to verify that my understanding of
>>>> it is correct.
>>>>
>>>> I also have not yet verified Juergen's patch will not fix it, but
>>>> I am almost certain it will not unless it is expanded so it also
>>>> touches i915_gem_object_pin_map() with the fix. I plan to test
>>>> his patch, but expanded so it touches that function also.
>>>>
>>>> I also plan to test your patch with and without nopat and report the
>>>> results in the thread where you posted your patch. Hopefully
>>>> by tomorrow I will have the results.
>>>>
>>>> Chuck


^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [Intel-gfx] [PATCH 2/2] x86/pat: add functions to query specific cache mode availability
  2022-05-24 18:32                           ` Chuck Zmudzinski
  (?)
@ 2022-05-25  7:45                             ` Thorsten Leemhuis
  -1 siblings, 0 replies; 80+ messages in thread
From: Thorsten Leemhuis @ 2022-05-25  7:45 UTC (permalink / raw)
  To: Chuck Zmudzinski, Jan Beulich, regressions, stable
  Cc: Juergen Gross, Peter Zijlstra, intel-gfx, Dave Hansen, x86,
	linux-kernel, David Airlie, Rodrigo Vivi, Ingo Molnar,
	Borislav Petkov, dri-devel, Andy Lutomirski, H. Peter Anvin,
	xen-devel, Thomas Gleixner



On 24.05.22 20:32, Chuck Zmudzinski wrote:
> On 5/21/22 6:47 AM, Thorsten Leemhuis wrote:
>> On 20.05.22 16:48, Chuck Zmudzinski wrote:
>>> On 5/20/2022 10:06 AM, Jan Beulich wrote:
>>>> On 20.05.2022 15:33, Chuck Zmudzinski wrote:
>>>>> On 5/20/2022 5:41 AM, Jan Beulich wrote:
>>>>>> On 20.05.2022 10:30, Chuck Zmudzinski wrote:
>>>>>>> On 5/20/2022 2:59 AM, Chuck Zmudzinski wrote:
>>>>>>>> On 5/20/2022 2:05 AM, Jan Beulich wrote:
>>>>>>>>> On 20.05.2022 06:43, Chuck Zmudzinski wrote:
>>>>>>>>>> On 5/4/22 5:14 AM, Juergen Gross wrote:
>>>>>>>>>>> On 04.05.22 10:31, Jan Beulich wrote:
>>>>>>>>>>>> On 03.05.2022 15:22, Juergen Gross wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> ... these uses there are several more. You say nothing on why
>>>>>>>>>>>> those want leaving unaltered. When preparing my earlier patch I
>>>>>>>>>>>> did inspect them and came to the conclusion that these all would
>>>>>>>>>>>> also better observe the adjusted behavior (or else I couldn't
>>>>>>>>>>>> have left pat_enabled() as the only predicate). In fact, as said
>>>>>>>>>>>> in the description of my earlier patch, in my debugging I did
>>>>>>>>>>>> find the use in i915_gem_object_pin_map() to be the problematic
>>>>>>>>>>>> one, which you leave alone.
>>>>>>>>>>> Oh, I missed that one, sorry.
>>>>>>>>>> That is why your patch would not fix my Haswell unless
>>>>>>>>>> it also touches i915_gem_object_pin_map() in
>>>>>>>>>> drivers/gpu/drm/i915/gem/i915_gem_pages.c
>>>>>>>>>>
>>>>>>>>>>> I wanted to be rather defensive in my changes, but I agree at
>>>>>>>>>>> least the case in arch_phys_wc_add() might want to be changed,
>>>>>>>>>>> too.
>>>>>>>>>> I think your approach needs to be more aggressive so it will fix
>>>>>>>>>> all the known false negatives introduced by bdd8b6c98239
>>>>>>>>>> such as the one in i915_gem_object_pin_map().
>>>>>>>>>>
>>>>>>>>>> I looked at Jan's approach and I think it would fix the issue
>>>>>>>>>> with my Haswell as long as I don't use the nopat option. I
>>>>>>>>>> really don't have a strong opinion on that question, but I
>>>>>>>>>> think the nopat option as a Linux kernel option, as opposed
>>>>>>>>>> to a hypervisor option, should only affect the kernel, and
>>>>>>>>>> if the hypervisor provides the pat feature, then the kernel
>>>>>>>>>> should not override that,
>>>>>>>>> Hmm, why would the kernel not be allowed to override that? Such
>>>>>>>>> an override would affect only the single domain where the
>>>>>>>>> kernel runs; other domains could take their own decisions.
>>>>>>>>>
>>>>>>>>> Also, for the sake of completeness: "nopat" used when running on
>>>>>>>>> bare metal has the same bad effect on system boot, so there
>>>>>>>>> pretty clearly is an error cleanup issue in the i915 driver. But
>>>>>>>>> that's orthogonal, and I expect the maintainers may not even care
>>>>>>>>> (but tell us "don't do that then").
>>>>>>> Actually I just did a test with the last official Debian kernel
>>>>>>> build of Linux 5.16, that is, a kernel before bdd8b6c98239 was
>>>>>>> applied. In fact, the nopat option does *not* break the i915 driver
>>>>>>> in 5.16. That is, with the nopat option, the i915 driver loads
>>>>>>> normally on both the bare metal and on the Xen hypervisor.
>>>>>>> That means your presumption (and the presumption of
>>>>>>> the author of bdd8b6c98239) that the "nopat" option was
>>>>>>> being observed by the i915 driver is incorrect. Setting "nopat"
>>>>>>> had no effect on my system with Linux 5.16. So after doing these
>>>>>>> tests, I am against the aggressive approach of breaking the i915
>>>>>>> driver with the "nopat" option because prior to bdd8b6c98239,
>>>>>>> nopat did not break the i915 driver. Why break it now?
>>>>>> Because that, in my understanding, is the purpose of "nopat"
>>>>>> (not breaking the driver of course - that's a driver bug -, but
>>>>>> having an effect on the driver).
>>>>> I wouldn't call it a driver bug, but an incorrect configuration of the
>>>>> kernel by the user.  I presume X86_FEATURE_PAT is required by the
>>>>> i915 driver
>>>> The driver ought to work fine without PAT (and hence without being
>>>> able to make WC mappings). It would use UC instead and be slow, but
>>>> it ought to work.
>>>>
>>>>> and therefore the driver should refuse to disable
>>>>> it if the user requests to disable it and instead warn the user that
>>>>> the driver did not disable the feature, contrary to what the user
>>>>> requested with the nopat option.
>>>>>
>>>>> In any case, my test did not verify that when nopat is set in Linux
>>>>> 5.16,
>>>>> the thread takes the same code path as when nopat is not set,
>>>>> so I am not totally sure that the reason nopat does not break the
>>>>> i915 driver in 5.16 is that static_cpu_has(X86_FEATURE_PAT)
>>>>> returns true even when nopat is set. I could test it with a custom
>>>>> log message in 5.16 if that is necessary.
>>>>>
>>>>> Are you saying it was wrong for static_cpu_has(X86_FEATURE_PAT)
>>>>> to return true in 5.16 when the user requests nopat?
>>>> No, I'm not saying that. It was wrong for this construct to be used
>>>> in the driver, which was fixed for 5.17 (and which had caused the
>>>> regression I did observe, leading to the patch as a hopefully least
>>>> bad option).
>>>>
>>>>> I think that is
>>>>> just permitting a bad configuration to break the driver that a
>>>>> well-written operating system should not allow. The i915 driver
>>>>> was, in my opinion, correctly ignoring the nopat option in 5.16
>>>>> because that option is not compatible with the hardware the
>>>>> i915 driver is trying to initialize and set up at boot time. At least
>>>>> that is my understanding now, but I will need to test it on 5.16
>>>>> to be sure I understand it correctly.
>>>>>
>>>>> Also, AFAICT, your patch would break the driver when the nopat
>>>>> option is set and only fix the regression introduced by bdd8b6c98239
>>>>> when nopat is not set on my box, so your patch would
>>>>> introduce a regression relative to Linux 5.16 and earlier for the
>>>>> case when nopat is set on my box. I think your point would
>>>>> be that it is not a regression if it is an incorrect user
>>>>> configuration.
>>>> Again no - my view is that there's a separate, pre-existing issue
>>>> in the driver which was uncovered by the change. This may be a
>>>> perceived regression, but is imo different from a real one.
>> Sorry, for you maybe, but I'm pretty sure for Linus it's not when it
>> comes to the "no regressions rule". Just took a quick look at quotes
>> from Linus
>> https://www.kernel.org/doc/html/latest/process/handling-regressions.html
>> and found this statement from Linus to back this up:
>>
>> ```
>> One _particularly_ last-minute revert is the top-most commit (ignoring
>> the version change itself) done just before the release, and while
>> it's very annoying, it's perhaps also instructive.
>>
>> What's instructive about it is that I reverted a commit that wasn't
>> actually buggy. In fact, it was doing exactly what it set out to do,
>> and did it very well. In fact it did it _so_ well that the much
>> improved IO patterns it caused then ended up revealing a user-visible
>> regression due to a real bug in a completely unrelated area.
>> ```
>>
>> He said that here:
>> https://www.kernel.org/doc/html/latest/process/handling-regressions.html
>>
>> The situation is of course different here, but similar enough.
>>
>>> Since it is a regression, I think for now bdd8b6c98239 should
>>> be reverted and the fix backported to Linux 5.17 stable until
>>> the underlying memory subsystem can provide the i915 driver
>>> with an updated test for the PAT feature that also meets the
>>> requirements of the author of bdd8b6c98239 without breaking
>>> the i915 driver.
>> I'm not a developer and I don't know the details of this thread and
>> the backstory of the regression, but it sounds like that's the approach
>> that is needed here until someone comes up with a fix for the regression
>> exposed by bdd8b6c98239.
>>
>> But if I'm wrong, please tell me.
> 
> You are mostly right, I think. Reverting bdd8b6c98239 fixes
> it. There is another way to fix it, though.

Yeah, I'm aware of it. But it seems...

> The patch proposed
> by Jan Beulich also fixes the regression on my system, so as
> the person reporting this as a regression, I would also be satisfied
> with Jan's patch instead of reverting bdd8b6c98239 as a fix. Jan
> posted his proposed patch here:
> 
> https://lore.kernel.org/lkml/9385fa60-fa5d-f559-a137-6608408f88b0@suse.com/

...that approach is not making any progress either?

Jan, could you provide a short status update here? I'd really like to
get this regression fixed one way or another sooner rather than later,
as this has taken way too long already IMHO.

> The only reservation I have about Jan's patch is that the commit
> message does not clearly explain how the patch changes what
> the nopat kernel boot option does. It doesn't affect me because
> I don't use nopat, but it should probably be mentioned in the
> commit message, as pointed out here:
> 
> https://lore.kernel.org/lkml/bd9ed2c2-1337-27bb-c9da-dfc7b31d492c@netscape.net/
> 
> 
> Whatever fix for the regression exposed by bdd8b6c98239 also
> needs to be backported to the stable versions 5.17 and 5.18.

Sure.

BTW, as you seem to be familiar with the issue: there is another report
about a regression WRT Xen and i915 (that is also not really making
progress):
https://lore.kernel.org/lkml/Yn%2FTgj1Ehs%2FBdpHp@itl-email/

It's just a wild guess, but could this somehow be related?

Ciao, Thorsten

>>> The i915 driver relies on the memory subsystem
>>> to provide it with an accurate test for the existence of
>>> X86_FEATURE_PAT. I think your patch provides that more accurate
>>> test so that bdd8b6c98239 could be re-applied when your patch is
>>> committed. Juergen's patch would have to touch bdd8b6c98239
>>> with new functions that probably have unknown and unintended
>>> consequences, so I think your approach is also better in that regard.
>>> As regards your patch, there is just a disagreement about how the
>>> i915 driver should behave if nopat is set. I agree the i915 driver
>>> could do a better job handling that case, at least with better error
>>> logs.
>>>
>>> Chuck
>>>
>>>>> I respond by saying a well-written driver should refuse to honor
>>>>> the incorrect configuration requested by the user and instead
>>>>> warn the user that it did not honor the incorrect kernel option.
>>>>>
>>>>> I am only presuming what your patch would do on my box based
>>>>> on what I learned about this problem from my debugging. I can
>>>>> also test your patch on my box to verify that my understanding of
>>>>> it is correct.
>>>>>
>>>>> I also have not yet verified Juergen's patch will not fix it, but
>>>>> I am almost certain it will not unless it is expanded so it also
>>>>> touches i915_gem_object_pin_map() with the fix. I plan to test
>>>>> his patch, but expanded so it touches that function also.
>>>>>
>>>>> I also plan to test your patch with and without nopat and report the
>>>>> results in the thread where you posted your patch. Hopefully
>>>>> by tomorrow I will have the results.
>>>>>
>>>>> Chuck
> 
> 

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH 2/2] x86/pat: add functions to query specific cache mode availability
  2022-05-25  7:45                             ` Thorsten Leemhuis
  (?)
@ 2022-05-25  8:04                               ` Juergen Gross
  -1 siblings, 0 replies; 80+ messages in thread
From: Juergen Gross @ 2022-05-25  8:04 UTC (permalink / raw)
  To: Thorsten Leemhuis, Chuck Zmudzinski, Jan Beulich, regressions, stable
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	H. Peter Anvin, Andy Lutomirski, Peter Zijlstra, Jani Nikula,
	Joonas Lahtinen, Rodrigo Vivi, Tvrtko Ursulin, David Airlie,
	Daniel Vetter, xen-devel, x86, linux-kernel, intel-gfx,
	dri-devel


[-- Attachment #1.1.1: Type: text/plain, Size: 10268 bytes --]

On 25.05.22 09:45, Thorsten Leemhuis wrote:
> 
> 
> On 24.05.22 20:32, Chuck Zmudzinski wrote:
>> On 5/21/22 6:47 AM, Thorsten Leemhuis wrote:
>>> On 20.05.22 16:48, Chuck Zmudzinski wrote:
>>>> On 5/20/2022 10:06 AM, Jan Beulich wrote:
>>>>> On 20.05.2022 15:33, Chuck Zmudzinski wrote:
>>>>>> On 5/20/2022 5:41 AM, Jan Beulich wrote:
>>>>>>> On 20.05.2022 10:30, Chuck Zmudzinski wrote:
>>>>>>>> On 5/20/2022 2:59 AM, Chuck Zmudzinski wrote:
>>>>>>>>> On 5/20/2022 2:05 AM, Jan Beulich wrote:
>>>>>>>>>> On 20.05.2022 06:43, Chuck Zmudzinski wrote:
>>>>>>>>>>> On 5/4/22 5:14 AM, Juergen Gross wrote:
>>>>>>>>>>>> On 04.05.22 10:31, Jan Beulich wrote:
>>>>>>>>>>>>> On 03.05.2022 15:22, Juergen Gross wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> ... these uses there are several more. You say nothing on
>>>>>>>>>>>>> why those want leaving unaltered. When preparing my earlier
>>>>>>>>>>>>> patch I did inspect them and came to the conclusion that
>>>>>>>>>>>>> these all would also better observe the adjusted behavior
>>>>>>>>>>>>> (or else I couldn't have left pat_enabled() as the only
>>>>>>>>>>>>> predicate). In fact, as said in the description of my
>>>>>>>>>>>>> earlier patch, in my debugging I did find the use in
>>>>>>>>>>>>> i915_gem_object_pin_map() to be the problematic one, which
>>>>>>>>>>>>> you leave alone.
>>>>>>>>>>>> Oh, I missed that one, sorry.
>>>>>>>>>>> That is why your patch would not fix my Haswell unless
>>>>>>>>>>> it also touches i915_gem_object_pin_map() in
>>>>>>>>>>> drivers/gpu/drm/i915/gem/i915_gem_pages.c
>>>>>>>>>>>
>>>>>>>>>>>> I wanted to be rather defensive in my changes, but I agree at
>>>>>>>>>>>> least the case in arch_phys_wc_add() might want to be changed,
>>>>>>>>>>>> too.
>>>>>>>>>>> I think your approach needs to be more aggressive so it will fix
>>>>>>>>>>> all the known false negatives introduced by bdd8b6c98239
>>>>>>>>>>> such as the one in i915_gem_object_pin_map().
>>>>>>>>>>>
>>>>>>>>>>> I looked at Jan's approach and I think it would fix the issue
>>>>>>>>>>> with my Haswell as long as I don't use the nopat option. I
>>>>>>>>>>> really don't have a strong opinion on that question, but I
>>>>>>>>>>> think the nopat option as a Linux kernel option, as opposed
>>>>>>>>>>> to a hypervisor option, should only affect the kernel, and
>>>>>>>>>>> if the hypervisor provides the pat feature, then the kernel
>>>>>>>>>>> should not override that,
>>>>>>>>>> Hmm, why would the kernel not be allowed to override that? Such
>>>>>>>>>> an override would affect only the single domain where the
>>>>>>>>>> kernel runs; other domains could take their own decisions.
>>>>>>>>>>
>>>>>>>>>> Also, for the sake of completeness: "nopat" used when running on
>>>>>>>>>> bare metal has the same bad effect on system boot, so there
>>>>>>>>>> pretty clearly is an error cleanup issue in the i915 driver. But
>>>>>>>>>> that's orthogonal, and I expect the maintainers may not even care
>>>>>>>>>> (but tell us "don't do that then").
>>>>>>>> Actually I just did a test with the last official Debian kernel
>>>>>>>> build of Linux 5.16, that is, a kernel before bdd8b6c98239 was
>>>>>>>> applied. In fact, the nopat option does *not* break the i915 driver
>>>>>>>> in 5.16. That is, with the nopat option, the i915 driver loads
>>>>>>>> normally on both the bare metal and on the Xen hypervisor.
>>>>>>>> That means your presumption (and the presumption of
>>>>>>>> the author of bdd8b6c98239) that the "nopat" option was
>>>>>>>> being observed by the i915 driver is incorrect. Setting "nopat"
>>>>>>>> had no effect on my system with Linux 5.16. So after doing these
>>>>>>>> tests, I am against the aggressive approach of breaking the i915
>>>>>>>> driver with the "nopat" option because prior to bdd8b6c98239,
>>>>>>>> nopat did not break the i915 driver. Why break it now?
>>>>>>> Because that, in my understanding, is the purpose of "nopat"
>>>>>>> (not breaking the driver of course - that's a driver bug -, but
>>>>>>> having an effect on the driver).
>>>>>> I wouldn't call it a driver bug, but an incorrect configuration of the
>>>>>> kernel by the user.  I presume X86_FEATURE_PAT is required by the
>>>>>> i915 driver
>>>>> The driver ought to work fine without PAT (and hence without being
>>>>> able to make WC mappings). It would use UC instead and be slow, but
>>>>> it ought to work.
>>>>>
>>>>>> and therefore the driver should refuse to disable
>>>>>> it if the user requests to disable it and instead warn the user that
>>>>>> the driver did not disable the feature, contrary to what the user
>>>>>> requested with the nopat option.
>>>>>>
>>>>>> In any case, my test did not verify that when nopat is set in Linux
>>>>>> 5.16, the thread takes the same code path as when nopat is not set,
>>>>>> so I am not totally sure that the reason nopat does not break the
>>>>>> i915 driver in 5.16 is that static_cpu_has(X86_FEATURE_PAT)
>>>>>> returns true even when nopat is set. I could test it with a custom
>>>>>> log message in 5.16 if that is necessary.
>>>>>>
>>>>>> Are you saying it was wrong for static_cpu_has(X86_FEATURE_PAT)
>>>>>> to return true in 5.16 when the user requests nopat?
>>>>> No, I'm not saying that. It was wrong for this construct to be used
>>>>> in the driver, which was fixed for 5.17 (and which had caused the
>>>>> regression I did observe, leading to the patch as a hopefully least
>>>>> bad option).
>>>>>
>>>>>> I think that is
>>>>>> just permitting a bad configuration to break the driver that a
>>>>>> well-written operating system should not allow. The i915 driver
>>>>>> was, in my opinion, correctly ignoring the nopat option in 5.16
>>>>>> because that option is not compatible with the hardware the
>>>>>> i915 driver is trying to initialize and setup at boot time. At least
>>>>>> that is my understanding now, but I will need to test it on 5.16
>>>>>> to be sure I understand it correctly.
>>>>>>
>>>>>> Also, AFAICT, your patch would break the driver when the nopat
>>>>>> option is set and only fix the regression introduced by bdd8b6c98239
>>>>>> when nopat is not set on my box, so your patch would
>>>>>> introduce a regression relative to Linux 5.16 and earlier for the
>>>>>> case when nopat is set on my box. I think your point would
>>>>>> be that it is not a regression if it is an incorrect user
>>>>>> configuration.
>>>>> Again no - my view is that there's a separate, pre-existing issue
>>>>> in the driver which was uncovered by the change. This may be a
>>>>> perceived regression, but is imo different from a real one.
>>> Sorry, for you maybe, but I'm pretty sure for Linus it's not when it
>>> comes to the "no regressions rule". Just took a quick look at quotes
>>> from Linus
>>> https://www.kernel.org/doc/html/latest/process/handling-regressions.html
>>> and found this statement from Linus to back this up:
>>>
>>> ```
>>> One _particularly_ last-minute revert is the top-most commit (ignoring
>>> the version change itself) done just before the release, and while
>>> it's very annoying, it's perhaps also instructive.
>>>
>>> What's instructive about it is that I reverted a commit that wasn't
>>> actually buggy. In fact, it was doing exactly what it set out to do,
>>> and did it very well. In fact it did it _so_ well that the much
>>> improved IO patterns it caused then ended up revealing a user-visible
>>> regression due to a real bug in a completely unrelated area.
>>> ```
>>>
>>> He said that here:
>>> https://www.kernel.org/doc/html/latest/process/handling-regressions.html
>>>
>>> The situation is of course different here, but similar enough.
>>>
>>>> Since it is a regression, I think for now bdd8b6c98239 should
>>>> be reverted and the fix backported to Linux 5.17 stable until
>>>> the underlying memory subsystem can provide the i915 driver
>>>> with an updated test for the PAT feature that also meets the
>>>> requirements of the author of bdd8b6c98239 without breaking
>>>> the i915 driver.
>>> I'm not a developer and I don't know the details of this thread and
>>> the backstory of the regression, but it sounds like that's the approach
>>> that is needed here until someone comes up with a fix for the regression
>>> exposed by bdd8b6c98239.
>>>
>>> But if I'm wrong, please tell me.
>>
>> You are mostly right, I think. Reverting bdd8b6c98239 fixes
>> it. There is another way to fix it, though.
> 
> Yeah, I'm aware of it. But it seems...
> 
>> The patch proposed
>> by Jan Beulich also fixes the regression on my system, so as
>> the person reporting this as a regression, I would also be satisfied
>> with Jan's patch instead of reverting bdd8b6c98239 as a fix. Jan
>> posted his proposed patch here:
>>
>> https://lore.kernel.org/lkml/9385fa60-fa5d-f559-a137-6608408f88b0@suse.com/
> 
> ...that approach is not making any progress either?
> 
> Jan, could you provide a short status update here? I'd really like to
> get this regression fixed one way or another sooner rather than later,
> as this has taken way too long already IMHO.
> 
>> The only reservation I have about Jan's patch is that the commit
>> message does not clearly explain how the patch changes what
>> the nopat kernel boot option does. It doesn't affect me because
>> I don't use nopat, but it should probably be mentioned in the
>> commit message, as pointed out here:
>>
>> https://lore.kernel.org/lkml/bd9ed2c2-1337-27bb-c9da-dfc7b31d492c@netscape.net/
>>
>>
>> Whatever fix for the regression exposed by bdd8b6c98239 also
>> needs to be backported to the stable versions 5.17 and 5.18.
> 
> Sure.
> 
> BTW, as you seem to be familiar with the issue: there is another report
> about a regression WRT Xen and i915 (that is also not really making
> progress):
> https://lore.kernel.org/lkml/Yn%2FTgj1Ehs%2FBdpHp@itl-email/
> 
> It's just a wild guess, but could this somehow be related?

No, doesn't seem so.

I'm just reviewing the suggested fix:

https://lore.kernel.org/lkml/Yo0LwmVUDSBZb44K@itl-email/


Juergen

[-- Attachment #1.1.2: OpenPGP public key --]
[-- Type: application/pgp-keys, Size: 3149 bytes --]

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 495 bytes --]

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH 2/2] x86/pat: add functions to query specific cache mode availability
  2022-05-25  7:45                             ` Thorsten Leemhuis
@ 2022-05-25  8:37                               ` Jan Beulich
  -1 siblings, 0 replies; 80+ messages in thread
From: Jan Beulich @ 2022-05-25  8:37 UTC (permalink / raw)
  To: Thorsten Leemhuis
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	H. Peter Anvin, Andy Lutomirski, Peter Zijlstra, Jani Nikula,
	Joonas Lahtinen, Rodrigo Vivi, Tvrtko Ursulin, David Airlie,
	Daniel Vetter, xen-devel, x86, linux-kernel, intel-gfx,
	dri-devel, Juergen Gross, Chuck Zmudzinski, regressions, stable

On 25.05.2022 09:45, Thorsten Leemhuis wrote:
> On 24.05.22 20:32, Chuck Zmudzinski wrote:
>> On 5/21/22 6:47 AM, Thorsten Leemhuis wrote:
>>> I'm not a developer and I don't know the details of this thread and
>>> the backstory of the regression, but it sounds like that's the approach
>>> that is needed here until someone comes up with a fix for the regression
>>> exposed by bdd8b6c98239.
>>>
>>> But if I'm wrong, please tell me.
>>
>> You are mostly right, I think. Reverting bdd8b6c98239 fixes
>> it. There is another way to fix it, though.
> 
> Yeah, I'm aware of it. But it seems...
> 
>> The patch proposed
>> by Jan Beulich also fixes the regression on my system, so as
>> the person reporting this is a regression, I would also be satisfied
>> with Jan's patch instead of reverting bdd8b6c98239 as a fix. Jan
>> posted his proposed patch here:
>>
>> https://lore.kernel.org/lkml/9385fa60-fa5d-f559-a137-6608408f88b0@suse.com/
> 
> ...that approach is not making any progress either?
> 
> Jan, could you provide a short status update here? I'd really like to
> get this regression fixed one way or another sooner rather than later,
> as this has taken way too long already IMHO.

What kind of status update could I provide? I've not heard back from
anyone of the maintainers, so I have no way to know what (if anything)
I need to do.

Jan


^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH 2/2] x86/pat: add functions to query specific cache mode availability
  2022-05-25  8:37                               ` Jan Beulich
  (?)
@ 2022-05-25  8:51                                 ` Thorsten Leemhuis
  -1 siblings, 0 replies; 80+ messages in thread
From: Thorsten Leemhuis @ 2022-05-25  8:51 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	H. Peter Anvin, Andy Lutomirski, Peter Zijlstra, Jani Nikula,
	Joonas Lahtinen, Rodrigo Vivi, Tvrtko Ursulin, David Airlie,
	Daniel Vetter, xen-devel, x86, linux-kernel, intel-gfx,
	dri-devel, Juergen Gross, Chuck Zmudzinski, regressions, stable

On 25.05.22 10:37, Jan Beulich wrote:
> On 25.05.2022 09:45, Thorsten Leemhuis wrote:
>> On 24.05.22 20:32, Chuck Zmudzinski wrote:
>>> On 5/21/22 6:47 AM, Thorsten Leemhuis wrote:
>>>> I'm not a developer and I don't know the details of this thread and
>>>> the backstory of the regression, but it sounds like that's the approach
>>>> that is needed here until someone comes up with a fix for the regression
>>>> exposed by bdd8b6c98239.
>>>>
>>>> But if I'm wrong, please tell me.
>>>
>>> You are mostly right, I think. Reverting bdd8b6c98239 fixes
>>> it. There is another way to fix it, though.
>>
>> Yeah, I'm aware of it. But it seems...
>>
>>> The patch proposed
>>> by Jan Beulich also fixes the regression on my system, so as
>>> the person reporting this is a regression, I would also be satisfied
>>> with Jan's patch instead of reverting bdd8b6c98239 as a fix. Jan
>>> posted his proposed patch here:
>>>
>>> https://lore.kernel.org/lkml/9385fa60-fa5d-f559-a137-6608408f88b0@suse.com/
>>
>> ...that approach is not making any progress either?
>>
>> Jan, could you provide a short status update here? I'd really like to
>> get this regression fixed one way or another sooner rather than later,
>> as this has taken way too long already IMHO.
> 
> What kind of status update could I provide? I've not heard back from
> anyone of the maintainers, so I have no way to know what (if anything)
> I need to do.

That is perfectly fine as a status update for me (I track a lot of
regressions and it's easy to miss updated patches, discussions in other
places, and things like that).

Could you maybe send a reminder to the maintainer that this is a fix for
a regression that is bothering people and needs to be handled with high
priority? Feel free to tell them the Linux kernel regression tracker is
pestering you because things are taking so long. :-D

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)

P.S.: As the Linux kernel's regression tracker I deal with a lot of
reports and sometimes miss something important when writing mails like
this. If that's the case here, don't hesitate to tell me in a public
reply, it's in everyone's interest to set the public record straight.

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH 2/2] x86/pat: add functions to query specific cache mode availability
  2022-05-25  7:45                             ` Thorsten Leemhuis
  (?)
@ 2022-05-25 19:25                               ` Chuck Zmudzinski
  -1 siblings, 0 replies; 80+ messages in thread
From: Chuck Zmudzinski @ 2022-05-25 19:25 UTC (permalink / raw)
  To: Thorsten Leemhuis, Jan Beulich, regressions, stable
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	H. Peter Anvin, Andy Lutomirski, Peter Zijlstra, Jani Nikula,
	Joonas Lahtinen, Rodrigo Vivi, Tvrtko Ursulin, David Airlie,
	Daniel Vetter, xen-devel, x86, linux-kernel, intel-gfx,
	dri-devel, Juergen Gross

On 5/25/2022 3:45 AM, Thorsten Leemhuis wrote:
> On 24.05.22 20:32, Chuck Zmudzinski wrote:
>> On 5/21/22 6:47 AM, Thorsten Leemhuis wrote:
>>> On 20.05.22 16:48, Chuck Zmudzinski wrote:
>>>> On 5/20/2022 10:06 AM, Jan Beulich wrote:
>>>>> On 20.05.2022 15:33, Chuck Zmudzinski wrote:
>>>>>> On 5/20/2022 5:41 AM, Jan Beulich wrote:
>>>>>>> On 20.05.2022 10:30, Chuck Zmudzinski wrote:
>>>>>>>> On 5/20/2022 2:59 AM, Chuck Zmudzinski wrote:
>>>>>>>>> On 5/20/2022 2:05 AM, Jan Beulich wrote:
>>>>>>>>>> On 20.05.2022 06:43, Chuck Zmudzinski wrote:
>>>>>>>>>>> On 5/4/22 5:14 AM, Juergen Gross wrote:
>>>>>>>>>>>> On 04.05.22 10:31, Jan Beulich wrote:
>>>>>>>>>>>>> On 03.05.2022 15:22, Juergen Gross wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> ... these uses there are several more. You say nothing on why
>>>>>>>>>>>>> those want leaving unaltered. When preparing my earlier patch I
>>>>>>>>>>>>> did inspect them and came to the conclusion that these all would
>>>>>>>>>>>>> also better observe the adjusted behavior (or else I couldn't
>>>>>>>>>>>>> have left pat_enabled() as the only predicate). In fact, as said
>>>>>>>>>>>>> in the description of my earlier patch, in my debugging I did
>>>>>>>>>>>>> find the use in i915_gem_object_pin_map() to be the problematic
>>>>>>>>>>>>> one, which you leave alone.
>>>>>>>>>>>> Oh, I missed that one, sorry.
>>>>>>>>>>> That is why your patch would not fix my Haswell unless
>>>>>>>>>>> it also touches i915_gem_object_pin_map() in
>>>>>>>>>>> drivers/gpu/drm/i915/gem/i915_gem_pages.c
>>>>>>>>>>>
>>>>>>>>>>>> I wanted to be rather defensive in my changes, but I agree at
>>>>>>>>>>>> least the case in arch_phys_wc_add() might want to be changed, too.
>>>>>>>>>>> I think your approach needs to be more aggressive so it will fix
>>>>>>>>>>> all the known false negatives introduced by bdd8b6c98239
>>>>>>>>>>> such as the one in i915_gem_object_pin_map().
>>>>>>>>>>>
>>>>>>>>>>> I looked at Jan's approach and I think it would fix the issue
>>>>>>>>>>> with my Haswell as long as I don't use the nopat option. I
>>>>>>>>>>> really don't have a strong opinion on that question, but I
>>>>>>>>>>> think the nopat option as a Linux kernel option, as opposed
>>>>>>>>>>> to a hypervisor option, should only affect the kernel, and
>>>>>>>>>>> if the hypervisor provides the pat feature, then the kernel
>>>>>>>>>>> should not override that,
>>>>>>>>>> Hmm, why would the kernel not be allowed to override that? Such
>>>>>>>>>> an override would affect only the single domain where the
>>>>>>>>>> kernel runs; other domains could take their own decisions.
>>>>>>>>>>
>>>>>>>>>> Also, for the sake of completeness: "nopat" used when running on
>>>>>>>>>> bare metal has the same bad effect on system boot, so there
>>>>>>>>>> pretty clearly is an error cleanup issue in the i915 driver. But
>>>>>>>>>> that's orthogonal, and I expect the maintainers may not even care
>>>>>>>>>> (but tell us "don't do that then").
>>>>>>>> Actually I just did a test with the last official Debian kernel
>>>>>>>> build of Linux 5.16, that is, a kernel before bdd8b6c98239 was
>>>>>>>> applied. In fact, the nopat option does *not* break the i915 driver
>>>>>>>> in 5.16. That is, with the nopat option, the i915 driver loads
>>>>>>>> normally on both the bare metal and on the Xen hypervisor.
>>>>>>>> That means your presumption (and the presumption of
>>>>>>>> the author of bdd8b6c98239) that the "nopat" option was
>>>>>>>> being observed by the i915 driver is incorrect. Setting "nopat"
>>>>>>>> had no effect on my system with Linux 5.16. So after doing these
>>>>>>>> tests, I am against the aggressive approach of breaking the i915
>>>>>>>> driver with the "nopat" option because prior to bdd8b6c98239,
>>>>>>>> nopat did not break the i915 driver. Why break it now?
>>>>>>> Because that, in my understanding, is the purpose of "nopat"
>>>>>>> (not breaking the driver of course - that's a driver bug -, but
>>>>>>> having an effect on the driver).
>>>>>> I wouldn't call it a driver bug, but an incorrect configuration of the
>>>>>> kernel by the user.  I presume X86_FEATURE_PAT is required by the
>>>>>> i915 driver
>>>>> The driver ought to work fine without PAT (and hence without being
>>>>> able to make WC mappings). It would use UC instead and be slow, but
>>>>> it ought to work.
>>>>>
>>>>>> and therefore the driver should refuse to disable
>>>>>> it if the user requests to disable it and instead warn the user that
>>>>>> the driver did not disable the feature, contrary to what the user
>>>>>> requested with the nopat option.
>>>>>>
>>>>>> In any case, my test did not verify that when nopat is set in Linux
>>>>>> 5.16, the thread takes the same code path as when nopat is not set,
>>>>>> so I am not totally sure that the reason nopat does not break the
>>>>>> i915 driver in 5.16 is that static_cpu_has(X86_FEATURE_PAT)
>>>>>> returns true even when nopat is set. I could test it with a custom
>>>>>> log message in 5.16 if that is necessary.
>>>>>>
>>>>>> Are you saying it was wrong for static_cpu_has(X86_FEATURE_PAT)
>>>>>> to return true in 5.16 when the user requests nopat?
>>>>> No, I'm not saying that. It was wrong for this construct to be used
>>>>> in the driver, which was fixed for 5.17 (and which had caused the
>>>>> regression I did observe, leading to the patch as a hopefully least
>>>>> bad option).
>>>>>
>>>>>> I think that is
>>>>>> just permitting a bad configuration to break the driver that a
>>>>>> well-written operating system should not allow. The i915 driver
>>>>>> was, in my opinion, correctly ignoring the nopat option in 5.16
>>>>>> because that option is not compatible with the hardware the
>>>>>> i915 driver is trying to initialize and set up at boot time. At least
>>>>>> that is my understanding now, but I will need to test it on 5.16
>>>>>> to be sure I understand it correctly.
>>>>>>
>>>>>> Also, AFAICT, your patch would break the driver when the nopat
>>>>>> option is set and only fix the regression introduced by bdd8b6c98239
>>>>>> when nopat is not set on my box, so your patch would
>>>>>> introduce a regression relative to Linux 5.16 and earlier for the
>>>>>> case when nopat is set on my box. I think your point would
>>>>>> be that it is not a regression if it is an incorrect user
>>>>>> configuration.
>>>>> Again no - my view is that there's a separate, pre-existing issue
>>>>> in the driver which was uncovered by the change. This may be a
>>>>> perceived regression, but is imo different from a real one.
>>> Sorry, for you maybe, but I'm pretty sure for Linus it's not when it
>>> comes to the "no regressions rule". Just took a quick look at quotes
>>> from Linus
>>> https://www.kernel.org/doc/html/latest/process/handling-regressions.html
>>> and found this statement from Linus to back this up:
>>>
>>> ```
>>> One _particularly_ last-minute revert is the top-most commit (ignoring
>>> the version change itself) done just before the release, and while
>>> it's very annoying, it's perhaps also instructive.
>>>
>>> What's instructive about it is that I reverted a commit that wasn't
>>> actually buggy. In fact, it was doing exactly what it set out to do,
>>> and did it very well. In fact it did it _so_ well that the much
>>> improved IO patterns it caused then ended up revealing a user-visible
>>> regression due to a real bug in a completely unrelated area.
>>> ```
>>>
>>> He said that here:
>>> https://www.kernel.org/doc/html/latest/process/handling-regressions.html
>>>
>>> The situation is of course different here, but similar enough.
>>>
>>>> Since it is a regression, I think for now bdd8b6c98239 should
>>>> be reverted and the fix backported to Linux 5.17 stable until
>>>> the underlying memory subsystem can provide the i915 driver
>>>> with an updated test for the PAT feature that also meets the
>>>> requirements of the author of bdd8b6c98239 without breaking
>>>> the i915 driver.
>>> I'm not a developer and I don't know the details of this thread and
>>> the backstory of the regression, but it sounds like that's the approach
>>> that is needed here until someone comes up with a fix for the regression
>>> exposed by bdd8b6c98239.
>>>
>>> But if I'm wrong, please tell me.
>> You are mostly right, I think. Reverting bdd8b6c98239 fixes
>> it. There is another way to fix it, though.
> Yeah, I'm aware of it. But it seems...
>
>> The patch proposed
>> by Jan Beulich also fixes the regression on my system, so as
>> the person reporting this is a regression, I would also be satisfied
>> with Jan's patch instead of reverting bdd8b6c98239 as a fix. Jan
>> posted his proposed patch here:
>>
>> https://lore.kernel.org/lkml/9385fa60-fa5d-f559-a137-6608408f88b0@suse.com/
> ...that approach is not making any progress either?

Jan's approach does fix it on my system. There was some debate
about what the kernel nopat option should do, though. I don't
have a strong opinion on that and would accept Jan's patch
as a fix.
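
To make that debate concrete, here is a rough userspace model in C of the
three candidate checks. It is only a sketch under one reading of this thread:
the struct, field and function names are invented for illustration and are
not kernel APIs, and which kernel version each check corresponds to is an
assumption, not something taken from the patches.

```c
#include <stdbool.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */

enum cache_mode { MODE_UC, MODE_WC };

struct boot_state {
	bool cpu_has_pat;    /* CPU (or hypervisor) advertises the PAT feature */
	bool kernel_set_pat; /* the kernel itself programmed the PAT MSR */
	bool nopat;          /* "nopat" was given on the kernel command line */
};

/* Plain feature-bit test (assumed 5.16-like behaviour): ignores nopat. */
static bool pat_by_feature_bit(const struct boot_state *s)
{
	return s->cpu_has_pat;
}

/*
 * Test based only on the kernel's own PAT setup: honours nopat, but yields
 * a false negative when PAT was set up by the hypervisor instead.
 */
static bool pat_by_kernel_setup(const struct boot_state *s)
{
	return s->kernel_set_pat && !s->nopat;
}

/* Behaviour debated here: trust an available PAT unless nopat was given. */
static bool pat_unless_nopat(const struct boot_state *s)
{
	return s->cpu_has_pat && !s->nopat;
}

/* A driver that copes without PAT falls back to UC (slow, but working). */
static enum cache_mode pick_mode(bool pat_usable)
{
	return pat_usable ? MODE_WC : MODE_UC;
}
```

In this model a Xen-like state (cpu_has_pat set, kernel_set_pat clear) gets WC
from the first and third check but only UC from the second. Adding nopat flips
the third check to UC as well, which is the behaviour change under discussion.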

>
> Jan, could you provide a short status update here? I'd really like to
> get this regression fixed one way or another sooner rather than later,
> as this has taken way too long already IMHO.

I hope something is done soon also.

>
>> The only reservation I have about Jan's patch is that the commit
>> message does not clearly explain how the patch changes what
>> the nopat kernel boot option does. It doesn't affect me because
>> I don't use nopat, but it should probably be mentioned in the
>> commit message, as pointed out here:
>>
>> https://lore.kernel.org/lkml/bd9ed2c2-1337-27bb-c9da-dfc7b31d492c@netscape.net/
>>
>>
>> Whatever fix for the regression exposed by bdd8b6c98239 also
>> needs to be backported to the stable versions 5.17 and 5.18.
> Sure.
>
> BTW, as you seem to be familiar with the issue: there is another report
> about a regression WRT Xen and i915 (that is also not really making
> progress):
> https://lore.kernel.org/lkml/Yn%2FTgj1Ehs%2FBdpHp@itl-email/
>
> It's just a wild guess, but could this somehow be related?

It could be, but I do not run a GUI in my Xen Dom0, so I have not
seen that issue.

Best regards,

Chuck

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH 2/2] x86/pat: add functions to query specific cache mode availability
@ 2022-05-25 19:25                               ` Chuck Zmudzinski
  0 siblings, 0 replies; 80+ messages in thread
From: Chuck Zmudzinski @ 2022-05-25 19:25 UTC (permalink / raw)
  To: Thorsten Leemhuis, Jan Beulich, regressions, stable
  Cc: Juergen Gross, Tvrtko Ursulin, Peter Zijlstra, intel-gfx,
	Dave Hansen, x86, linux-kernel, David Airlie, Rodrigo Vivi,
	Ingo Molnar, Borislav Petkov, dri-devel, Andy Lutomirski,
	H. Peter Anvin, xen-devel, Thomas Gleixner

On 5/25/2022 3:45 AM, Thorsten Leemhuis wrote:
> On 24.05.22 20:32, Chuck Zmudzinski wrote:
>> On 5/21/22 6:47 AM, Thorsten Leemhuis wrote:
>>> On 20.05.22 16:48, Chuck Zmudzinski wrote:
>>>> On 5/20/2022 10:06 AM, Jan Beulich wrote:
>>>>> On 20.05.2022 15:33, Chuck Zmudzinski wrote:
>>>>>> On 5/20/2022 5:41 AM, Jan Beulich wrote:
>>>>>>> On 20.05.2022 10:30, Chuck Zmudzinski wrote:
>>>>>>>> On 5/20/2022 2:59 AM, Chuck Zmudzinski wrote:
>>>>>>>>> On 5/20/2022 2:05 AM, Jan Beulich wrote:
>>>>>>>>>> On 20.05.2022 06:43, Chuck Zmudzinski wrote:
>>>>>>>>>>> On 5/4/22 5:14 AM, Juergen Gross wrote:
>>>>>>>>>>>> On 04.05.22 10:31, Jan Beulich wrote:
>>>>>>>>>>>>> On 03.05.2022 15:22, Juergen Gross wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> ... these uses there are several more. You say nothing on why
>>>>>>>>>>>>> those want leaving unaltered. When preparing my earlier patch I
>>>>>>>>>>>>> did inspect them and came to the conclusion that these all would
>>>>>>>>>>>>> also better observe the adjusted behavior (or else I couldn't
>>>>>>>>>>>>> have left pat_enabled() as the only predicate). In fact, as said
>>>>>>>>>>>>> in the description of my earlier patch, in my debugging I did
>>>>>>>>>>>>> find the use in i915_gem_object_pin_map() to be the problematic
>>>>>>>>>>>>> one, which you leave alone.
>>>>>>>>>>>> Oh, I missed that one, sorry.
>>>>>>>>>>> That is why your patch would not fix my Haswell unless
>>>>>>>>>>> it also touches i915_gem_object_pin_map() in
>>>>>>>>>>> drivers/gpu/drm/i915/gem/i915_gem_pages.c
>>>>>>>>>>>
>>>>>>>>>>>> I wanted to be rather defensive in my changes, but I agree at
>>>>>>>>>>>> least the case in arch_phys_wc_add() might want to be changed, too.
>>>>>>>>>>> I think your approach needs to be more aggressive so it will fix
>>>>>>>>>>> all the known false negatives introduced by bdd8b6c98239
>>>>>>>>>>> such as the one in i915_gem_object_pin_map().
>>>>>>>>>>>
>>>>>>>>>>> I looked at Jan's approach and I think it would fix the issue
>>>>>>>>>>> with my Haswell as long as I don't use the nopat option. I
>>>>>>>>>>> really don't have a strong opinion on that question, but I
>>>>>>>>>>> think the nopat option as a Linux kernel option, as opposed
>>>>>>>>>>> to a hypervisor option, should only affect the kernel, and
>>>>>>>>>>> if the hypervisor provides the pat feature, then the kernel
>>>>>>>>>>> should not override that,
>>>>>>>>>> Hmm, why would the kernel not be allowed to override that? Such
>>>>>>>>>> an override would affect only the single domain where the
>>>>>>>>>> kernel runs; other domains could take their own decisions.
>>>>>>>>>>
>>>>>>>>>> Also, for the sake of completeness: "nopat" used when running on
>>>>>>>>>> bare metal has the same bad effect on system boot, so there
>>>>>>>>>> pretty clearly is an error cleanup issue in the i915 driver. But
>>>>>>>>>> that's orthogonal, and I expect the maintainers may not even care
>>>>>>>>>> (but tell us "don't do that then").
>>>>>>>> Actually I just did a test with the last official Debian kernel
>>>>>>>> build of Linux 5.16, that is, a kernel before bdd8b6c98239 was
>>>>>>>> applied. In fact, the nopat option does *not* break the i915 driver
>>>>>>>> in 5.16. That is, with the nopat option, the i915 driver loads
>>>>>>>> normally on both the bare metal and on the Xen hypervisor.
>>>>>>>> That means your presumption (and the presumption of
>>>>>>>> the author of bdd8b6c98239) that the "nopat" option was
>>>>>>>> being observed by the i915 driver is incorrect. Setting "nopat"
>>>>>>>> had no effect on my system with Linux 5.16. So after doing these
>>>>>>>> tests, I am against the aggressive approach of breaking the i915
>>>>>>>> driver with the "nopat" option because prior to bdd8b6c98239,
>>>>>>>> nopat did not break the i915 driver. Why break it now?
>>>>>>> Because that, in my understanding, is the purpose of "nopat"
>>>>>>> (not breaking the driver of course - that's a driver bug -, but
>>>>>>> having an effect on the driver).
>>>>>> I wouldn't call it a driver bug, but an incorrect configuration of the
>>>>>> kernel by the user.  I presume X86_FEATURE_PAT is required by the
>>>>>> i915 driver
>>>>> The driver ought to work fine without PAT (and hence without being
>>>>> able to make WC mappings). It would use UC instead and be slow, but
>>>>> it ought to work.
>>>>>
>>>>>> and therefore the driver should refuse to disable
>>>>>> it if the user requests to disable it and instead warn the user that
>>>>>> the driver did not disable the feature, contrary to what the user
>>>>>> requested with the nopat option.
>>>>>>
>>>>>> In any case, my test did not verify that when nopat is set in Linux
>>>>>> 5.16, the thread takes the same code path as when nopat is not set,
>>>>>> so I am not totally sure that the reason nopat does not break the
>>>>>> i915 driver in 5.16 is that static_cpu_has(X86_FEATURE_PAT)
>>>>>> returns true even when nopat is set. I could test it with a custom
>>>>>> log message in 5.16 if that is necessary.
>>>>>>
>>>>>> Are you saying it was wrong for static_cpu_has(X86_FEATURE_PAT)
>>>>>> to return true in 5.16 when the user requests nopat?
>>>>> No, I'm not saying that. It was wrong for this construct to be used
>>>>> in the driver, which was fixed for 5.17 (and which had caused the
>>>>> regression I did observe, leading to the patch as a hopefully least
>>>>> bad option).
>>>>>
>>>>>> I think that is
>>>>>> just permitting a bad configuration to break the driver that a
>>>>>> well-written operating system should not allow. The i915 driver
>>>>>> was, in my opinion, correctly ignoring the nopat option in 5.16
>>>>>> because that option is not compatible with the hardware the
>>>>>> i915 driver is trying to initialize and set up at boot time. At least
>>>>>> that is my understanding now, but I will need to test it on 5.16
>>>>>> to be sure I understand it correctly.
>>>>>>
>>>>>> Also, AFAICT, your patch would break the driver when the nopat
>>>>>> option is set and only fix the regression introduced by bdd8b6c98239
>>>>>> when nopat is not set on my box, so your patch would
>>>>>> introduce a regression relative to Linux 5.16 and earlier for the
>>>>>> case when nopat is set on my box. I think your point would
>>>>>> be that it is not a regression if it is an incorrect user
>>>>>> configuration.
>>>>> Again no - my view is that there's a separate, pre-existing issue
>>>>> in the driver which was uncovered by the change. This may be a
>>>>> perceived regression, but is imo different from a real one.
>>> Sorry, for you maybe, but I'm pretty sure for Linus it's not when it
>>> comes to the "no regressions rule". Just took a quick look at quotes
>>> from Linus
>>> https://www.kernel.org/doc/html/latest/process/handling-regressions.html
>>> and found this statement from Linus to back this up:
>>>
>>> ```
>>> One _particularly_ last-minute revert is the top-most commit (ignoring
>>> the version change itself) done just before the release, and while
>>> it's very annoying, it's perhaps also instructive.
>>>
>>> What's instructive about it is that I reverted a commit that wasn't
>>> actually buggy. In fact, it was doing exactly what it set out to do,
>>> and did it very well. In fact it did it _so_ well that the much
>>> improved IO patterns it caused then ended up revealing a user-visible
>>> regression due to a real bug in a completely unrelated area.
>>> ```
>>>
>>> He said that here:
>>> https://www.kernel.org/doc/html/latest/process/handling-regressions.html
>>>
>>> The situation is of course different here, but similar enough.
>>>
>>>> Since it is a regression, I think for now bdd8b6c98239 should
>>>> be reverted and the fix backported to Linux 5.17 stable until
>>>> the underlying memory subsystem can provide the i915 driver
>>>> with an updated test for the PAT feature that also meets the
>>>> requirements of the author of bdd8b6c98239 without breaking
>>>> the i915 driver.
>>> I'm not a developer and I don't know the details of this thread and
>>> the backstory of the regression, but it sounds like that's the approach
>>> that is needed here until someone comes up with a fix for the regression
>>> exposed by bdd8b6c98239.
>>>
>>> But if I'm wrong, please tell me.
>> You are mostly right, I think. Reverting bdd8b6c98239 fixes
>> it. There is another way to fix it, though.
> Yeah, I'm aware of it. But it seems...
>
>> The patch proposed
>> by Jan Beulich also fixes the regression on my system, so as
>> the person reporting this as a regression, I would also be satisfied
>> with Jan's patch instead of reverting bdd8b6c98239 as a fix. Jan
>> posted his proposed patch here:
>>
>> https://lore.kernel.org/lkml/9385fa60-fa5d-f559-a137-6608408f88b0@suse.com/
> ...that approach is not making any progress either?

Jan's approach does fix it on my system. There was some debate
about what the kernel nopat option should do, though. I don't
have a strong opinion on that and would accept Jan's patch
as a fix.

>
> Jan, could you provide a short status update here? I'd really like to
> get this regression fixed one way or another rather sooner than later,
> as this has taken way too long already IMHO.

I hope something is done soon also.

>
>> The only reservation I have about Jan's patch is that the commit
>> message does not clearly explain how the patch changes what
>> the nopat kernel boot option does. It doesn't affect me because
>> I don't use nopat, but it should probably be mentioned in the
>> commit message, as pointed out here:
>>
>> https://lore.kernel.org/lkml/bd9ed2c2-1337-27bb-c9da-dfc7b31d492c@netscape.net/
>>
>>
>> Whatever fix for the regression exposed by bdd8b6c98239 also
>> needs to be backported to the stable versions 5.17 and 5.18.
> Sure.
>
> BTW, as you seem to be familiar with the issue: there is another report
> about a regression WRT Xen and i915 (that is also not really making
> progress):
> https://lore.kernel.org/lkml/Yn%2FTgj1Ehs%2FBdpHp@itl-email/
>
> It's just a wild guess, but could this somehow be related?

It could be, but I do not run a GUI in my Xen Dom0, so I have not
seen that issue.

Best regards,

Chuck

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH 1/2] x86/pat: fix x86_has_pat_wp()
  2022-05-03 13:22 ` [PATCH 1/2] x86/pat: fix x86_has_pat_wp() Juergen Gross
@ 2022-05-27 10:21   ` Juergen Gross
  2022-06-14 15:09   ` Juergen Gross
  2022-06-20 10:26   ` Borislav Petkov
  2 siblings, 0 replies; 80+ messages in thread
From: Juergen Gross @ 2022-05-27 10:21 UTC (permalink / raw)
  To: xen-devel, x86, linux-kernel
  Cc: jbeulich, Dave Hansen, Andy Lutomirski, Peter Zijlstra,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin


Ping?

On 03.05.22 15:22, Juergen Gross wrote:
> x86_has_pat_wp() is using a wrong test, as it relies on the normal
> PAT configuration used by the kernel. In case the PAT MSR has been
> setup by another entity (e.g. BIOS or Xen hypervisor) it might return
> false even if the PAT configuration is allowing WP mappings.
> 
> Fixes: 1f6f655e01ad ("x86/mm: Add a x86_has_pat_wp() helper")
> Signed-off-by: Juergen Gross <jgross@suse.com>
> ---
>   arch/x86/mm/init.c | 3 ++-
>   1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
> index d8cfce221275..71e182ebced3 100644
> --- a/arch/x86/mm/init.c
> +++ b/arch/x86/mm/init.c
> @@ -80,7 +80,8 @@ static uint8_t __pte2cachemode_tbl[8] = {
>   /* Check that the write-protect PAT entry is set for write-protect */
>   bool x86_has_pat_wp(void)
>   {
> -	return __pte2cachemode_tbl[_PAGE_CACHE_MODE_WP] == _PAGE_CACHE_MODE_WP;
> +	return __pte2cachemode_tbl[__cachemode2pte_tbl[_PAGE_CACHE_MODE_WP]] ==
> +	       _PAGE_CACHE_MODE_WP;
>   }
>   
>   enum page_cache_mode pgprot2cachemode(pgprot_t pgprot)


^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH 1/2] x86/pat: fix x86_has_pat_wp()
  2022-05-03 13:22 ` [PATCH 1/2] x86/pat: fix x86_has_pat_wp() Juergen Gross
  2022-05-27 10:21   ` Juergen Gross
@ 2022-06-14 15:09   ` Juergen Gross
  2022-06-20  5:22     ` Thorsten Leemhuis
  2022-06-20 10:26   ` Borislav Petkov
  2 siblings, 1 reply; 80+ messages in thread
From: Juergen Gross @ 2022-06-14 15:09 UTC (permalink / raw)
  To: xen-devel, x86, linux-kernel, Thomas Gleixner, Ingo Molnar,
	Dave Hansen, Borislav Petkov
  Cc: jbeulich, Andy Lutomirski, Peter Zijlstra, H. Peter Anvin


On 03.05.22 15:22, Juergen Gross wrote:
> x86_has_pat_wp() is using a wrong test, as it relies on the normal
> PAT configuration used by the kernel. In case the PAT MSR has been
> setup by another entity (e.g. BIOS or Xen hypervisor) it might return
> false even if the PAT configuration is allowing WP mappings.
> 
> Fixes: 1f6f655e01ad ("x86/mm: Add a x86_has_pat_wp() helper")
> Signed-off-by: Juergen Gross <jgross@suse.com>
> ---
>   arch/x86/mm/init.c | 3 ++-
>   1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
> index d8cfce221275..71e182ebced3 100644
> --- a/arch/x86/mm/init.c
> +++ b/arch/x86/mm/init.c
> @@ -80,7 +80,8 @@ static uint8_t __pte2cachemode_tbl[8] = {
>   /* Check that the write-protect PAT entry is set for write-protect */
>   bool x86_has_pat_wp(void)
>   {
> -	return __pte2cachemode_tbl[_PAGE_CACHE_MODE_WP] == _PAGE_CACHE_MODE_WP;
> +	return __pte2cachemode_tbl[__cachemode2pte_tbl[_PAGE_CACHE_MODE_WP]] ==
> +	       _PAGE_CACHE_MODE_WP;
>   }
>   
>   enum page_cache_mode pgprot2cachemode(pgprot_t pgprot)

x86 maintainers, please consider taking this patch, as it is fixing
a real bug. Patch 2 of this series can be dropped IMO.


Juergen

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH 1/2] x86/pat: fix x86_has_pat_wp()
  2022-06-14 15:09   ` Juergen Gross
@ 2022-06-20  5:22     ` Thorsten Leemhuis
  2022-06-20  5:30       ` Juergen Gross
  0 siblings, 1 reply; 80+ messages in thread
From: Thorsten Leemhuis @ 2022-06-20  5:22 UTC (permalink / raw)
  To: Juergen Gross, xen-devel, x86, linux-kernel, Thomas Gleixner,
	Ingo Molnar, Dave Hansen, Borislav Petkov
  Cc: jbeulich, Andy Lutomirski, Peter Zijlstra, H. Peter Anvin

On 14.06.22 17:09, Juergen Gross wrote:
> On 03.05.22 15:22, Juergen Gross wrote:
>> x86_has_pat_wp() is using a wrong test, as it relies on the normal
>> PAT configuration used by the kernel. In case the PAT MSR has been
>> setup by another entity (e.g. BIOS or Xen hypervisor) it might return
>> false even if the PAT configuration is allowing WP mappings.
>>
>> Fixes: 1f6f655e01ad ("x86/mm: Add a x86_has_pat_wp() helper")
>> Signed-off-by: Juergen Gross <jgross@suse.com>
>> ---
>>   arch/x86/mm/init.c | 3 ++-
>>   1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
>> index d8cfce221275..71e182ebced3 100644
>> --- a/arch/x86/mm/init.c
>> +++ b/arch/x86/mm/init.c
>> @@ -80,7 +80,8 @@ static uint8_t __pte2cachemode_tbl[8] = {
>>   /* Check that the write-protect PAT entry is set for write-protect */
>>   bool x86_has_pat_wp(void)
>>   {
>> -    return __pte2cachemode_tbl[_PAGE_CACHE_MODE_WP] == _PAGE_CACHE_MODE_WP;
>> +    return __pte2cachemode_tbl[__cachemode2pte_tbl[_PAGE_CACHE_MODE_WP]] ==
>> +           _PAGE_CACHE_MODE_WP;
>>   }
>>
>>   enum page_cache_mode pgprot2cachemode(pgprot_t pgprot)
> 
> x86 maintainers, please consider taking this patch, as it is fixing
> a real bug. Patch 2 of this series can be dropped IMO.

Juergen, can you help me out here please. Patch 2 afaics was supposed to
fix this regression I'm tracking:
https://lore.kernel.org/regressions/YnHK1Z3o99eMXsVK@mail-itl/

Is Patch 1 alone enough to fix it? Or is there a different fix for it?
Or is there some other solution to finally fix that regression that
ideally should have been fixed weeks ago already?

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)

P.S.: As the Linux kernel's regression tracker I deal with a lot of
reports and sometimes miss something important when writing mails like
this. If that's the case here, don't hesitate to tell me in a public
reply, it's in everyone's interest to set the public record straight.

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH 1/2] x86/pat: fix x86_has_pat_wp()
  2022-06-20  5:22     ` Thorsten Leemhuis
@ 2022-06-20  5:30       ` Juergen Gross
  2022-06-20  6:15         ` Thorsten Leemhuis
  0 siblings, 1 reply; 80+ messages in thread
From: Juergen Gross @ 2022-06-20  5:30 UTC (permalink / raw)
  To: Thorsten Leemhuis, xen-devel, x86, linux-kernel, Thomas Gleixner,
	Ingo Molnar, Dave Hansen, Borislav Petkov
  Cc: jbeulich, Andy Lutomirski, Peter Zijlstra, H. Peter Anvin


On 20.06.22 07:22, Thorsten Leemhuis wrote:
> On 14.06.22 17:09, Juergen Gross wrote:
>> On 03.05.22 15:22, Juergen Gross wrote:
>>> x86_has_pat_wp() is using a wrong test, as it relies on the normal
>>> PAT configuration used by the kernel. In case the PAT MSR has been
>>> setup by another entity (e.g. BIOS or Xen hypervisor) it might return
>>> false even if the PAT configuration is allowing WP mappings.
>>>
>>> Fixes: 1f6f655e01ad ("x86/mm: Add a x86_has_pat_wp() helper")
>>> Signed-off-by: Juergen Gross <jgross@suse.com>
>>> ---
>>>    arch/x86/mm/init.c | 3 ++-
>>>    1 file changed, 2 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
>>> index d8cfce221275..71e182ebced3 100644
>>> --- a/arch/x86/mm/init.c
>>> +++ b/arch/x86/mm/init.c
>>> @@ -80,7 +80,8 @@ static uint8_t __pte2cachemode_tbl[8] = {
>>>    /* Check that the write-protect PAT entry is set for write-protect */
>>>    bool x86_has_pat_wp(void)
>>>    {
>>> -    return __pte2cachemode_tbl[_PAGE_CACHE_MODE_WP] == _PAGE_CACHE_MODE_WP;
>>> +    return __pte2cachemode_tbl[__cachemode2pte_tbl[_PAGE_CACHE_MODE_WP]] ==
>>> +           _PAGE_CACHE_MODE_WP;
>>>    }
>>>
>>>    enum page_cache_mode pgprot2cachemode(pgprot_t pgprot)
>>
>> x86 maintainers, please consider taking this patch, as it is fixing
>> a real bug. Patch 2 of this series can be dropped IMO.
> 
> Juergen, can you help me out here please. Patch 2 afaics was supposed to
> fix this regression I'm tracking:
> https://lore.kernel.org/regressions/YnHK1Z3o99eMXsVK@mail-itl/

No, patch 2 wasn't covering all needed cases.

> Is Patch 1 alone enough to fix it? Or is there a different fix for it?

Patch 1 is fixing a different issue (it is lacking any maintainer
feedback, though).

This patch of Jan should do the job, but it seems to be stuck, too:

https://lore.kernel.org/lkml/9385fa60-fa5d-f559-a137-6608408f88b0@suse.com/

> Or is there some other solution to finally fix that regression that
> ideally should have been fixed weeks ago already?

I agree it should have been fixed quite some time ago, but the x86
maintainers don't seem to be interested in those stuck patches. :-(

Maybe I should take a different approach:

x86 maintainers, please speak up if you NAK (or Ack) either of the above two patches.
In case you don't NAK or take the patches, I'm inclined to carry them via
the Xen tree to get the issues fixed.


Juergen

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH 1/2] x86/pat: fix x86_has_pat_wp()
  2022-06-20  5:30       ` Juergen Gross
@ 2022-06-20  6:15         ` Thorsten Leemhuis
  0 siblings, 0 replies; 80+ messages in thread
From: Thorsten Leemhuis @ 2022-06-20  6:15 UTC (permalink / raw)
  To: Juergen Gross, xen-devel, x86, linux-kernel, Thomas Gleixner,
	Ingo Molnar, Dave Hansen, Borislav Petkov
  Cc: jbeulich, Andy Lutomirski, Peter Zijlstra, H. Peter Anvin

On 20.06.22 07:30, Juergen Gross wrote:
> On 20.06.22 07:22, Thorsten Leemhuis wrote:
>> On 14.06.22 17:09, Juergen Gross wrote:
>>> On 03.05.22 15:22, Juergen Gross wrote:
>>>> x86_has_pat_wp() is using a wrong test, as it relies on the normal
>>>> PAT configuration used by the kernel. In case the PAT MSR has been
>>>> setup by another entity (e.g. BIOS or Xen hypervisor) it might return
>>>> false even if the PAT configuration is allowing WP mappings.
>>>>
>>>> Fixes: 1f6f655e01ad ("x86/mm: Add a x86_has_pat_wp() helper")
>>>> Signed-off-by: Juergen Gross <jgross@suse.com>
>>>> ---
>>>>    arch/x86/mm/init.c | 3 ++-
>>>>    1 file changed, 2 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
>>>> index d8cfce221275..71e182ebced3 100644
>>>> --- a/arch/x86/mm/init.c
>>>> +++ b/arch/x86/mm/init.c
>>>> @@ -80,7 +80,8 @@ static uint8_t __pte2cachemode_tbl[8] = {
>>>>    /* Check that the write-protect PAT entry is set for write-protect */
>>>>    bool x86_has_pat_wp(void)
>>>>    {
>>>> -    return __pte2cachemode_tbl[_PAGE_CACHE_MODE_WP] == _PAGE_CACHE_MODE_WP;
>>>> +    return __pte2cachemode_tbl[__cachemode2pte_tbl[_PAGE_CACHE_MODE_WP]] ==
>>>> +           _PAGE_CACHE_MODE_WP;
>>>>    }
>>>>
>>>>    enum page_cache_mode pgprot2cachemode(pgprot_t pgprot)
>>>
>>> x86 maintainers, please consider taking this patch, as it is fixing
>>> a real bug. Patch 2 of this series can be dropped IMO.
>>
>> Juergen, can you help me out here please. Patch 2 afaics was supposed to
>> fix this regression I'm tracking:
>> https://lore.kernel.org/regressions/YnHK1Z3o99eMXsVK@mail-itl/
> No, patch 2 wasn't covering all needed cases.

Ahh, happens. Thx for the info.

>> Is Patch 1 alone enough to fix it? Or is there a different fix for it?
> Patch 1 is fixing a different issue (it is lacking any maintainer
> feedback, though).
> 
> This patch of Jan should do the job, but it seems to be stuck, too:
> https://lore.kernel.org/lkml/9385fa60-fa5d-f559-a137-6608408f88b0@suse.com/

Ahh. Fun fact: that was on my list of things to prod, too.

>> Or is there some other solution to finally fix that regression that
>> ideally should have been fixed weeks ago already?
> 
> I agree it should have been fixed quite some time ago, but the x86
> maintainers don't seem to be interested in those stuck patches. :-(
> 
> Maybe I should take a different approach:
> 
> x86 maintainers, please speak up if you NAK (or Ack) either of the above two
> patches.
> In case you don't NAK or take the patches, I'm inclined to carry them via
> the Xen tree to get the issues fixed.

Yeah, I'd be really glad if we could find a solution for this situation
and get it finally fixed in mainline and backported to stable.

Ciao, Thorsten

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH 1/2] x86/pat: fix x86_has_pat_wp()
  2022-05-03 13:22 ` [PATCH 1/2] x86/pat: fix x86_has_pat_wp() Juergen Gross
  2022-05-27 10:21   ` Juergen Gross
  2022-06-14 15:09   ` Juergen Gross
@ 2022-06-20 10:26   ` Borislav Petkov
  2022-06-20 10:41     ` Juergen Gross
  2 siblings, 1 reply; 80+ messages in thread
From: Borislav Petkov @ 2022-06-20 10:26 UTC (permalink / raw)
  To: Juergen Gross
  Cc: xen-devel, x86, linux-kernel, jbeulich, Dave Hansen,
	Andy Lutomirski, Peter Zijlstra, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin

On Tue, May 03, 2022 at 03:22:06PM +0200, Juergen Gross wrote:
> x86_has_pat_wp() is using a wrong test, as it relies on the normal
> PAT configuration used by the kernel. In case the PAT MSR has been
> setup by another entity (e.g. BIOS or Xen hypervisor) it might return
> false even if the PAT configuration is allowing WP mappings.
> 
> Fixes: 1f6f655e01ad ("x86/mm: Add a x86_has_pat_wp() helper")
> Signed-off-by: Juergen Gross <jgross@suse.com>
> ---
>  arch/x86/mm/init.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
> index d8cfce221275..71e182ebced3 100644
> --- a/arch/x86/mm/init.c
> +++ b/arch/x86/mm/init.c
> @@ -80,7 +80,8 @@ static uint8_t __pte2cachemode_tbl[8] = {
>  /* Check that the write-protect PAT entry is set for write-protect */
>  bool x86_has_pat_wp(void)
>  {
> -	return __pte2cachemode_tbl[_PAGE_CACHE_MODE_WP] == _PAGE_CACHE_MODE_WP;
> +	return __pte2cachemode_tbl[__cachemode2pte_tbl[_PAGE_CACHE_MODE_WP]] ==
> +	       _PAGE_CACHE_MODE_WP;

So this code always makes my head spin... especially after vacation but
lemme take a stab:

__pte2cachemode_tbl indices are of type enum page_cache_mode.

What you've done is index with

__cachemode2pte_tbl[_PAGE_CACHE_MODE_WP]

which gives uint16_t.

So, if at all, this should do __pte2cm_idx(_PAGE_CACHE_MODE_WP) to index
into it.

But I'm still unclear on the big picture. Looking at Jan's explanation,
there's something about PAT init being skipped due to MTRRs not being
emulated by Xen.... or something to that effect.

So if that's the case, the Xen guest code should init PAT in its own
way, so that the generic code works with this without doing hacks.

But I'm only guessing - this needs a *lot* more elaboration and
explanation why exactly this is needed.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH 1/2] x86/pat: fix x86_has_pat_wp()
  2022-06-20 10:26   ` Borislav Petkov
@ 2022-06-20 10:41     ` Juergen Gross
  2022-06-20 15:27       ` Dave Hansen
  0 siblings, 1 reply; 80+ messages in thread
From: Juergen Gross @ 2022-06-20 10:41 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: xen-devel, x86, linux-kernel, jbeulich, Dave Hansen,
	Andy Lutomirski, Peter Zijlstra, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin


On 20.06.22 12:26, Borislav Petkov wrote:
> On Tue, May 03, 2022 at 03:22:06PM +0200, Juergen Gross wrote:
>> x86_has_pat_wp() is using a wrong test, as it relies on the normal
>> PAT configuration used by the kernel. In case the PAT MSR has been
>> setup by another entity (e.g. BIOS or Xen hypervisor) it might return
>> false even if the PAT configuration is allowing WP mappings.
>>
>> Fixes: 1f6f655e01ad ("x86/mm: Add a x86_has_pat_wp() helper")
>> Signed-off-by: Juergen Gross <jgross@suse.com>
>> ---
>>   arch/x86/mm/init.c | 3 ++-
>>   1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
>> index d8cfce221275..71e182ebced3 100644
>> --- a/arch/x86/mm/init.c
>> +++ b/arch/x86/mm/init.c
>> @@ -80,7 +80,8 @@ static uint8_t __pte2cachemode_tbl[8] = {
>>   /* Check that the write-protect PAT entry is set for write-protect */
>>   bool x86_has_pat_wp(void)
>>   {
>> -	return __pte2cachemode_tbl[_PAGE_CACHE_MODE_WP] == _PAGE_CACHE_MODE_WP;
>> +	return __pte2cachemode_tbl[__cachemode2pte_tbl[_PAGE_CACHE_MODE_WP]] ==
>> +	       _PAGE_CACHE_MODE_WP;
> 
> So this code always makes my head spin... especially after vacation but
> lemme take a stab:
> 
> __pte2cachemode_tbl indices are of type enum page_cache_mode.

Yes.

> What you've done is index with
> 
> __cachemode2pte_tbl[_PAGE_CACHE_MODE_WP]
> 
> which gives uint16_t.
> 
> So, if at all, this should do __pte2cm_idx(_PAGE_CACHE_MODE_WP) to index
> into it.

Oh, you are partially right.

It should be __pte2cm_idx(__cachemode2pte_tbl[_PAGE_CACHE_MODE_WP]).
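To illustrate why the extra conversion is needed, here is a minimal
userspace sketch of the slot-index/PTE-bits packing. The helper names
cm_idx2pte() and pte2cm_idx() are illustrative stand-ins for the kernel's
__cm_idx2pte() and __pte2cm_idx() macros, and the bodies are assumed to
mirror them; only the PWT/PCD/PAT bit positions (3, 4 and 7) are taken as
given. The point is that the value stored in __cachemode2pte_tbl[] is a
set of PTE bits (e.g. 0x88 for slot 5), which is far too large to index
an 8-entry array directly.

```c
#include <stdint.h>

/* Cache-control bit positions in a 4K x86 PTE. */
#define PAGE_BIT_PWT 3
#define PAGE_BIT_PCD 4
#define PAGE_BIT_PAT 7

/*
 * Hypothetical stand-in for __cm_idx2pte(): spread a PAT slot index
 * (0..7) over the PTE bits. Index bit 0 -> PWT, bit 1 -> PCD, bit 2 -> PAT.
 */
static inline uint16_t cm_idx2pte(unsigned int idx)
{
	return (uint16_t)(((idx & 4u) << (PAGE_BIT_PAT - 2)) |
			  ((idx & 2u) << (PAGE_BIT_PCD - 1)) |
			  ((idx & 1u) << PAGE_BIT_PWT));
}

/*
 * Hypothetical stand-in for __pte2cm_idx(): gather the three PTE bits
 * back into a slot index (0..7) that is safe to use as an array index.
 */
static inline unsigned int pte2cm_idx(uint16_t bits)
{
	return ((bits >> (PAGE_BIT_PAT - 2)) & 4u) |
	       ((bits >> (PAGE_BIT_PCD - 1)) & 2u) |
	       ((bits >> PAGE_BIT_PWT) & 1u);
}
```

With these helpers, cm_idx2pte(5) yields 0x88 and pte2cm_idx(0x88) yields
5 again, so only the converted value can be used as an index into an
8-entry table.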

> But I'm still unclear on the big picture. Looking at Jan's explanation,
> there's something about PAT init being skipped due to MTRRs not being
> emulated by Xen.... or something to that effect.

PAT init is being skipped for Xen PV guests, as those can't write the
PAT MSR. They need to cope with the setting the hypervisor has done
(which contains all caching modes, but in a different layout than the
kernel is using normally).

> So if that's the case, the Xen guest code should init PAT in its own
> way, so that the generic code works with this without doing hacks.

Depends on what you mean by "init PAT". If you mean to write the
PAT MSR, then no, this won't work. If you mean to set up the translation
arrays __cachemode2pte_tbl[] and __pte2cachemode_tbl[], then yes, this
is already done.

My patch only fixes the incorrect way of querying whether WP is supported.
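The difference between the two queries can be sketched with a small
userspace model. Everything here is an assumption made for illustration:
struct pat_model, pat_model_init() and the three layouts are made up, the
mode-to-slot table stores a plain slot index where the kernel stores PTE
bits, and the layouts are not claimed to be what Xen or any firmware
actually programs. The old test only works when WP happens to sit in the
slot whose number equals the enum value; the fixed test follows the
mode-to-slot mapping first.

```c
#include <stdbool.h>
#include <stdint.h>

enum page_cache_mode {
	_PAGE_CACHE_MODE_WB       = 0,
	_PAGE_CACHE_MODE_WC       = 1,
	_PAGE_CACHE_MODE_UC_MINUS = 2,
	_PAGE_CACHE_MODE_UC       = 3,
	_PAGE_CACHE_MODE_WT       = 4,
	_PAGE_CACHE_MODE_WP       = 5,
	_PAGE_CACHE_MODE_NUM      = 8
};

/* Simplified model of the two translation tables (names are made up). */
struct pat_model {
	uint8_t slot2mode[8];                    /* role of __pte2cachemode_tbl[] */
	uint8_t mode2slot[_PAGE_CACHE_MODE_NUM]; /* role of __cachemode2pte_tbl[],
						  * holding a slot index, not PTE bits */
};

/* Illustrative layouts only: cache mode provided by each of the 8 PAT slots. */
static const uint8_t layout_wp_in_slot5[8] = {
	_PAGE_CACHE_MODE_WB, _PAGE_CACHE_MODE_WC, _PAGE_CACHE_MODE_UC_MINUS,
	_PAGE_CACHE_MODE_UC, _PAGE_CACHE_MODE_WB, _PAGE_CACHE_MODE_WP,
	_PAGE_CACHE_MODE_UC_MINUS, _PAGE_CACHE_MODE_WT,
};
static const uint8_t layout_wp_in_slot4[8] = {
	_PAGE_CACHE_MODE_WB, _PAGE_CACHE_MODE_WT, _PAGE_CACHE_MODE_UC_MINUS,
	_PAGE_CACHE_MODE_UC, _PAGE_CACHE_MODE_WP, _PAGE_CACHE_MODE_WC,
	_PAGE_CACHE_MODE_UC_MINUS, _PAGE_CACHE_MODE_UC,
};
static const uint8_t layout_no_wp[8] = {
	_PAGE_CACHE_MODE_WB, _PAGE_CACHE_MODE_WT, _PAGE_CACHE_MODE_UC_MINUS,
	_PAGE_CACHE_MODE_UC, _PAGE_CACHE_MODE_WB, _PAGE_CACHE_MODE_WT,
	_PAGE_CACHE_MODE_UC_MINUS, _PAGE_CACHE_MODE_UC,
};

static void pat_model_init(struct pat_model *m, const uint8_t layout[8])
{
	/* Modes missing from the layout fall back to slot 0. */
	for (int mode = 0; mode < _PAGE_CACHE_MODE_NUM; mode++)
		m->mode2slot[mode] = 0;

	/* Walk from slot 7 down so the lowest slot providing a mode wins. */
	for (int slot = 7; slot >= 0; slot--) {
		m->slot2mode[slot] = layout[slot];
		m->mode2slot[layout[slot]] = (uint8_t)slot;
	}
}

/* Old test: uses the enum value itself as the slot number. */
static bool has_wp_old(const struct pat_model *m)
{
	return m->slot2mode[_PAGE_CACHE_MODE_WP] == _PAGE_CACHE_MODE_WP;
}

/* Fixed test: look up the slot chosen for WP, then check what it provides. */
static bool has_wp_fixed(const struct pat_model *m)
{
	return m->slot2mode[m->mode2slot[_PAGE_CACHE_MODE_WP]] ==
	       _PAGE_CACHE_MODE_WP;
}
```

In this model both tests agree for layout_wp_in_slot5 and layout_no_wp,
but for layout_wp_in_slot4 the old test reports "no WP" although a WP
slot exists, while the fixed test finds it.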

> But I'm only guessing - this needs a *lot* more elaboration and
> explanation why exactly this is needed.

I will correct the code and update the commit message.


Juergen

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH 1/2] x86/pat: fix x86_has_pat_wp()
  2022-06-20 10:41     ` Juergen Gross
@ 2022-06-20 15:27       ` Dave Hansen
  2022-06-20 15:34         ` Juergen Gross
  0 siblings, 1 reply; 80+ messages in thread
From: Dave Hansen @ 2022-06-20 15:27 UTC (permalink / raw)
  To: Juergen Gross, Borislav Petkov
  Cc: xen-devel, x86, linux-kernel, jbeulich, Dave Hansen,
	Andy Lutomirski, Peter Zijlstra, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin

On 6/20/22 03:41, Juergen Gross wrote:
>> But I'm only guessing - this needs a *lot* more elaboration and
>> explanation why exactly this is needed.
> 
> I will correct the code and update the commit message.

It would also be great to cover the end-user-visible impact of the bug
and the fix.  It _looks_ like it will probably only affect an SEV
system's ability to read some EFI data.  That will presumably be pretty
bad because it ends up reading from an encrypted mapping instead of a
decrypted one.

The

	pr_warn("failed to early memremap...

is (counterintuitively) what is wanted here.

Right?


* Re: [PATCH 1/2] x86/pat: fix x86_has_pat_wp()
  2022-06-20 15:27       ` Dave Hansen
@ 2022-06-20 15:34         ` Juergen Gross
  0 siblings, 0 replies; 80+ messages in thread
From: Juergen Gross @ 2022-06-20 15:34 UTC (permalink / raw)
  To: Dave Hansen, Borislav Petkov
  Cc: xen-devel, x86, linux-kernel, jbeulich, Dave Hansen,
	Andy Lutomirski, Peter Zijlstra, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin



On 20.06.22 17:27, Dave Hansen wrote:
> On 6/20/22 03:41, Juergen Gross wrote:
>>> But I'm only guessing - this needs a *lot* more elaboration and
>>> explanation why exactly this is needed.
>>
>> I will correct the code and update the commit message.
> 
> It would also be great to cover the end-user-visible impact of the bug
> and the fix.  It _looks_ like it will probably only affect an SEV
> system's ability to read some EFI data.  That will presumably be pretty
> bad because it ends up reading from an encrypted mapping instead of a
> decrypted one.

Xen doesn't support SEV guests yet. So the only caveat here would be EFI
setting up PAT by itself.

Not sure this is really a real world issue.


Juergen


* [tip: x86/urgent] x86/pat: Fix x86_has_pat_wp()
  2022-05-03 13:22 ` Juergen Gross
                   ` (3 preceding siblings ...)
  (?)
@ 2022-07-11  9:46 ` tip-bot2 for Juergen Gross
  -1 siblings, 0 replies; 80+ messages in thread
From: tip-bot2 for Juergen Gross @ 2022-07-11  9:46 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Juergen Gross, Borislav Petkov, stable, x86, linux-kernel

The following commit has been merged into the x86/urgent branch of tip:

Commit-ID:     da4600c76da7d787db04ce059b1f176da8a8d375
Gitweb:        https://git.kernel.org/tip/da4600c76da7d787db04ce059b1f176da8a8d375
Author:        Juergen Gross <jgross@suse.com>
AuthorDate:    Fri, 08 Jul 2022 15:14:56 +02:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Mon, 11 Jul 2022 11:37:03 +02:00

x86/pat: Fix x86_has_pat_wp()

x86_has_pat_wp() is using a wrong test, as it relies on the normal
PAT configuration used by the kernel. In case the PAT MSR has been
set up by another entity (e.g. Xen hypervisor) it might return false
even if the PAT configuration is allowing WP mappings. This is due to
the fact that when running as a Xen PV guest the PAT MSR is set up by
the hypervisor and cannot be changed by the guest. This results in the
WP related entry being at a different position when running as a Xen PV
guest compared to the bare metal or fully virtualized case.

The correct way to test for WP support is:

1. Get the PTE protection bits needed to select WP mode by reading
   __cachemode2pte_tbl[_PAGE_CACHE_MODE_WP] (depending on the PAT MSR
   setting this might return protection bits for a stronger mode, e.g.
   UC-)
2. Translate those bits back into the real cache mode selected by those
   PTE bits by reading __pte2cachemode_tbl[__pte2cm_idx(prot)]
3. Test for the cache mode to be _PAGE_CACHE_MODE_WP
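
The three steps above can be sketched in a small userspace model. Everything here is an assumption made for illustration: the sk_* names, the enum values, the simplified fallback to UC when a mode has no slot, and the three slot layouts. Layout A places WP in the slot whose index happens to equal the enum value; layouts B and "none" are hypothetical and do not claim to match any real hypervisor's PAT MSR.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in cache modes; the numeric values are assumptions of this sketch. */
enum { SK_CM_WB, SK_CM_WC, SK_CM_UC_MINUS, SK_CM_UC, SK_CM_WT, SK_CM_WP,
       SK_CM_NUM };

struct sk_pat {
	uint16_t cachemode2pte[SK_CM_NUM];	/* cache mode -> PTE bits */
	uint8_t  pte2cachemode[8];		/* PAT slot -> cache mode */
};

/* Assumed bit positions: PWT = 3, PCD = 4, PAT = 7. */
unsigned int sk_pte2cm_idx(uint16_t prot)
{
	return ((prot >> 5) & 4) | ((prot >> 3) & 2) | ((prot >> 3) & 1);
}

uint16_t sk_idx2prot(unsigned int idx)
{
	return (uint16_t)(((idx & 4) << 5) | ((idx & 2) << 3) | ((idx & 1) << 3));
}

static int sk_find_slot(const uint8_t slots[8], uint8_t mode)
{
	for (int i = 0; i < 8; i++)
		if (slots[i] == mode)
			return i;
	return -1;
}

/* Fill both tables from a slot layout; missing modes fall back to UC. */
void sk_pat_init(struct sk_pat *p, const uint8_t slots[8])
{
	int uc = sk_find_slot(slots, SK_CM_UC);

	assert(uc >= 0);	/* the sketch needs a UC slot as fallback */
	for (int i = 0; i < 8; i++)
		p->pte2cachemode[i] = slots[i];
	for (int m = 0; m < SK_CM_NUM; m++) {
		int slot = sk_find_slot(slots, (uint8_t)m);

		p->cachemode2pte[m] =
			sk_idx2prot((unsigned int)(slot >= 0 ? slot : uc));
	}
}

/* Old test: treats the cache mode enum value as if it were a slot index. */
bool sk_has_wp_old(const struct sk_pat *p)
{
	return p->pte2cachemode[SK_CM_WP] == SK_CM_WP;
}

/* New test: mode -> PTE bits -> slot index -> mode actually selected. */
bool sk_has_wp_new(const struct sk_pat *p)
{
	uint16_t prot = p->cachemode2pte[SK_CM_WP];		/* step 1 */
	uint8_t mode = p->pte2cachemode[sk_pte2cm_idx(prot)];	/* step 2 */

	return mode == SK_CM_WP;				/* step 3 */
}

/* Layout A: WP sits in slot 5, which equals the enum value of WP. */
const uint8_t sk_layout_a[8] = { SK_CM_WB, SK_CM_WC, SK_CM_UC_MINUS,
	SK_CM_UC, SK_CM_WB, SK_CM_WP, SK_CM_UC_MINUS, SK_CM_WT };
/* Layout B (hypothetical): WP is available, but in slot 6. */
const uint8_t sk_layout_b[8] = { SK_CM_WB, SK_CM_WT, SK_CM_UC_MINUS,
	SK_CM_UC, SK_CM_WC, SK_CM_UC, SK_CM_WP, SK_CM_UC };
/* Layout "none" (hypothetical): no slot provides WP at all. */
const uint8_t sk_layout_none[8] = { SK_CM_WB, SK_CM_WT, SK_CM_UC_MINUS,
	SK_CM_UC, SK_CM_WB, SK_CM_WT, SK_CM_UC_MINUS, SK_CM_UC };
```

In this model both tests agree for layout A. For layout B the old test reports no WP although a WP slot exists, while the new test finds it. For the layout without WP the lookup in step 1 falls back to UC, so the new test correctly reports that WP is unavailable.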

Fixes: f88a68facd9a ("x86/mm: Extend early_memremap() support with additional attrs")
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: <stable@vger.kernel.org> # 4.14
Link: https://lore.kernel.org/r/20220503132207.17234-1-jgross@suse.com
---
 arch/x86/mm/init.c | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
index d8cfce2..57ba550 100644
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -77,10 +77,20 @@ static uint8_t __pte2cachemode_tbl[8] = {
 	[__pte2cm_idx(_PAGE_PWT | _PAGE_PCD | _PAGE_PAT)] = _PAGE_CACHE_MODE_UC,
 };
 
-/* Check that the write-protect PAT entry is set for write-protect */
+/*
+ * Check that the write-protect PAT entry is set for write-protect.
+ * To do this without making assumptions how PAT has been set up (Xen has
+ * another layout than the kernel), translate the _PAGE_CACHE_MODE_WP cache
+ * mode via the __cachemode2pte_tbl[] into protection bits (those protection
+ * bits will select a cache mode of WP or better), and then translate the
+ * protection bits back into the cache mode using __pte2cm_idx() and the
+ * __pte2cachemode_tbl[] array. This will return the really used cache mode.
+ */
 bool x86_has_pat_wp(void)
 {
-	return __pte2cachemode_tbl[_PAGE_CACHE_MODE_WP] == _PAGE_CACHE_MODE_WP;
+	uint16_t prot = __cachemode2pte_tbl[_PAGE_CACHE_MODE_WP];
+
+	return __pte2cachemode_tbl[__pte2cm_idx(prot)] == _PAGE_CACHE_MODE_WP;
 }
 
 enum page_cache_mode pgprot2cachemode(pgprot_t pgprot)


* [tip: x86/urgent] x86/pat: Fix x86_has_pat_wp()
  2022-05-03 13:22 ` Juergen Gross
                   ` (4 preceding siblings ...)
  (?)
@ 2022-07-13 10:45 ` tip-bot2 for Juergen Gross
  -1 siblings, 0 replies; 80+ messages in thread
From: tip-bot2 for Juergen Gross @ 2022-07-13 10:45 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Juergen Gross, Borislav Petkov, stable, x86, linux-kernel

The following commit has been merged into the x86/urgent branch of tip:

Commit-ID:     f0592491eba2e42cc84d7831750b84cfc0150dde
Gitweb:        https://git.kernel.org/tip/f0592491eba2e42cc84d7831750b84cfc0150dde
Author:        Juergen Gross <jgross@suse.com>
AuthorDate:    Fri, 08 Jul 2022 15:14:56 +02:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Wed, 13 Jul 2022 12:21:03 +02:00

x86/pat: Fix x86_has_pat_wp()

x86_has_pat_wp() is using a wrong test, as it relies on the normal
PAT configuration used by the kernel. In case the PAT MSR has been
set up by another entity (e.g. Xen hypervisor) it might return false
even if the PAT configuration is allowing WP mappings. This is due to
the fact that when running as a Xen PV guest the PAT MSR is set up by
the hypervisor and cannot be changed by the guest. This results in the
WP related entry being at a different position when running as a Xen PV
guest compared to the bare metal or fully virtualized case.

The correct way to test for WP support is:

1. Get the PTE protection bits needed to select WP mode by reading
   __cachemode2pte_tbl[_PAGE_CACHE_MODE_WP] (depending on the PAT MSR
   setting this might return protection bits for a stronger mode, e.g.
   UC-)
2. Translate those bits back into the real cache mode selected by those
   PTE bits by reading __pte2cachemode_tbl[__pte2cm_idx(prot)]
3. Test for the cache mode to be _PAGE_CACHE_MODE_WP

Fixes: f88a68facd9a ("x86/mm: Extend early_memremap() support with additional attrs")
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: <stable@vger.kernel.org> # 4.14
Link: https://lore.kernel.org/r/20220503132207.17234-1-jgross@suse.com
---
 arch/x86/mm/init.c | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
index d8cfce2..57ba550 100644
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -77,10 +77,20 @@ static uint8_t __pte2cachemode_tbl[8] = {
 	[__pte2cm_idx(_PAGE_PWT | _PAGE_PCD | _PAGE_PAT)] = _PAGE_CACHE_MODE_UC,
 };
 
-/* Check that the write-protect PAT entry is set for write-protect */
+/*
+ * Check that the write-protect PAT entry is set for write-protect.
+ * To do this without making assumptions how PAT has been set up (Xen has
+ * another layout than the kernel), translate the _PAGE_CACHE_MODE_WP cache
+ * mode via the __cachemode2pte_tbl[] into protection bits (those protection
+ * bits will select a cache mode of WP or better), and then translate the
+ * protection bits back into the cache mode using __pte2cm_idx() and the
+ * __pte2cachemode_tbl[] array. This will return the really used cache mode.
+ */
 bool x86_has_pat_wp(void)
 {
-	return __pte2cachemode_tbl[_PAGE_CACHE_MODE_WP] == _PAGE_CACHE_MODE_WP;
+	uint16_t prot = __cachemode2pte_tbl[_PAGE_CACHE_MODE_WP];
+
+	return __pte2cachemode_tbl[__pte2cm_idx(prot)] == _PAGE_CACHE_MODE_WP;
 }
 
 enum page_cache_mode pgprot2cachemode(pgprot_t pgprot)


* [tip: x86/urgent] x86/pat: Fix x86_has_pat_wp()
  2022-05-03 13:22 ` Juergen Gross
                   ` (5 preceding siblings ...)
  (?)
@ 2022-07-13 10:52 ` tip-bot2 for Juergen Gross
  -1 siblings, 0 replies; 80+ messages in thread
From: tip-bot2 for Juergen Gross @ 2022-07-13 10:52 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Juergen Gross, Borislav Petkov, stable, x86, linux-kernel

The following commit has been merged into the x86/urgent branch of tip:

Commit-ID:     230ec83d4299b30c51a1c133b4f2a669972cc08a
Gitweb:        https://git.kernel.org/tip/230ec83d4299b30c51a1c133b4f2a669972cc08a
Author:        Juergen Gross <jgross@suse.com>
AuthorDate:    Fri, 08 Jul 2022 15:14:56 +02:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Wed, 13 Jul 2022 12:44:04 +02:00

x86/pat: Fix x86_has_pat_wp()

x86_has_pat_wp() is using a wrong test, as it relies on the normal
PAT configuration used by the kernel. In case the PAT MSR has been
set up by another entity (e.g. Xen hypervisor) it might return false
even if the PAT configuration is allowing WP mappings. This is due to
the fact that when running as a Xen PV guest the PAT MSR is set up by
the hypervisor and cannot be changed by the guest. This results in the
WP related entry being at a different position when running as a Xen PV
guest compared to the bare metal or fully virtualized case.

The correct way to test for WP support is:

1. Get the PTE protection bits needed to select WP mode by reading
   __cachemode2pte_tbl[_PAGE_CACHE_MODE_WP] (depending on the PAT MSR
   setting this might return protection bits for a stronger mode, e.g.
   UC-)
2. Translate those bits back into the real cache mode selected by those
   PTE bits by reading __pte2cachemode_tbl[__pte2cm_idx(prot)]
3. Test for the cache mode to be _PAGE_CACHE_MODE_WP

Fixes: f88a68facd9a ("x86/mm: Extend early_memremap() support with additional attrs")
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: <stable@vger.kernel.org> # 4.14
Link: https://lore.kernel.org/r/20220503132207.17234-1-jgross@suse.com
---
 arch/x86/mm/init.c | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
index d8cfce2..57ba550 100644
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -77,10 +77,20 @@ static uint8_t __pte2cachemode_tbl[8] = {
 	[__pte2cm_idx(_PAGE_PWT | _PAGE_PCD | _PAGE_PAT)] = _PAGE_CACHE_MODE_UC,
 };
 
-/* Check that the write-protect PAT entry is set for write-protect */
+/*
+ * Check that the write-protect PAT entry is set for write-protect.
+ * To do this without making assumptions how PAT has been set up (Xen has
+ * another layout than the kernel), translate the _PAGE_CACHE_MODE_WP cache
+ * mode via the __cachemode2pte_tbl[] into protection bits (those protection
+ * bits will select a cache mode of WP or better), and then translate the
+ * protection bits back into the cache mode using __pte2cm_idx() and the
+ * __pte2cachemode_tbl[] array. This will return the really used cache mode.
+ */
 bool x86_has_pat_wp(void)
 {
-	return __pte2cachemode_tbl[_PAGE_CACHE_MODE_WP] == _PAGE_CACHE_MODE_WP;
+	uint16_t prot = __cachemode2pte_tbl[_PAGE_CACHE_MODE_WP];
+
+	return __pte2cachemode_tbl[__pte2cm_idx(prot)] == _PAGE_CACHE_MODE_WP;
 }
 
 enum page_cache_mode pgprot2cachemode(pgprot_t pgprot)


end of thread

Thread overview: 80+ messages
2022-05-03 13:22 [PATCH 0/2] x86/pat: fix querying available caching modes Juergen Gross
2022-05-03 13:22 ` [Intel-gfx] " Juergen Gross
2022-05-03 13:22 ` Juergen Gross
2022-05-03 13:22 ` [PATCH 1/2] x86/pat: fix x86_has_pat_wp() Juergen Gross
2022-05-27 10:21   ` Juergen Gross
2022-06-14 15:09   ` Juergen Gross
2022-06-20  5:22     ` Thorsten Leemhuis
2022-06-20  5:30       ` Juergen Gross
2022-06-20  6:15         ` Thorsten Leemhuis
2022-06-20 10:26   ` Borislav Petkov
2022-06-20 10:41     ` Juergen Gross
2022-06-20 15:27       ` Dave Hansen
2022-06-20 15:34         ` Juergen Gross
2022-05-03 13:22 ` [PATCH 2/2] x86/pat: add functions to query specific cache mode availability Juergen Gross
2022-05-03 13:22   ` [Intel-gfx] " Juergen Gross
2022-05-03 13:22   ` Juergen Gross
2022-05-04  8:31   ` Jan Beulich
2022-05-04  8:31     ` [Intel-gfx] " Jan Beulich
2022-05-04  8:31     ` Jan Beulich
2022-05-04  9:14     ` Juergen Gross
2022-05-04  9:14       ` [Intel-gfx] " Juergen Gross
2022-05-04  9:14       ` Juergen Gross
2022-05-04  9:51       ` Jan Beulich
2022-05-04  9:51         ` [Intel-gfx] " Jan Beulich
2022-05-04  9:51         ` Jan Beulich
2022-05-20  4:43       ` Chuck Zmudzinski
2022-05-20  4:43         ` Chuck Zmudzinski
2022-05-20  5:56         ` Chuck Zmudzinski
2022-05-20  5:56           ` Chuck Zmudzinski
2022-05-20  6:05         ` Jan Beulich
2022-05-20  6:05           ` Jan Beulich
2022-05-20  6:59           ` Chuck Zmudzinski
2022-05-20  6:59             ` Chuck Zmudzinski
2022-05-20  8:30             ` Chuck Zmudzinski
2022-05-20  8:30               ` Chuck Zmudzinski
2022-05-20  9:41               ` Jan Beulich
2022-05-20  9:41                 ` Jan Beulich
2022-05-20 13:33                 ` Chuck Zmudzinski
2022-05-20 13:33                   ` Chuck Zmudzinski
2022-05-20 14:06                   ` Jan Beulich
2022-05-20 14:06                     ` Jan Beulich
2022-05-20 14:48                     ` Chuck Zmudzinski
2022-05-20 14:48                       ` Chuck Zmudzinski
2022-05-21 10:47                       ` Thorsten Leemhuis
2022-05-21 10:47                         ` [Intel-gfx] " Thorsten Leemhuis
2022-05-21 10:47                         ` Thorsten Leemhuis
2022-05-24 18:32                         ` Chuck Zmudzinski
2022-05-24 18:32                           ` [Intel-gfx] " Chuck Zmudzinski
2022-05-24 18:32                           ` Chuck Zmudzinski
2022-05-25  7:45                           ` [Intel-gfx] " Thorsten Leemhuis
2022-05-25  7:45                             ` Thorsten Leemhuis
2022-05-25  7:45                             ` Thorsten Leemhuis
2022-05-25  8:04                             ` Juergen Gross
2022-05-25  8:04                               ` [Intel-gfx] " Juergen Gross
2022-05-25  8:04                               ` Juergen Gross
2022-05-25  8:37                             ` Jan Beulich
2022-05-25  8:37                               ` Jan Beulich
2022-05-25  8:51                               ` Thorsten Leemhuis
2022-05-25  8:51                                 ` [Intel-gfx] " Thorsten Leemhuis
2022-05-25  8:51                                 ` Thorsten Leemhuis
2022-05-25 19:25                             ` Chuck Zmudzinski
2022-05-25 19:25                               ` [Intel-gfx] " Chuck Zmudzinski
2022-05-25 19:25                               ` Chuck Zmudzinski
2022-05-20 15:46                     ` [REGRESSION} " Chuck Zmudzinski
2022-05-20 15:46                       ` Chuck Zmudzinski
2022-05-20 17:13                       ` Chuck Zmudzinski
2022-05-20 17:13                         ` Chuck Zmudzinski
2022-05-20 17:17                         ` Chuck Zmudzinski
2022-05-20 17:17                           ` Chuck Zmudzinski
2022-05-18 13:45   ` Christoph Hellwig
2022-05-18 13:45     ` [Intel-gfx] " Christoph Hellwig
2022-05-20  2:15   ` Chuck Zmudzinski
2022-05-20  2:15     ` Chuck Zmudzinski
2022-05-20  2:21     ` Chuck Zmudzinski
2022-05-20  2:21       ` Chuck Zmudzinski
2022-05-21 13:24   ` Chuck Zmudzinski
2022-05-21 13:24     ` Chuck Zmudzinski
2022-07-11  9:46 ` [tip: x86/urgent] x86/pat: Fix x86_has_pat_wp() tip-bot2 for Juergen Gross
2022-07-13 10:45 ` tip-bot2 for Juergen Gross
2022-07-13 10:52 ` tip-bot2 for Juergen Gross
