* i915 "GPU HANG", bisected to a2daa27c0c61 "swiotlb: simplify swiotlb_max_segment"
@ 2022-10-18  3:52 Marek Marczykowski-Górecki
  2022-10-18  8:24   ` [Intel-gfx] " Christoph Hellwig
  2022-10-18  8:31 ` [Intel-gfx] ✗ Fi.CI.BUILD: failure for " Patchwork
  0 siblings, 2 replies; 21+ messages in thread
From: Marek Marczykowski-Górecki @ 2022-10-18  3:52 UTC (permalink / raw)
  To: Christoph Hellwig, Konrad Rzeszutek Wilk, Anshuman Khandual
  Cc: Juergen Gross, Stefano Stabellini, Oleksandr Tyshchenko,
	regressions, xen-devel, iommu

Hi,

Since 5.19, I have observed severe glitches (mostly horizontal black stripes, but
not only) when using IGD in Xen PV dom0. After a fairly short time, Xorg
crashes and dmesg contains messages like this:

    i915 0000:00:02.0: [drm] GPU HANG: ecode 7:1:01fffbfe, in Xorg [5337]
    i915 0000:00:02.0: [drm] Resetting rcs0 for stopped heartbeat on rcs0
    i915 0000:00:02.0: [drm] Xorg[5337] context reset due to GPU hang

The issue can be observed on several different machines (at least Ivy
Bridge, Tiger Lake and Kaby Lake). It doesn't always happen immediately;
sometimes I need to start several VMs first.
An example of how it looks:
https://openqa.qubes-os.org/tests/48187#step/qui_widgets_notifications/8

More screenshots and logs are linked at https://github.com/QubesOS/qubes-issues/issues/7813

I managed to git bisect the issue and ended up with this as the first
bad commit:

    commit a2daa27c0c6137481226aee5b3136e453c642929
    Author: Christoph Hellwig <hch@lst.de>
    Date:   Mon Feb 14 11:44:42 2022 +0100

        swiotlb: simplify swiotlb_max_segment
        
        Remove the bogus Xen override that was usually larger than the actual
        size and just calculate the value on demand.  Note that
        swiotlb_max_segment still doesn't make sense as an interface and should
        eventually be removed.
        
        Signed-off-by: Christoph Hellwig <hch@lst.de>
        Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
        Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
        Tested-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>

I tried reverting just this commit on top of 6.0.x, but the context
changed significantly in subsequent commits, so after trying to revert
it together with 3 or 4 more commits I gave up.

What may be an important detail: the system heavily uses cross-VM shared
memory (gntdev) to map window contents from VMs. This is Qubes OS, and
it uses Xen 4.14.


-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: i915 "GPU HANG", bisected to a2daa27c0c61 "swiotlb: simplify swiotlb_max_segment"
  2022-10-18  3:52 i915 "GPU HANG", bisected to a2daa27c0c61 "swiotlb: simplify swiotlb_max_segment" Marek Marczykowski-Górecki
@ 2022-10-18  8:24   ` Christoph Hellwig
  2022-10-18  8:31 ` [Intel-gfx] ✗ Fi.CI.BUILD: failure for " Patchwork
  1 sibling, 0 replies; 21+ messages in thread
From: Christoph Hellwig @ 2022-10-18  8:24 UTC (permalink / raw)
  To: Marek Marczykowski-Górecki
  Cc: Christoph Hellwig, Konrad Rzeszutek Wilk, Anshuman Khandual,
	Juergen Gross, Stefano Stabellini, Oleksandr Tyshchenko,
	regressions, xen-devel, iommu, Robert Beckett, Jani Nikula,
	Joonas Lahtinen, Rodrigo Vivi, Tvrtko Ursulin, Matthew Auld,
	intel-gfx, dri-devel

On Tue, Oct 18, 2022 at 05:52:16AM +0200, Marek Marczykowski-Górecki wrote:
> not only) when using IGD in Xen PV dom0. After a fairly short time, Xorg
> crashes and dmesg contains messages like this:
> 
>     i915 0000:00:02.0: [drm] GPU HANG: ecode 7:1:01fffbfe, in Xorg [5337]
>     i915 0000:00:02.0: [drm] Resetting rcs0 for stopped heartbeat on rcs0
>     i915 0000:00:02.0: [drm] Xorg[5337] context reset due to GPU hang

<snip>

> I tried reverting just this commit on top of 6.0.x, but the context
> changed significantly in subsequent commits, so after trying to revert
> it together with 3 or 4 more commits I gave up.
> 
> What may be an important detail: the system heavily uses cross-VM shared
> memory (gntdev) to map window contents from VMs. This is Qubes OS, and
> it uses Xen 4.14.

Can you try the patch below?

---
From 26fe4749750f1bf843666ca777e297279994e33a Mon Sep 17 00:00:00 2001
From: Robert Beckett <bob.beckett@collabora.com>
Date: Tue, 26 Jul 2022 16:39:35 +0100
Subject: drm/i915: stop abusing swiotlb_max_segment

Calling swiotlb functions directly is nowadays considered harmful. See
https://lore.kernel.org/intel-gfx/20220711082614.GA29487@lst.de/

Replace swiotlb_max_segment() calls with dma_max_mapping_size().
In i915_gem_object_get_pages_internal() no longer consider max_segment
only if CONFIG_SWIOTLB is enabled. There can be other (iommu related)
causes of specific max segment sizes.

Signed-off-by: Robert Beckett <bob.beckett@collabora.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
[hch: added the Xen hack]
---
 drivers/gpu/drm/i915/gem/i915_gem_internal.c | 19 +++----------
 drivers/gpu/drm/i915/gem/i915_gem_shmem.c    |  2 +-
 drivers/gpu/drm/i915/gem/i915_gem_ttm.c      |  4 +--
 drivers/gpu/drm/i915/gem/i915_gem_userptr.c  |  2 +-
 drivers/gpu/drm/i915/i915_scatterlist.h      | 30 +++++++++++---------
 5 files changed, 25 insertions(+), 32 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_internal.c b/drivers/gpu/drm/i915/gem/i915_gem_internal.c
index c698f95af15fe..629acb403a2c9 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_internal.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_internal.c
@@ -6,7 +6,6 @@
 
 #include <linux/scatterlist.h>
 #include <linux/slab.h>
-#include <linux/swiotlb.h>
 
 #include "i915_drv.h"
 #include "i915_gem.h"
@@ -38,22 +37,12 @@ static int i915_gem_object_get_pages_internal(struct drm_i915_gem_object *obj)
 	struct scatterlist *sg;
 	unsigned int sg_page_sizes;
 	unsigned int npages;
-	int max_order;
+	int max_order = MAX_ORDER;
+	unsigned int max_segment;
 	gfp_t gfp;
 
-	max_order = MAX_ORDER;
-#ifdef CONFIG_SWIOTLB
-	if (is_swiotlb_active(obj->base.dev->dev)) {
-		unsigned int max_segment;
-
-		max_segment = swiotlb_max_segment();
-		if (max_segment) {
-			max_segment = max_t(unsigned int, max_segment,
-					    PAGE_SIZE) >> PAGE_SHIFT;
-			max_order = min(max_order, ilog2(max_segment));
-		}
-	}
-#endif
+	max_segment = i915_sg_segment_size(i915->drm.dev) >> PAGE_SHIFT;
+	max_order = min(max_order, get_order(max_segment));
 
 	gfp = GFP_KERNEL | __GFP_HIGHMEM | __GFP_RECLAIMABLE;
 	if (IS_I965GM(i915) || IS_I965G(i915)) {
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
index f42ca1179f373..11125c32dd35d 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
@@ -194,7 +194,7 @@ static int shmem_get_pages(struct drm_i915_gem_object *obj)
 	struct intel_memory_region *mem = obj->mm.region;
 	struct address_space *mapping = obj->base.filp->f_mapping;
 	const unsigned long page_count = obj->base.size / PAGE_SIZE;
-	unsigned int max_segment = i915_sg_segment_size();
+	unsigned int max_segment = i915_sg_segment_size(i915->drm.dev);
 	struct sg_table *st;
 	struct sgt_iter sgt_iter;
 	struct page *page;
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
index e3fc38dd5db04..de5d0a7241027 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
@@ -189,7 +189,7 @@ static int i915_ttm_tt_shmem_populate(struct ttm_device *bdev,
 	struct drm_i915_private *i915 = container_of(bdev, typeof(*i915), bdev);
 	struct intel_memory_region *mr = i915->mm.regions[INTEL_MEMORY_SYSTEM];
 	struct i915_ttm_tt *i915_tt = container_of(ttm, typeof(*i915_tt), ttm);
-	const unsigned int max_segment = i915_sg_segment_size();
+	const unsigned int max_segment = i915_sg_segment_size(i915->drm.dev);
 	const size_t size = (size_t)ttm->num_pages << PAGE_SHIFT;
 	struct file *filp = i915_tt->filp;
 	struct sgt_iter sgt_iter;
@@ -538,7 +538,7 @@ static struct i915_refct_sgt *i915_ttm_tt_get_st(struct ttm_tt *ttm)
 	ret = sg_alloc_table_from_pages_segment(st,
 			ttm->pages, ttm->num_pages,
 			0, (unsigned long)ttm->num_pages << PAGE_SHIFT,
-			i915_sg_segment_size(), GFP_KERNEL);
+			i915_sg_segment_size(i915_tt->dev), GFP_KERNEL);
 	if (ret) {
 		st->sgl = NULL;
 		return ERR_PTR(ret);
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c b/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
index 8423df021b713..e4515d6acd43c 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
@@ -129,7 +129,7 @@ static void i915_gem_object_userptr_drop_ref(struct drm_i915_gem_object *obj)
 static int i915_gem_userptr_get_pages(struct drm_i915_gem_object *obj)
 {
 	const unsigned long num_pages = obj->base.size >> PAGE_SHIFT;
-	unsigned int max_segment = i915_sg_segment_size();
+	unsigned int max_segment = i915_sg_segment_size(obj->base.dev->dev);
 	struct sg_table *st;
 	unsigned int sg_page_sizes;
 	struct page **pvec;
diff --git a/drivers/gpu/drm/i915/i915_scatterlist.h b/drivers/gpu/drm/i915/i915_scatterlist.h
index 9ddb3e743a3e5..c278888f71528 100644
--- a/drivers/gpu/drm/i915/i915_scatterlist.h
+++ b/drivers/gpu/drm/i915/i915_scatterlist.h
@@ -9,7 +9,8 @@
 
 #include <linux/pfn.h>
 #include <linux/scatterlist.h>
-#include <linux/swiotlb.h>
+#include <linux/dma-mapping.h>
+#include <xen/xen.h>
 
 #include "i915_gem.h"
 
@@ -127,19 +128,22 @@ static inline unsigned int i915_sg_dma_sizes(struct scatterlist *sg)
 	return page_sizes;
 }
 
-static inline unsigned int i915_sg_segment_size(void)
+static inline unsigned int i915_sg_segment_size(struct device *dev)
 {
-	unsigned int size = swiotlb_max_segment();
-
-	if (size == 0)
-		size = UINT_MAX;
-
-	size = rounddown(size, PAGE_SIZE);
-	/* swiotlb_max_segment_size can return 1 byte when it means one page. */
-	if (size < PAGE_SIZE)
-		size = PAGE_SIZE;
-
-	return size;
+	size_t max = min_t(size_t, UINT_MAX, dma_max_mapping_size(dev));
+
+	/*
+	 * Xen on x86 can reshuffle pages under us.  The DMA API takes
+	 * care of that both in dma_alloc_* (by calling into the hypervisor
+	 * to make the pages contigous) and in dma_map_* (by bounce buffering).
+	 * But i915 abuses ignores the coherency aspects of the DMA API and
+	 * thus can't cope with bounce buffering actually happening, so add
+	 * a hack here to force small allocations and mapping when running on
+	 * Xen.  (good luck with TDX, btw --hch)
+	 */
+	if (IS_ENABLED(CONFIG_X86) && xen_domain())
+		max = PAGE_SIZE;
+	return round_down(max, PAGE_SIZE);
 }
 
 bool i915_sg_trim(struct sg_table *orig_st);
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 21+ messages in thread


* [Intel-gfx] ✗ Fi.CI.BUILD: failure for i915 "GPU HANG", bisected to a2daa27c0c61 "swiotlb: simplify swiotlb_max_segment"
  2022-10-18  3:52 i915 "GPU HANG", bisected to a2daa27c0c61 "swiotlb: simplify swiotlb_max_segment" Marek Marczykowski-Górecki
  2022-10-18  8:24   ` [Intel-gfx] " Christoph Hellwig
@ 2022-10-18  8:31 ` Patchwork
  1 sibling, 0 replies; 21+ messages in thread
From: Patchwork @ 2022-10-18  8:31 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: intel-gfx

== Series Details ==

Series: i915 "GPU HANG", bisected to a2daa27c0c61 "swiotlb: simplify swiotlb_max_segment"
URL   : https://patchwork.freedesktop.org/series/109817/
State : failure

== Summary ==

Error: patch https://patchwork.freedesktop.org/api/1.0/series/109817/revisions/1/mbox/ not applied
Applying: i915 "GPU HANG", bisected to a2daa27c0c61 "swiotlb: simplify swiotlb_max_segment"
error: No valid patches in input (allow with "--allow-empty")
error: could not build fake ancestor
hint: Use 'git am --show-current-patch=diff' to see the failed patch
Patch failed at 0001 i915 "GPU HANG", bisected to a2daa27c0c61 "swiotlb: simplify swiotlb_max_segment"
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: i915 "GPU HANG", bisected to a2daa27c0c61 "swiotlb: simplify swiotlb_max_segment"
  2022-10-18  8:24   ` [Intel-gfx] " Christoph Hellwig
  (?)
@ 2022-10-18  8:57     ` Jan Beulich
  -1 siblings, 0 replies; 21+ messages in thread
From: Jan Beulich @ 2022-10-18  8:57 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Konrad Rzeszutek Wilk, Anshuman Khandual, Juergen Gross,
	Stefano Stabellini, Oleksandr Tyshchenko, regressions, xen-devel,
	iommu, Robert Beckett, Jani Nikula, Joonas Lahtinen,
	Rodrigo Vivi, Tvrtko Ursulin, Matthew Auld, intel-gfx, dri-devel,
	Marek Marczykowski-Górecki

On 18.10.2022 10:24, Christoph Hellwig wrote:
> @@ -127,19 +128,22 @@ static inline unsigned int i915_sg_dma_sizes(struct scatterlist *sg)
>  	return page_sizes;
>  }
>  
> -static inline unsigned int i915_sg_segment_size(void)
> +static inline unsigned int i915_sg_segment_size(struct device *dev)
>  {
> -	unsigned int size = swiotlb_max_segment();
> -
> -	if (size == 0)
> -		size = UINT_MAX;
> -
> -	size = rounddown(size, PAGE_SIZE);
> -	/* swiotlb_max_segment_size can return 1 byte when it means one page. */
> -	if (size < PAGE_SIZE)
> -		size = PAGE_SIZE;
> -
> -	return size;
> +	size_t max = min_t(size_t, UINT_MAX, dma_max_mapping_size(dev));
> +
> +	/*
> +	 * Xen on x86 can reshuffle pages under us.  The DMA API takes
> +	 * care of that both in dma_alloc_* (by calling into the hypervisor
> +	 * to make the pages contigous) and in dma_map_* (by bounce buffering).
> +	 * But i915 abuses ignores the coherency aspects of the DMA API and
> +	 * thus can't cope with bounce buffering actually happening, so add
> +	 * a hack here to force small allocations and mapping when running on
> +	 * Xen.  (good luck with TDX, btw --hch)
> +	 */
> +	if (IS_ENABLED(CONFIG_X86) && xen_domain())
> +		max = PAGE_SIZE;
> +	return round_down(max, PAGE_SIZE);
>  }

Shouldn't this then be xen_pv_domain() that you use here, and - if you
really want IS_ENABLED() in addition - CONFIG_XEN_PV?

Jan

^ permalink raw reply	[flat|nested] 21+ messages in thread


* Re: [Intel-gfx] i915 "GPU HANG", bisected to a2daa27c0c61 "swiotlb: simplify swiotlb_max_segment"
  2022-10-18  8:57     ` Jan Beulich
@ 2022-10-18 11:02       ` Christoph Hellwig
  -1 siblings, 0 replies; 21+ messages in thread
From: Christoph Hellwig @ 2022-10-18 11:02 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Juergen Gross, Stefano Stabellini, regressions, dri-devel,
	Anshuman Khandual, intel-gfx, Konrad Rzeszutek Wilk,
	Marek Marczykowski-Górecki, Oleksandr Tyshchenko, iommu,
	Matthew Auld, Rodrigo Vivi, xen-devel, Christoph Hellwig

On Tue, Oct 18, 2022 at 10:57:37AM +0200, Jan Beulich wrote:
> Shouldn't this then be xen_pv_domain() that you use here, and - if you
> really want IS_ENABLED() in addition - CONFIG_XEN_PV?

I'll need help from people who understand Xen better than me to work
out what the exact conditions (and maybe also the comments) should be.

^ permalink raw reply	[flat|nested] 21+ messages in thread


* Re: i915 "GPU HANG", bisected to a2daa27c0c61 "swiotlb: simplify swiotlb_max_segment"
  2022-10-18  8:24   ` [Intel-gfx] " Christoph Hellwig
@ 2022-10-18 12:01     ` Marek Marczykowski-Górecki
  -1 siblings, 0 replies; 21+ messages in thread
From: Marek Marczykowski-Górecki @ 2022-10-18 12:01 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Konrad Rzeszutek Wilk, Anshuman Khandual, Juergen Gross,
	Stefano Stabellini, Oleksandr Tyshchenko, regressions, xen-devel,
	iommu, Robert Beckett, Jani Nikula, Joonas Lahtinen,
	Rodrigo Vivi, Tvrtko Ursulin, Matthew Auld, intel-gfx, dri-devel


On Tue, Oct 18, 2022 at 10:24:13AM +0200, Christoph Hellwig wrote:
> On Tue, Oct 18, 2022 at 05:52:16AM +0200, Marek Marczykowski-Górecki wrote:
> > not only) when using IGD in Xen PV dom0. After a fairly short time, Xorg
> > crashes and dmesg contains messages like this:
> > 
> >     i915 0000:00:02.0: [drm] GPU HANG: ecode 7:1:01fffbfe, in Xorg [5337]
> >     i915 0000:00:02.0: [drm] Resetting rcs0 for stopped heartbeat on rcs0
> >     i915 0000:00:02.0: [drm] Xorg[5337] context reset due to GPU hang
> 
> <snip>
> 
> > I tried reverting just this commit on top of 6.0.x, but the context
> > changed significantly in subsequent commits, so after trying to revert
> > it together with 3 or 4 more commits I gave up.
> > 
> > What may be an important detail: the system heavily uses cross-VM shared
> > memory (gntdev) to map window contents from VMs. This is Qubes OS, and
> > it uses Xen 4.14.
> 
> Can you try the patch below?

Yes, this seems to help.

And since this is a PV domain, it should work with Jan's proposed
change too. Intuitively this indeed should be a PV-only thing (as was
the forced use of swiotlb before).

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab


^ permalink raw reply	[flat|nested] 21+ messages in thread


* Re: i915 "GPU HANG", bisected to a2daa27c0c61 "swiotlb: simplify swiotlb_max_segment"
  2022-10-18 11:02       ` Christoph Hellwig
  (?)
@ 2022-10-18 14:21         ` Jan Beulich
  -1 siblings, 0 replies; 21+ messages in thread
From: Jan Beulich @ 2022-10-18 14:21 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Konrad Rzeszutek Wilk, Anshuman Khandual, Juergen Gross,
	Stefano Stabellini, Oleksandr Tyshchenko, regressions, xen-devel,
	iommu, Robert Beckett, Jani Nikula, Joonas Lahtinen,
	Rodrigo Vivi, Tvrtko Ursulin, Matthew Auld, intel-gfx, dri-devel,
	Marek Marczykowski-Górecki

On 18.10.2022 13:02, Christoph Hellwig wrote:
> On Tue, Oct 18, 2022 at 10:57:37AM +0200, Jan Beulich wrote:
>> Shouldn't this then be xen_pv_domain() that you use here, and - if you
>> really want IS_ENABLED() in addition - CONFIG_XEN_PV?
> 
I'll need help from people that understand Xen better than me what
the exact conditions (and maybe also the comments) are.

Leaving the "i915 abuses" part aside (because I can't tell what exactly the
abuse is), and assuming that "can't cope with bounce buffering" means they
don't actually use the allocated buffers, I'd suggest this:

	/*
	 * For Xen PV guests pages aren't contiguous in DMA (machine) address
	 * space.  The DMA API takes care of that both in dma_alloc_* (by
	 * calling into the hypervisor to make the pages contiguous) and in
	 * dma_map_* (by bounce buffering).  But i915 abuses ignores the
	 * coherency aspects of the DMA API and thus can't cope with bounce
	 * buffering actually happening, so add a hack here to force small
	 * allocations and mappings when running in PV mode on Xen.
	 */
	if (IS_ENABLED(CONFIG_XEN_PV) && xen_pv_domain())
		max = PAGE_SIZE;

I've dropped the TDX related remark because I don't think it's meaningful
for PV guests. Otoh I've left the "abuses ignores" word sequence as is, no
matter that it reads odd to me. Plus, as hinted at before, I'm not
convinced the IS_ENABLED() use is actually necessary or warranted here.

Jan

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: i915 "GPU HANG", bisected to a2daa27c0c61 "swiotlb: simplify swiotlb_max_segment"
  2022-10-18 14:21         ` Jan Beulich
@ 2022-10-18 14:33           ` Christoph Hellwig
  -1 siblings, 0 replies; 21+ messages in thread
From: Christoph Hellwig @ 2022-10-18 14:33 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Christoph Hellwig, Konrad Rzeszutek Wilk, Anshuman Khandual,
	Juergen Gross, Stefano Stabellini, Oleksandr Tyshchenko,
	regressions, xen-devel, iommu, Robert Beckett, Jani Nikula,
	Joonas Lahtinen, Rodrigo Vivi, Tvrtko Ursulin, Matthew Auld,
	intel-gfx, dri-devel, Marek Marczykowski-Górecki

On Tue, Oct 18, 2022 at 04:21:43PM +0200, Jan Beulich wrote:
> Leaving the "i915 abuses" part aside (because I can't tell what exactly the
> abuse is), and assuming that "can't cope with bounce buffering" means they
> don't actually use the allocated buffers, I'd suggest this:

Except for one odd place, i915 never uses dma_alloc_* but always allocates
memory itself and then maps it, yet treats it as if it were a
dma_alloc_coherent allocation, that is, it never does ownership changes.

> I've dropped the TDX related remark because I don't think it's meaningful
> for PV guests.

This remark is for TDX in general, not Xen-related.  With TDX and other
confidential computing schemes, all DMA must be bounce buffered, and
all drivers skipping dma_sync* calls are broken.

> Otoh I've left the "abuses ignores" word sequence as is, no
> matter that it reads odd to me. Plus, as hinted at before, I'm not
> convinced the IS_ENABLED() use is actually necessary or warranted here.

If the IS_ENABLED is not needed, I'm all for dropping it.
But unless I misread the code, on arm/arm64 even PV guests are 1:1
mapped so that all Linux physically contiguous memory is also Xen
contiguous, so we don't need the hack.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: i915 "GPU HANG", bisected to a2daa27c0c61 "swiotlb: simplify swiotlb_max_segment"
  2022-10-18 14:33           ` [Intel-gfx] " Christoph Hellwig
  (?)
@ 2022-10-18 14:53             ` Juergen Gross
  -1 siblings, 0 replies; 21+ messages in thread
From: Juergen Gross @ 2022-10-18 14:53 UTC (permalink / raw)
  To: Christoph Hellwig, Jan Beulich
  Cc: Konrad Rzeszutek Wilk, Anshuman Khandual, Stefano Stabellini,
	Oleksandr Tyshchenko, regressions, xen-devel, iommu,
	Robert Beckett, Jani Nikula, Joonas Lahtinen, Rodrigo Vivi,
	Tvrtko Ursulin, Matthew Auld, intel-gfx, dri-devel,
	Marek Marczykowski-Górecki


[-- Attachment #1.1.1: Type: text/plain, Size: 1386 bytes --]

On 18.10.22 16:33, Christoph Hellwig wrote:
> On Tue, Oct 18, 2022 at 04:21:43PM +0200, Jan Beulich wrote:
>> Leaving the "i915 abuses" part aside (because I can't tell what exactly the
>> abuse is), and assuming that "can't cope with bounce buffering" means they
>> don't actually use the allocated buffers, I'd suggest this:
> 
> Except for one odd place, i915 never uses dma_alloc_* but always allocates
> memory itself and then maps it, yet treats it as if it were a
> dma_alloc_coherent allocation, that is, it never does ownership changes.
> 
>> I've dropped the TDX related remark because I don't think it's meaningful
>> for PV guests.
> 
> This remark is for TDX in general, not Xen-related.  With TDX and other
> confidential computing schemes, all DMA must be bounce buffered, and
> all drivers skipping dma_sync* calls are broken.
> 
>> Otoh I've left the "abuses ignores" word sequence as is, no
>> matter that it reads odd to me. Plus, as hinted at before, I'm not
>> convinced the IS_ENABLED() use is actually necessary or warranted here.
> 
> If the IS_ENABLED is not needed, I'm all for dropping it.
> But unless I misread the code, on arm/arm64 even PV guests are 1:1
> mapped so that all Linux physically contiguous memory is also Xen
> contiguous, so we don't need the hack.

There are no PV guests on arm/arm64.


Juergen

[-- Attachment #1.1.2: OpenPGP public key --]
[-- Type: application/pgp-keys, Size: 3149 bytes --]

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 495 bytes --]

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: i915 "GPU HANG", bisected to a2daa27c0c61 "swiotlb: simplify swiotlb_max_segment"
  2022-10-18 14:53             ` Juergen Gross
@ 2022-10-18 14:55               ` Christoph Hellwig
  -1 siblings, 0 replies; 21+ messages in thread
From: Christoph Hellwig @ 2022-10-18 14:55 UTC (permalink / raw)
  To: Juergen Gross
  Cc: Christoph Hellwig, Jan Beulich, Konrad Rzeszutek Wilk,
	Anshuman Khandual, Stefano Stabellini, Oleksandr Tyshchenko,
	regressions, xen-devel, iommu, Robert Beckett, Jani Nikula,
	Joonas Lahtinen, Rodrigo Vivi, Tvrtko Ursulin, Matthew Auld,
	intel-gfx, dri-devel, Marek Marczykowski-Górecki

On Tue, Oct 18, 2022 at 04:53:50PM +0200, Juergen Gross wrote:
>> If the IS_ENABLED is not needed, I'm all for dropping it.
>> But unless I misread the code, on arm/arm64 even PV guests are 1:1
>> mapped so that all Linux physically contiguous memory is also Xen
>> contiguous, so we don't need the hack.
>
> There are no PV guests on arm/arm64.

Ok, that's the part I was missing.  In that case we should be fine
without the IS_ENABLED indeed.

^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2022-10-24 13:24 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-10-18  3:52 i915 "GPU HANG", bisected to a2daa27c0c61 "swiotlb: simplify swiotlb_max_segment" Marek Marczykowski-Górecki
2022-10-18  8:24 ` Christoph Hellwig
2022-10-18  8:24   ` [Intel-gfx] " Christoph Hellwig
2022-10-18  8:57   ` Jan Beulich
2022-10-18  8:57     ` [Intel-gfx] " Jan Beulich
2022-10-18  8:57     ` Jan Beulich
2022-10-18 11:02     ` [Intel-gfx] " Christoph Hellwig
2022-10-18 11:02       ` Christoph Hellwig
2022-10-18 14:21       ` Jan Beulich
2022-10-18 14:21         ` [Intel-gfx] " Jan Beulich
2022-10-18 14:21         ` Jan Beulich
2022-10-18 14:33         ` Christoph Hellwig
2022-10-18 14:33           ` [Intel-gfx] " Christoph Hellwig
2022-10-18 14:53           ` Juergen Gross
2022-10-18 14:53             ` [Intel-gfx] " Juergen Gross
2022-10-18 14:53             ` Juergen Gross
2022-10-18 14:55             ` Christoph Hellwig
2022-10-18 14:55               ` [Intel-gfx] " Christoph Hellwig
2022-10-18 12:01   ` Marek Marczykowski-Górecki
2022-10-18 12:01     ` Marek Marczykowski-Górecki
2022-10-18  8:31 ` [Intel-gfx] ✗ Fi.CI.BUILD: failure for " Patchwork
