All of lore.kernel.org
 help / color / mirror / Atom feed
* Make staging driver stable for SRIOV VF (1)
@ 2017-10-13  8:26 Pixel Ding
       [not found] ` <1507883180-5626-1-git-send-email-Pixel.Ding-5C7GfCeVMHo@public.gmane.org>
  0 siblings, 1 reply; 32+ messages in thread
From: Pixel Ding @ 2017-10-13  8:26 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW; +Cc: Bingley.Li-5C7GfCeVMHo

This is the first patch series to make latest staging driver
stable for SRIOV VF on both Tonga and Vega. Patches are merged
from SRIOV branches or reimplemented, including bug fixes and
small features requested by SRIOV users.  

Please help reviewing, Thanks.

[PATCH 1/4] drm/amdgpu: always consider virualised device for
[PATCH 2/4] drm/amdgpu: workaround for VM fault caused by SDMA
[PATCH 3/4] drm/amdgpu: report preemption fence via amdgpu_fence_info
[PATCH 4/4] drm/amdgpu: busywait KIQ register accessing
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH 1/4] drm/amdgpu: always consider virualised device for checking post
       [not found] ` <1507883180-5626-1-git-send-email-Pixel.Ding-5C7GfCeVMHo@public.gmane.org>
@ 2017-10-13  8:26   ` Pixel Ding
  2017-10-13  8:26   ` [PATCH 2/4] drm/amdgpu: workaround for VM fault caused by SDMA set_wptr Pixel Ding
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 32+ messages in thread
From: Pixel Ding @ 2017-10-13  8:26 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW; +Cc: pding, Bingley.Li-5C7GfCeVMHo

From: pding <Pixel.Ding@amd.com>

The post checking on scratch registers isn't reliable for virtual
function.

Signed-off-by: pding <Pixel.Ding@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 683965b..ab8f0d6 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -669,7 +669,7 @@ void amdgpu_gart_location(struct amdgpu_device *adev, struct amdgpu_mc *mc)
  * or post is needed if  hw reset is performed.
  * Returns true if need or false if not.
  */
-bool amdgpu_need_post(struct amdgpu_device *adev)
+static bool amdgpu_check_post(struct amdgpu_device *adev)
 {
 	uint32_t reg;
 
@@ -692,7 +692,7 @@ bool amdgpu_need_post(struct amdgpu_device *adev)
 
 }
 
-static bool amdgpu_vpost_needed(struct amdgpu_device *adev)
+bool amdgpu_need_post(struct amdgpu_device *adev)
 {
 	if (amdgpu_sriov_vf(adev))
 		return false;
@@ -716,7 +716,7 @@ static bool amdgpu_vpost_needed(struct amdgpu_device *adev)
 				return true;
 		}
 	}
-	return amdgpu_need_post(adev);
+	return amdgpu_check_post(adev);
 }
 
 /**
@@ -2208,7 +2208,7 @@ int amdgpu_device_init(struct amdgpu_device *adev,
 	amdgpu_device_detect_sriov_bios(adev);
 
 	/* Post card if necessary */
-	if (amdgpu_vpost_needed(adev)) {
+	if (amdgpu_need_post(adev)) {
 		if (!adev->bios) {
 			dev_err(adev->dev, "no vBIOS found\n");
 			amdgpu_vf_error_put(AMDGIM_ERROR_VF_NO_VBIOS, 0, 0);
-- 
2.9.5

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH 2/4] drm/amdgpu: workaround for VM fault caused by SDMA set_wptr
       [not found] ` <1507883180-5626-1-git-send-email-Pixel.Ding-5C7GfCeVMHo@public.gmane.org>
  2017-10-13  8:26   ` [PATCH 1/4] drm/amdgpu: always consider virualised device for checking post Pixel Ding
@ 2017-10-13  8:26   ` Pixel Ding
       [not found]     ` <1507883180-5626-3-git-send-email-Pixel.Ding-5C7GfCeVMHo@public.gmane.org>
  2017-10-13  8:26   ` [PATCH 3/4] drm/amdgpu: report preemption fence via amdgpu_fence_info Pixel Ding
  2017-10-13  8:26   ` [PATCH 4/4] drm/amdgpu: busywait KIQ register accessing Pixel Ding
  3 siblings, 1 reply; 32+ messages in thread
From: Pixel Ding @ 2017-10-13  8:26 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW; +Cc: pding, Bingley.Li-5C7GfCeVMHo

From: pding <Pixel.Ding@amd.com>

The polling memory was standalone in VRAM before, so the HDP flush
introduced latency that hides a VM fault issue. Now polling memory
leverages the WB in system memory and HDP flush is not required, the
VM fault at same page happens.

Add delay back to workaround until the root cause is found.

Tests: VP1 or launch 40 instances of glxinfo at the same time.

Signed-off-by: pding <Pixel.Ding@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c b/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c
index b1de44f..5c4bbe1 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c
@@ -381,6 +381,9 @@ static void sdma_v3_0_ring_set_wptr(struct amdgpu_ring *ring)
 	if (ring->use_doorbell) {
 		/* XXX check if swapping is necessary on BE */
 		adev->wb.wb[ring->wptr_offs] = lower_32_bits(ring->wptr) << 2;
+		/* workaround: VM fault always happen at page 2046 */
+		if (amdgpu_sriov_vf(adev))
+			udelay(500);
 		WDOORBELL32(ring->doorbell_index, lower_32_bits(ring->wptr) << 2);
 	} else {
 		int me = (ring == &ring->adev->sdma.instance[0].ring) ? 0 : 1;
-- 
2.9.5

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH 3/4] drm/amdgpu: report preemption fence via amdgpu_fence_info
       [not found] ` <1507883180-5626-1-git-send-email-Pixel.Ding-5C7GfCeVMHo@public.gmane.org>
  2017-10-13  8:26   ` [PATCH 1/4] drm/amdgpu: always consider virualised device for checking post Pixel Ding
  2017-10-13  8:26   ` [PATCH 2/4] drm/amdgpu: workaround for VM fault caused by SDMA set_wptr Pixel Ding
@ 2017-10-13  8:26   ` Pixel Ding
       [not found]     ` <1507883180-5626-4-git-send-email-Pixel.Ding-5C7GfCeVMHo@public.gmane.org>
  2017-10-13  8:26   ` [PATCH 4/4] drm/amdgpu: busywait KIQ register accessing Pixel Ding
  3 siblings, 1 reply; 32+ messages in thread
From: Pixel Ding @ 2017-10-13  8:26 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW; +Cc: pding, Bingley.Li-5C7GfCeVMHo

From: pding <Pixel.Ding@amd.com>

Only report fence for GFX ring. This can help checking MCBP feature.

Signed-off-by: pding <Pixel.Ding@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
index 09d5a5c..2044758 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
@@ -645,6 +645,13 @@ static int amdgpu_debugfs_fence_info(struct seq_file *m, void *data)
 			   atomic_read(&ring->fence_drv.last_seq));
 		seq_printf(m, "Last emitted        0x%08x\n",
 			   ring->fence_drv.sync_seq);
+
+		if (ring->funcs->type != AMDGPU_RING_TYPE_GFX)
+			break;
+
+		seq_printf(m, "Last preempted      0x%08x\n",
+			   le32_to_cpu(*(ring->fence_drv.cpu_addr + 2)));
+
 	}
 	return 0;
 }
-- 
2.9.5

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH 4/4] drm/amdgpu: busywait KIQ register accessing
       [not found] ` <1507883180-5626-1-git-send-email-Pixel.Ding-5C7GfCeVMHo@public.gmane.org>
                     ` (2 preceding siblings ...)
  2017-10-13  8:26   ` [PATCH 3/4] drm/amdgpu: report preemption fence via amdgpu_fence_info Pixel Ding
@ 2017-10-13  8:26   ` Pixel Ding
       [not found]     ` <1507883180-5626-5-git-send-email-Pixel.Ding-5C7GfCeVMHo@public.gmane.org>
  3 siblings, 1 reply; 32+ messages in thread
From: Pixel Ding @ 2017-10-13  8:26 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW; +Cc: pding, Bingley.Li-5C7GfCeVMHo

From: pding <Pixel.Ding@amd.com>

Register accessing is performed when IRQ is disabled. Never sleep in
this function.

Known issue: dead sleep in many use cases of index/data registers.

Signed-off-by: pding <Pixel.Ding@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h        |  1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  8 ++---
 drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c  |  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c    |  1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h   |  1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c   | 55 ++++++++++++++++++------------
 6 files changed, 40 insertions(+), 28 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index ca212ef..f9de756 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -886,6 +886,7 @@ struct amdgpu_kiq {
 	u64			eop_gpu_addr;
 	struct amdgpu_bo	*eop_obj;
 	struct mutex		ring_mutex;
+	spinlock_t              ring_lock;
 	struct amdgpu_ring	ring;
 	struct amdgpu_irq_src	irq;
 };
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index ab8f0d6..1197274 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -109,10 +109,8 @@ uint32_t amdgpu_mm_rreg(struct amdgpu_device *adev, uint32_t reg,
 {
 	uint32_t ret;
 
-	if (!(acc_flags & AMDGPU_REGS_NO_KIQ) && amdgpu_sriov_runtime(adev)) {
-		BUG_ON(in_interrupt());
+	if (!(acc_flags & AMDGPU_REGS_NO_KIQ) && amdgpu_sriov_runtime(adev))
 		return amdgpu_virt_kiq_rreg(adev, reg);
-	}
 
 	if ((reg * 4) < adev->rmmio_size && !(acc_flags & AMDGPU_REGS_IDX))
 		ret = readl(((void __iomem *)adev->rmmio) + (reg * 4));
@@ -137,10 +135,8 @@ void amdgpu_mm_wreg(struct amdgpu_device *adev, uint32_t reg, uint32_t v,
 		adev->last_mm_index = v;
 	}
 
-	if (!(acc_flags & AMDGPU_REGS_NO_KIQ) && amdgpu_sriov_runtime(adev)) {
-		BUG_ON(in_interrupt());
+	if (!(acc_flags & AMDGPU_REGS_NO_KIQ) && amdgpu_sriov_runtime(adev))
 		return amdgpu_virt_kiq_wreg(adev, reg, v);
-	}
 
 	if ((reg * 4) < adev->rmmio_size && !(acc_flags & AMDGPU_REGS_IDX))
 		writel(v, ((void __iomem *)adev->rmmio) + (reg * 4));
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
index 2044758..c6c7bf3 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
@@ -109,7 +109,7 @@ static void amdgpu_fence_write(struct amdgpu_ring *ring, u32 seq)
  * Reads a fence value from memory (all asics).
  * Returns the value of the fence read from memory.
  */
-static u32 amdgpu_fence_read(struct amdgpu_ring *ring)
+u32 amdgpu_fence_read(struct amdgpu_ring *ring)
 {
 	struct amdgpu_fence_driver *drv = &ring->fence_drv;
 	u32 seq = 0;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
index 4f6c68f..46fa88c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
@@ -186,6 +186,7 @@ int amdgpu_gfx_kiq_init_ring(struct amdgpu_device *adev,
 	int r = 0;
 
 	mutex_init(&kiq->ring_mutex);
+	spin_lock_init(&kiq->ring_lock);
 
 	r = amdgpu_wb_get(adev, &adev->virt.reg_val_offs);
 	if (r)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
index af8e544..a4fa923 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
@@ -89,6 +89,7 @@ int amdgpu_fence_driver_start_ring(struct amdgpu_ring *ring,
 void amdgpu_fence_driver_suspend(struct amdgpu_device *adev);
 void amdgpu_fence_driver_resume(struct amdgpu_device *adev);
 int amdgpu_fence_emit(struct amdgpu_ring *ring, struct dma_fence **fence);
+u32 amdgpu_fence_read(struct amdgpu_ring *ring);
 void amdgpu_fence_process(struct amdgpu_ring *ring);
 int amdgpu_fence_wait_empty(struct amdgpu_ring *ring);
 unsigned amdgpu_fence_count_emitted(struct amdgpu_ring *ring);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
index ab05121..9757df1 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
@@ -22,7 +22,7 @@
  */
 
 #include "amdgpu.h"
-#define MAX_KIQ_REG_WAIT	100000
+#define MAX_KIQ_REG_WAIT	100000000 /* in usecs */
 
 int amdgpu_allocate_static_csa(struct amdgpu_device *adev)
 {
@@ -113,28 +113,32 @@ void amdgpu_virt_init_setting(struct amdgpu_device *adev)
 
 uint32_t amdgpu_virt_kiq_rreg(struct amdgpu_device *adev, uint32_t reg)
 {
-	signed long r;
-	uint32_t val;
-	struct dma_fence *f;
+	signed long timeout = usecs_to_jiffies(MAX_KIQ_REG_WAIT);
+	unsigned long flags;
+	uint32_t val, seq, wait_seq;
 	struct amdgpu_kiq *kiq = &adev->gfx.kiq;
 	struct amdgpu_ring *ring = &kiq->ring;
 
 	BUG_ON(!ring->funcs->emit_rreg);
 
-	mutex_lock(&kiq->ring_mutex);
+	spin_lock_irqsave(&kiq->ring_lock, flags);
 	amdgpu_ring_alloc(ring, 32);
 	amdgpu_ring_emit_rreg(ring, reg);
-	amdgpu_fence_emit(ring, &f);
+	wait_seq = ++ring->fence_drv.sync_seq;
+	amdgpu_ring_emit_fence(ring, ring->fence_drv.gpu_addr,
+			       wait_seq, AMDGPU_FENCE_FLAG_INT);
 	amdgpu_ring_commit(ring);
-	mutex_unlock(&kiq->ring_mutex);
-
-	r = dma_fence_wait_timeout(f, false, msecs_to_jiffies(MAX_KIQ_REG_WAIT));
-	dma_fence_put(f);
-	if (r < 1) {
-		DRM_ERROR("wait for kiq fence error: %ld.\n", r);
+	spin_unlock_irqrestore(&kiq->ring_lock, flags);
+	do {
+		seq = amdgpu_fence_read(ring);
+		udelay(5);
+		timeout -= 5;
+	} while (seq < wait_seq && timeout > 0);
+
+	if (timeout < 1) {
+		DRM_ERROR("wait for kiq fence error: %ld\n", timeout);
 		return ~0;
 	}
-
 	val = adev->wb.wb[adev->virt.reg_val_offs];
 
 	return val;
@@ -142,24 +146,33 @@ uint32_t amdgpu_virt_kiq_rreg(struct amdgpu_device *adev, uint32_t reg)
 
 void amdgpu_virt_kiq_wreg(struct amdgpu_device *adev, uint32_t reg, uint32_t v)
 {
-	signed long r;
+	signed long timeout = usecs_to_jiffies(MAX_KIQ_REG_WAIT);
+	unsigned long flags;
+	uint32_t seq, wait_seq;
 	struct dma_fence *f;
 	struct amdgpu_kiq *kiq = &adev->gfx.kiq;
 	struct amdgpu_ring *ring = &kiq->ring;
 
 	BUG_ON(!ring->funcs->emit_wreg);
 
-	mutex_lock(&kiq->ring_mutex);
+	spin_lock_irqsave(&kiq->ring_lock, flags);
 	amdgpu_ring_alloc(ring, 32);
 	amdgpu_ring_emit_wreg(ring, reg, v);
-	amdgpu_fence_emit(ring, &f);
+	/* amdgpu_fence_emit(ring, &f); */
+	wait_seq = ++ring->fence_drv.sync_seq;
+	amdgpu_ring_emit_fence(ring, ring->fence_drv.gpu_addr,
+			       wait_seq, AMDGPU_FENCE_FLAG_INT);
 	amdgpu_ring_commit(ring);
-	mutex_unlock(&kiq->ring_mutex);
+	spin_unlock_irqrestore(&kiq->ring_lock, flags);
+
+	do {
+		seq = amdgpu_fence_read(ring);
+		udelay(5);
+		timeout -= 5;
+	} while (seq < wait_seq && timeout > 0);
 
-	r = dma_fence_wait_timeout(f, false, msecs_to_jiffies(MAX_KIQ_REG_WAIT));
-	if (r < 1)
-		DRM_ERROR("wait for kiq fence error: %ld.\n", r);
-	dma_fence_put(f);
+	if (timeout < 1)
+		DRM_ERROR("wait for kiq fence error: %ld\n", timeout);
 }
 
 /**
-- 
2.9.5

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* Re: [PATCH 3/4] drm/amdgpu: report preemption fence via amdgpu_fence_info
       [not found]     ` <1507883180-5626-4-git-send-email-Pixel.Ding-5C7GfCeVMHo@public.gmane.org>
@ 2017-10-13  8:34       ` Christian König
       [not found]         ` <7be3d78b-7e68-9f1e-b72f-41d1f0a504c0-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  0 siblings, 1 reply; 32+ messages in thread
From: Christian König @ 2017-10-13  8:34 UTC (permalink / raw)
  To: Pixel Ding, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
  Cc: Bingley.Li-5C7GfCeVMHo

Am 13.10.2017 um 10:26 schrieb Pixel Ding:
> From: pding <Pixel.Ding@amd.com>
>
> Only report fence for GFX ring. This can help checking MCBP feature.
>
> Signed-off-by: pding <Pixel.Ding@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 7 +++++++
>   1 file changed, 7 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
> index 09d5a5c..2044758 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
> @@ -645,6 +645,13 @@ static int amdgpu_debugfs_fence_info(struct seq_file *m, void *data)
>   			   atomic_read(&ring->fence_drv.last_seq));
>   		seq_printf(m, "Last emitted        0x%08x\n",
>   			   ring->fence_drv.sync_seq);
> +
> +		if (ring->funcs->type != AMDGPU_RING_TYPE_GFX)
> +			break;

That should probably be "continue" instead of break, or otherwise you 
don't print the other fences any more.

> +
> +		seq_printf(m, "Last preempted      0x%08x\n",
> +			   le32_to_cpu(*(ring->fence_drv.cpu_addr + 2)));

Is the code to put the preemption fence there already upstream?

If yes do we really do this like that for all supported generations?

Regards,
Christian.

> +
>   	}
>   	return 0;
>   }


_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 3/4] drm/amdgpu: report preemption fence via amdgpu_fence_info
       [not found]         ` <7be3d78b-7e68-9f1e-b72f-41d1f0a504c0-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2017-10-13  8:41           ` Ding, Pixel
       [not found]             ` <6CF4468F-BB11-4E36-91AB-756A6F4B61B3-5C7GfCeVMHo@public.gmane.org>
  0 siblings, 1 reply; 32+ messages in thread
From: Ding, Pixel @ 2017-10-13  8:41 UTC (permalink / raw)
  To: Koenig, Christian, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW; +Cc: Li, Bingley

Thanks Christian,

I’m not sure if I get your point, but yes the preemption fence offset could be changed.

Is it OK to limit this information only for SRIOV VF on Tonga and Vega whose format is known? It can help use to identify if MCBP is working correctly or not.


— 
Sincerely Yours,
Pixel







On 13/10/2017, 4:34 PM, "Christian König" <ckoenig.leichtzumerken@gmail.com> wrote:

>Am 13.10.2017 um 10:26 schrieb Pixel Ding:
>> From: pding <Pixel.Ding@amd.com>
>>
>> Only report fence for GFX ring. This can help checking MCBP feature.
>>
>> Signed-off-by: pding <Pixel.Ding@amd.com>
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 7 +++++++
>>   1 file changed, 7 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
>> index 09d5a5c..2044758 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
>> @@ -645,6 +645,13 @@ static int amdgpu_debugfs_fence_info(struct seq_file *m, void *data)
>>   			   atomic_read(&ring->fence_drv.last_seq));
>>   		seq_printf(m, "Last emitted        0x%08x\n",
>>   			   ring->fence_drv.sync_seq);
>> +
>> +		if (ring->funcs->type != AMDGPU_RING_TYPE_GFX)
>> +			break;
>
>That should probably be "continue" instead of break, or otherwise you 
>don't print the other fences any more.
>
>> +
>> +		seq_printf(m, "Last preempted      0x%08x\n",
>> +			   le32_to_cpu(*(ring->fence_drv.cpu_addr + 2)));
>
>Is the code to put the preemption fence there already upstream?
>
>If yes do we really do this like that for all supported generations?
>
>Regards,
>Christian.
>
>> +
>>   	}
>>   	return 0;
>>   }
>
>
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 3/4] drm/amdgpu: report preemption fence via amdgpu_fence_info
       [not found]             ` <6CF4468F-BB11-4E36-91AB-756A6F4B61B3-5C7GfCeVMHo@public.gmane.org>
@ 2017-10-13  8:51               ` Christian König
       [not found]                 ` <95665da9-4204-739c-8b69-ceec64958be0-5C7GfCeVMHo@public.gmane.org>
  0 siblings, 1 reply; 32+ messages in thread
From: Christian König @ 2017-10-13  8:51 UTC (permalink / raw)
  To: Ding, Pixel, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW; +Cc: Li, Bingley

> Is it OK to limit this information only for SRIOV VF on Tonga and Vega whose format is known? It can help use to identify if MCBP is working correctly or not.
The question is where is this code for Tonga and Vega? I can't find a 
reference to fence_offs in the GFX8 nor GFX9 code we have in 
amd-staging-drm-next.

And if it doesn't depend on the fence_offs in the ring printing it like 
this is just pure speculation.

Regards,
Christian.

Am 13.10.2017 um 10:41 schrieb Ding, Pixel:
> Thanks Christian,
>
> I’m not sure if I get your point, but yes the preemption fence offset could be changed.
>
> Is it OK to limit this information only for SRIOV VF on Tonga and Vega whose format is known? It can help use to identify if MCBP is working correctly or not.
>
>
> —
> Sincerely Yours,
> Pixel
>
>
>
>
>
>
>
> On 13/10/2017, 4:34 PM, "Christian König" <ckoenig.leichtzumerken@gmail.com> wrote:
>
>> Am 13.10.2017 um 10:26 schrieb Pixel Ding:
>>> From: pding <Pixel.Ding@amd.com>
>>>
>>> Only report fence for GFX ring. This can help checking MCBP feature.
>>>
>>> Signed-off-by: pding <Pixel.Ding@amd.com>
>>> ---
>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 7 +++++++
>>>    1 file changed, 7 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
>>> index 09d5a5c..2044758 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
>>> @@ -645,6 +645,13 @@ static int amdgpu_debugfs_fence_info(struct seq_file *m, void *data)
>>>    			   atomic_read(&ring->fence_drv.last_seq));
>>>    		seq_printf(m, "Last emitted        0x%08x\n",
>>>    			   ring->fence_drv.sync_seq);
>>> +
>>> +		if (ring->funcs->type != AMDGPU_RING_TYPE_GFX)
>>> +			break;
>> That should probably be "continue" instead of break, or otherwise you
>> don't print the other fences any more.
>>
>>> +
>>> +		seq_printf(m, "Last preempted      0x%08x\n",
>>> +			   le32_to_cpu(*(ring->fence_drv.cpu_addr + 2)));
>> Is the code to put the preemption fence there already upstream?
>>
>> If yes do we really do this like that for all supported generations?
>>
>> Regards,
>> Christian.
>>
>>> +
>>>    	}
>>>    	return 0;
>>>    }
>>

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 3/4] drm/amdgpu: report preemption fence via amdgpu_fence_info
       [not found]                 ` <95665da9-4204-739c-8b69-ceec64958be0-5C7GfCeVMHo@public.gmane.org>
@ 2017-10-13  9:03                   ` Ding, Pixel
       [not found]                     ` <CDFBCA9E-4A71-4E40-A41B-1986811903C6-5C7GfCeVMHo@public.gmane.org>
  0 siblings, 1 reply; 32+ messages in thread
From: Ding, Pixel @ 2017-10-13  9:03 UTC (permalink / raw)
  To: Koenig, Christian, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW; +Cc: Li, Bingley

[-- Attachment #1: Type: text/plain, Size: 2778 bytes --]

My understanding is that CP will write seqno back to preempted fence offset when preemption occurred. Then if there is a value here we can generally know packet with which fence is preempted. Should driver handle any other thing for this?
					
				
			
		
	


— 
Sincerely Yours,
Pixel







On 13/10/2017, 4:51 PM, "Koenig, Christian" <Christian.Koenig@amd.com> wrote:

>> Is it OK to limit this information only for SRIOV VF on Tonga and Vega whose format is known? It can help use to identify if MCBP is working correctly or not.
>The question is where is this code for Tonga and Vega? I can't find a 
>reference to fence_offs in the GFX8 nor GFX9 code we have in 
>amd-staging-drm-next.
>
>And if it doesn't depend on the fence_offs in the ring printing it like 
>this is just pure speculation.
>
>Regards,
>Christian.
>
>Am 13.10.2017 um 10:41 schrieb Ding, Pixel:
>> Thanks Christian,
>>
>> I’m not sure if I get your point, but yes the preemption fence offset could be changed.
>>
>> Is it OK to limit this information only for SRIOV VF on Tonga and Vega whose format is known? It can help use to identify if MCBP is working correctly or not.
>>
>>
>> —
>> Sincerely Yours,
>> Pixel
>>
>>
>>
>>
>>
>>
>>
>> On 13/10/2017, 4:34 PM, "Christian König" <ckoenig.leichtzumerken@gmail.com> wrote:
>>
>>> Am 13.10.2017 um 10:26 schrieb Pixel Ding:
>>>> From: pding <Pixel.Ding@amd.com>
>>>>
>>>> Only report fence for GFX ring. This can help checking MCBP feature.
>>>>
>>>> Signed-off-by: pding <Pixel.Ding@amd.com>
>>>> ---
>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 7 +++++++
>>>>    1 file changed, 7 insertions(+)
>>>>
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
>>>> index 09d5a5c..2044758 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
>>>> @@ -645,6 +645,13 @@ static int amdgpu_debugfs_fence_info(struct seq_file *m, void *data)
>>>>    			   atomic_read(&ring->fence_drv.last_seq));
>>>>    		seq_printf(m, "Last emitted        0x%08x\n",
>>>>    			   ring->fence_drv.sync_seq);
>>>> +
>>>> +		if (ring->funcs->type != AMDGPU_RING_TYPE_GFX)
>>>> +			break;
>>> That should probably be "continue" instead of break, or otherwise you
>>> don't print the other fences any more.
>>>
>>>> +
>>>> +		seq_printf(m, "Last preempted      0x%08x\n",
>>>> +			   le32_to_cpu(*(ring->fence_drv.cpu_addr + 2)));
>>> Is the code to put the preemption fence there already upstream?
>>>
>>> If yes do we really do this like that for all supported generations?
>>>
>>> Regards,
>>> Christian.
>>>
>>>> +
>>>>    	}
>>>>    	return 0;
>>>>    }
>>>
>

[-- Attachment #2: default.png --]
[-- Type: image/png, Size: 86 bytes --]

[-- Attachment #3: default[CIwH].png --]
[-- Type: image/png, Size: 86 bytes --]

[-- Attachment #4: Type: text/plain, Size: 154 bytes --]

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 3/4] drm/amdgpu: report preemption fence via amdgpu_fence_info
       [not found]                     ` <CDFBCA9E-4A71-4E40-A41B-1986811903C6-5C7GfCeVMHo@public.gmane.org>
@ 2017-10-13  9:09                       ` Ding, Pixel
  2017-10-13  9:10                       ` Christian König
  1 sibling, 0 replies; 32+ messages in thread
From: Ding, Pixel @ 2017-10-13  9:09 UTC (permalink / raw)
  To: Koenig, Christian, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW; +Cc: Li, Bingley

This patch is not for implementation of MCBP handled by driver itself. In SRIOV use case the preemption is handle in host layer. What I do here is just check if the preemption occurs, so there’s no code to handle it.

—
Sincerely Yours,
Pixel







On 13/10/2017, 5:03 PM, "Ding, Pixel" <Pixel.Ding@amd.com> wrote:

>My understanding is that CP will write seqno back to preempted fence offset when preemption occurred. Then if there is a value here we can generally know packet with which fence is preempted. Should driver handle any other thing for this?
>					
>				
>			
>		
>	
>
>
>— 
>Sincerely Yours,
>Pixel
>
>
>
>
>
>
>
>On 13/10/2017, 4:51 PM, "Koenig, Christian" <Christian.Koenig@amd.com> wrote:
>
>>> Is it OK to limit this information only for SRIOV VF on Tonga and Vega whose format is known? It can help use to identify if MCBP is working correctly or not.
>>The question is where is this code for Tonga and Vega? I can't find a 
>>reference to fence_offs in the GFX8 nor GFX9 code we have in 
>>amd-staging-drm-next.
>>
>>And if it doesn't depend on the fence_offs in the ring printing it like 
>>this is just pure speculation.
>>
>>Regards,
>>Christian.
>>
>>Am 13.10.2017 um 10:41 schrieb Ding, Pixel:
>>> Thanks Christian,
>>>
>>> I’m not sure if I get your point, but yes the preemption fence offset could be changed.
>>>
>>> Is it OK to limit this information only for SRIOV VF on Tonga and Vega whose format is known? It can help use to identify if MCBP is working correctly or not.
>>>
>>>
>>> —
>>> Sincerely Yours,
>>> Pixel
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> On 13/10/2017, 4:34 PM, "Christian König" <ckoenig.leichtzumerken@gmail.com> wrote:
>>>
>>>> Am 13.10.2017 um 10:26 schrieb Pixel Ding:
>>>>> From: pding <Pixel.Ding@amd.com>
>>>>>
>>>>> Only report fence for GFX ring. This can help checking MCBP feature.
>>>>>
>>>>> Signed-off-by: pding <Pixel.Ding@amd.com>
>>>>> ---
>>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 7 +++++++
>>>>>    1 file changed, 7 insertions(+)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
>>>>> index 09d5a5c..2044758 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
>>>>> @@ -645,6 +645,13 @@ static int amdgpu_debugfs_fence_info(struct seq_file *m, void *data)
>>>>>    			   atomic_read(&ring->fence_drv.last_seq));
>>>>>    		seq_printf(m, "Last emitted        0x%08x\n",
>>>>>    			   ring->fence_drv.sync_seq);
>>>>> +
>>>>> +		if (ring->funcs->type != AMDGPU_RING_TYPE_GFX)
>>>>> +			break;
>>>> That should probably be "continue" instead of break, or otherwise you
>>>> don't print the other fences any more.
>>>>
>>>>> +
>>>>> +		seq_printf(m, "Last preempted      0x%08x\n",
>>>>> +			   le32_to_cpu(*(ring->fence_drv.cpu_addr + 2)));
>>>> Is the code to put the preemption fence there already upstream?
>>>>
>>>> If yes do we really do this like that for all supported generations?
>>>>
>>>> Regards,
>>>> Christian.
>>>>
>>>>> +
>>>>>    	}
>>>>>    	return 0;
>>>>>    }
>>>>
>>
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 3/4] drm/amdgpu: report preemption fence via amdgpu_fence_info
       [not found]                     ` <CDFBCA9E-4A71-4E40-A41B-1986811903C6-5C7GfCeVMHo@public.gmane.org>
  2017-10-13  9:09                       ` Ding, Pixel
@ 2017-10-13  9:10                       ` Christian König
       [not found]                         ` <75ae098d-234b-fe3a-dfea-18de99b2aa3c-5C7GfCeVMHo@public.gmane.org>
  1 sibling, 1 reply; 32+ messages in thread
From: Christian König @ 2017-10-13  9:10 UTC (permalink / raw)
  To: Ding, Pixel, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW; +Cc: Li, Bingley

Hi Pixel,

> My understanding is that CP will write seqno back to preempted fence offset when preemption occurred.
That is correct.

But my question is why you want to print a different value here:
> +		seq_printf(m, "Last preempted      0x%08x\n",
> +			   le32_to_cpu(*(ring->fence_drv.cpu_addr + 2)));

That code prints two dw after the address where the CPU writes the 
seqno. Where is the code which setups the CP to do this?

As far as I can see that line should always print just zero (or some 
random number if the memory is actually not initialized to anything).

Regards,
Christian.

Am 13.10.2017 um 11:03 schrieb Ding, Pixel:
> My understanding is that CP will write seqno back to preempted fence offset when preemption occurred. Then if there is a value here we can generally know packet with which fence is preempted. Should driver handle any other thing for this?
> 					
> 				
> 			
> 		
> 	
>
>
> —
> Sincerely Yours,
> Pixel
>
>
>
>
>
>
>
> On 13/10/2017, 4:51 PM, "Koenig, Christian" <Christian.Koenig@amd.com> wrote:
>
>>> Is it OK to limit this information only for SRIOV VF on Tonga and Vega whose format is known? It can help use to identify if MCBP is working correctly or not.
>> The question is where is this code for Tonga and Vega? I can't find a
>> reference to fence_offs in the GFX8 nor GFX9 code we have in
>> amd-staging-drm-next.
>>
>> And if it doesn't depend on the fence_offs in the ring printing it like
>> this is just pure speculation.
>>
>> Regards,
>> Christian.
>>
>> Am 13.10.2017 um 10:41 schrieb Ding, Pixel:
>>> Thanks Christian,
>>>
>>> I’m not sure if I get your point, but yes the preemption fence offset could be changed.
>>>
>>> Is it OK to limit this information only for SRIOV VF on Tonga and Vega whose format is known? It can help use to identify if MCBP is working correctly or not.
>>>
>>>
>>> —
>>> Sincerely Yours,
>>> Pixel
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> On 13/10/2017, 4:34 PM, "Christian König" <ckoenig.leichtzumerken@gmail.com> wrote:
>>>
>>>> Am 13.10.2017 um 10:26 schrieb Pixel Ding:
>>>>> From: pding <Pixel.Ding@amd.com>
>>>>>
>>>>> Only report fence for GFX ring. This can help checking MCBP feature.
>>>>>
>>>>> Signed-off-by: pding <Pixel.Ding@amd.com>
>>>>> ---
>>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 7 +++++++
>>>>>     1 file changed, 7 insertions(+)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
>>>>> index 09d5a5c..2044758 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
>>>>> @@ -645,6 +645,13 @@ static int amdgpu_debugfs_fence_info(struct seq_file *m, void *data)
>>>>>     			   atomic_read(&ring->fence_drv.last_seq));
>>>>>     		seq_printf(m, "Last emitted        0x%08x\n",
>>>>>     			   ring->fence_drv.sync_seq);
>>>>> +
>>>>> +		if (ring->funcs->type != AMDGPU_RING_TYPE_GFX)
>>>>> +			break;
>>>> That should probably be "continue" instead of break, or otherwise you
>>>> don't print the other fences any more.
>>>>
>>>>> +
>>>>> +		seq_printf(m, "Last preempted      0x%08x\n",
>>>>> +			   le32_to_cpu(*(ring->fence_drv.cpu_addr + 2)));
>>>> Is the code to put the preemption fence there already upstream?
>>>>
>>>> If yes do we really do this like that for all supported generations?
>>>>
>>>> Regards,
>>>> Christian.
>>>>
>>>>> +
>>>>>     	}
>>>>>     	return 0;
>>>>>     }


_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 4/4] drm/amdgpu: busywait KIQ register accessing
       [not found]     ` <1507883180-5626-5-git-send-email-Pixel.Ding-5C7GfCeVMHo@public.gmane.org>
@ 2017-10-13  9:16       ` Christian König
       [not found]         ` <d800f449-8155-34a5-b9c3-c17ba0114ca1-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  2017-10-13  9:34       ` Liu, Monk
  1 sibling, 1 reply; 32+ messages in thread
From: Christian König @ 2017-10-13  9:16 UTC (permalink / raw)
  To: Pixel Ding, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
  Cc: Bingley.Li-5C7GfCeVMHo

Am 13.10.2017 um 10:26 schrieb Pixel Ding:
> From: pding <Pixel.Ding@amd.com>
>
> Register accessing is performed when IRQ is disabled. Never sleep in
> this function.
>
> Known issue: dead sleep in many use cases of index/data registers.
>
> Signed-off-by: pding <Pixel.Ding@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu.h        |  1 +
>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  8 ++---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c  |  2 +-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c    |  1 +
>   drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h   |  1 +
>   drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c   | 55 ++++++++++++++++++------------
>   6 files changed, 40 insertions(+), 28 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> index ca212ef..f9de756 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> @@ -886,6 +886,7 @@ struct amdgpu_kiq {
>   	u64			eop_gpu_addr;
>   	struct amdgpu_bo	*eop_obj;
>   	struct mutex		ring_mutex;

You can remove the ring_mutex if you don't use it any more.

> +	spinlock_t              ring_lock;
>   	struct amdgpu_ring	ring;
>   	struct amdgpu_irq_src	irq;
>   };
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index ab8f0d6..1197274 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -109,10 +109,8 @@ uint32_t amdgpu_mm_rreg(struct amdgpu_device *adev, uint32_t reg,
>   {
>   	uint32_t ret;
>   
> -	if (!(acc_flags & AMDGPU_REGS_NO_KIQ) && amdgpu_sriov_runtime(adev)) {
> -		BUG_ON(in_interrupt());
> +	if (!(acc_flags & AMDGPU_REGS_NO_KIQ) && amdgpu_sriov_runtime(adev))
>   		return amdgpu_virt_kiq_rreg(adev, reg);
> -	}
>   
>   	if ((reg * 4) < adev->rmmio_size && !(acc_flags & AMDGPU_REGS_IDX))
>   		ret = readl(((void __iomem *)adev->rmmio) + (reg * 4));
> @@ -137,10 +135,8 @@ void amdgpu_mm_wreg(struct amdgpu_device *adev, uint32_t reg, uint32_t v,
>   		adev->last_mm_index = v;
>   	}
>   
> -	if (!(acc_flags & AMDGPU_REGS_NO_KIQ) && amdgpu_sriov_runtime(adev)) {
> -		BUG_ON(in_interrupt());
> +	if (!(acc_flags & AMDGPU_REGS_NO_KIQ) && amdgpu_sriov_runtime(adev))
>   		return amdgpu_virt_kiq_wreg(adev, reg, v);
> -	}
>   
>   	if ((reg * 4) < adev->rmmio_size && !(acc_flags & AMDGPU_REGS_IDX))
>   		writel(v, ((void __iomem *)adev->rmmio) + (reg * 4));
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
> index 2044758..c6c7bf3 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
> @@ -109,7 +109,7 @@ static void amdgpu_fence_write(struct amdgpu_ring *ring, u32 seq)
>    * Reads a fence value from memory (all asics).
>    * Returns the value of the fence read from memory.
>    */
> -static u32 amdgpu_fence_read(struct amdgpu_ring *ring)
> +u32 amdgpu_fence_read(struct amdgpu_ring *ring)
>   {
>   	struct amdgpu_fence_driver *drv = &ring->fence_drv;
>   	u32 seq = 0;
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> index 4f6c68f..46fa88c 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> @@ -186,6 +186,7 @@ int amdgpu_gfx_kiq_init_ring(struct amdgpu_device *adev,
>   	int r = 0;
>   
>   	mutex_init(&kiq->ring_mutex);
> +	spin_lock_init(&kiq->ring_lock);
>   
>   	r = amdgpu_wb_get(adev, &adev->virt.reg_val_offs);
>   	if (r)
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
> index af8e544..a4fa923 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
> @@ -89,6 +89,7 @@ int amdgpu_fence_driver_start_ring(struct amdgpu_ring *ring,
>   void amdgpu_fence_driver_suspend(struct amdgpu_device *adev);
>   void amdgpu_fence_driver_resume(struct amdgpu_device *adev);
>   int amdgpu_fence_emit(struct amdgpu_ring *ring, struct dma_fence **fence);
> +u32 amdgpu_fence_read(struct amdgpu_ring *ring);
>   void amdgpu_fence_process(struct amdgpu_ring *ring);
>   int amdgpu_fence_wait_empty(struct amdgpu_ring *ring);
>   unsigned amdgpu_fence_count_emitted(struct amdgpu_ring *ring);
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
> index ab05121..9757df1 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
> @@ -22,7 +22,7 @@
>    */
>   
>   #include "amdgpu.h"
> -#define MAX_KIQ_REG_WAIT	100000
> +#define MAX_KIQ_REG_WAIT	100000000 /* in usecs */
You want to busy wait 100 seconds for an result?

Forget it that is way to long you will certainly run into problems with 
the NMI much earlier.

>   
>   int amdgpu_allocate_static_csa(struct amdgpu_device *adev)
>   {
> @@ -113,28 +113,32 @@ void amdgpu_virt_init_setting(struct amdgpu_device *adev)
>   
>   uint32_t amdgpu_virt_kiq_rreg(struct amdgpu_device *adev, uint32_t reg)
>   {
> -	signed long r;
> -	uint32_t val;
> -	struct dma_fence *f;
> +	signed long timeout = usecs_to_jiffies(MAX_KIQ_REG_WAIT);
> +	unsigned long flags;
> +	uint32_t val, seq, wait_seq;
>   	struct amdgpu_kiq *kiq = &adev->gfx.kiq;
>   	struct amdgpu_ring *ring = &kiq->ring;
>   
>   	BUG_ON(!ring->funcs->emit_rreg);
>   
> -	mutex_lock(&kiq->ring_mutex);
> +	spin_lock_irqsave(&kiq->ring_lock, flags);
>   	amdgpu_ring_alloc(ring, 32);
>   	amdgpu_ring_emit_rreg(ring, reg);
> -	amdgpu_fence_emit(ring, &f);
> +	wait_seq = ++ring->fence_drv.sync_seq;
> +	amdgpu_ring_emit_fence(ring, ring->fence_drv.gpu_addr,
> +			       wait_seq, AMDGPU_FENCE_FLAG_INT);

Do you still need to send an interrupt when you busy wait? I don't think so.

>   	amdgpu_ring_commit(ring);
> -	mutex_unlock(&kiq->ring_mutex);
> -
> -	r = dma_fence_wait_timeout(f, false, msecs_to_jiffies(MAX_KIQ_REG_WAIT));
> -	dma_fence_put(f);
> -	if (r < 1) {
> -		DRM_ERROR("wait for kiq fence error: %ld.\n", r);
> +	spin_unlock_irqrestore(&kiq->ring_lock, flags);
> +	do {
> +		seq = amdgpu_fence_read(ring);
> +		udelay(5);
> +		timeout -= 5;
> +	} while (seq < wait_seq && timeout > 0);

You need to correctly handle sequence wrap around here.

> +
> +	if (timeout < 1) {
> +		DRM_ERROR("wait for kiq fence error: %ld\n", timeout);
>   		return ~0;
>   	}
> -
>   	val = adev->wb.wb[adev->virt.reg_val_offs];
>   
>   	return val;
> @@ -142,24 +146,33 @@ uint32_t amdgpu_virt_kiq_rreg(struct amdgpu_device *adev, uint32_t reg)
>   
>   void amdgpu_virt_kiq_wreg(struct amdgpu_device *adev, uint32_t reg, uint32_t v)
>   {
> -	signed long r;
> +	signed long timeout = usecs_to_jiffies(MAX_KIQ_REG_WAIT);
> +	unsigned long flags;
> +	uint32_t seq, wait_seq;
>   	struct dma_fence *f;
>   	struct amdgpu_kiq *kiq = &adev->gfx.kiq;
>   	struct amdgpu_ring *ring = &kiq->ring;
>   
>   	BUG_ON(!ring->funcs->emit_wreg);
>   
> -	mutex_lock(&kiq->ring_mutex);
> +	spin_lock_irqsave(&kiq->ring_lock, flags);
>   	amdgpu_ring_alloc(ring, 32);
>   	amdgpu_ring_emit_wreg(ring, reg, v);
> -	amdgpu_fence_emit(ring, &f);
> +	/* amdgpu_fence_emit(ring, &f); */
> +	wait_seq = ++ring->fence_drv.sync_seq;
> +	amdgpu_ring_emit_fence(ring, ring->fence_drv.gpu_addr,
> +			       wait_seq, AMDGPU_FENCE_FLAG_INT);
>   	amdgpu_ring_commit(ring);
> -	mutex_unlock(&kiq->ring_mutex);
> +	spin_unlock_irqrestore(&kiq->ring_lock, flags);
> +
> +	do {
> +		seq = amdgpu_fence_read(ring);
> +		udelay(5);
> +		timeout -= 5;
> +	} while (seq < wait_seq && timeout > 0);

Same here. It would probably be better to have a function for this in 
amdgpu_fence.c

Regards,
Christian.

>   
> -	r = dma_fence_wait_timeout(f, false, msecs_to_jiffies(MAX_KIQ_REG_WAIT));
> -	if (r < 1)
> -		DRM_ERROR("wait for kiq fence error: %ld.\n", r);
> -	dma_fence_put(f);
> +	if (timeout < 1)
> +		DRM_ERROR("wait for kiq fence error: %ld\n", timeout);
>   }
>   
>   /**


_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 2/4] drm/amdgpu: workaround for VM fault caused by SDMA set_wptr
       [not found]     ` <1507883180-5626-3-git-send-email-Pixel.Ding-5C7GfCeVMHo@public.gmane.org>
@ 2017-10-13  9:17       ` Christian König
       [not found]         ` <536fdd74-43d8-1aaf-ca7c-b3d519b8486c-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  0 siblings, 1 reply; 32+ messages in thread
From: Christian König @ 2017-10-13  9:17 UTC (permalink / raw)
  To: Pixel Ding, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
  Cc: Bingley.Li-5C7GfCeVMHo

Am 13.10.2017 um 10:26 schrieb Pixel Ding:
> From: pding <Pixel.Ding@amd.com>
>
> The polling memory was standalone in VRAM before, so the HDP flush
> introduced latency that hides a VM fault issue. Now polling memory
> leverages the WB in system memory and HDP flush is not required, the
> VM fault at same page happens.
>
> Add delay back to workaround until the root cause is found.
>
> Tests: VP1 or launch 40 instances of glxinfo at the same time.
>
> Signed-off-by: pding <Pixel.Ding@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c | 3 +++
>   1 file changed, 3 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c b/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c
> index b1de44f..5c4bbe1 100644
> --- a/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c
> @@ -381,6 +381,9 @@ static void sdma_v3_0_ring_set_wptr(struct amdgpu_ring *ring)
>   	if (ring->use_doorbell) {
>   		/* XXX check if swapping is necessary on BE */
>   		adev->wb.wb[ring->wptr_offs] = lower_32_bits(ring->wptr) << 2;
> +		/* workaround: VM fault always happen at page 2046 */
> +		if (amdgpu_sriov_vf(adev))
> +			udelay(500);

Have you tried using a memory barrier here?

That looks like it will have massive impact on performance.

Regards,
Christian.

>   		WDOORBELL32(ring->doorbell_index, lower_32_bits(ring->wptr) << 2);
>   	} else {
>   		int me = (ring == &ring->adev->sdma.instance[0].ring) ? 0 : 1;


_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 3/4] drm/amdgpu: report preemption fence via amdgpu_fence_info
       [not found]                         ` <75ae098d-234b-fe3a-dfea-18de99b2aa3c-5C7GfCeVMHo@public.gmane.org>
@ 2017-10-13  9:18                           ` Ding, Pixel
       [not found]                             ` <FE29B287-93F7-427A-8A3E-8DEE50BC888C-5C7GfCeVMHo@public.gmane.org>
  2017-10-13  9:26                           ` Liu, Monk
  2017-10-13  9:28                           ` Liu, Monk
  2 siblings, 1 reply; 32+ messages in thread
From: Ding, Pixel @ 2017-10-13  9:18 UTC (permalink / raw)
  To: Koenig, Christian, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Liu, Monk
  Cc: Li, Bingley

[-- Attachment #1: Type: text/plain, Size: 4379 bytes --]

OK I get it…
when we use the fence_offs to submit fence to HW, in fact it’s a 8 DW fence not a 2 DW.

The format is:
Completed Fence      0x0 Fence written here if frame completed normally
Preempted Fence      0x2 Bit set in CP_VMID_PREEMPT and preemption occurred
Reset Fence          0x4 Bit is set in CP_VMID_RESET and reset occurred
Preempted then Reset 0x6 Both preemption and reset occurred

					
				
			
		
	
Then I notice that the 8DW wb allocation patch is missed on staging driver.


Hi Monk,

Can you take a look? I thought the 8DW WB patch is already upstream.

— 
Sincerely Yours,
Pixel







On 13/10/2017, 5:10 PM, "Koenig, Christian" <Christian.Koenig@amd.com> wrote:

>Hi Pixel,
>
>> My understanding is that CP will write seqno back to preempted fence offset when preemption occurred.
>That is correct.
>
>But my question is why you want to print a different value here:
>> +		seq_printf(m, "Last preempted      0x%08x\n",
>> +			   le32_to_cpu(*(ring->fence_drv.cpu_addr + 2)));
>
>That code prints two dw after the address where the CPU writes the 
>seqno. Where is the code which setups the CP to do this?
>
>As far as I can see that line should always print just zero (or some 
>random number if the memory is actually not initialized to anything).
>
>Regards,
>Christian.
>
>Am 13.10.2017 um 11:03 schrieb Ding, Pixel:
>> My understanding is that CP will write seqno back to preempted fence offset when preemption occurred. Then if there is a value here we can generally know packet with which fence is preempted. Should driver handle any other thing for this?
>> 					
>> 				
>> 			
>> 		
>> 	
>>
>>
>> —
>> Sincerely Yours,
>> Pixel
>>
>>
>>
>>
>>
>>
>>
>> On 13/10/2017, 4:51 PM, "Koenig, Christian" <Christian.Koenig@amd.com> wrote:
>>
>>>> Is it OK to limit this information only for SRIOV VF on Tonga and Vega whose format is known? It can help use to identify if MCBP is working correctly or not.
>>> The question is where is this code for Tonga and Vega? I can't find a
>>> reference to fence_offs in the GFX8 nor GFX9 code we have in
>>> amd-staging-drm-next.
>>>
>>> And if it doesn't depend on the fence_offs in the ring printing it like
>>> this is just pure speculation.
>>>
>>> Regards,
>>> Christian.
>>>
>>> Am 13.10.2017 um 10:41 schrieb Ding, Pixel:
>>>> Thanks Christian,
>>>>
>>>> I’m not sure if I get your point, but yes the preemption fence offset could be changed.
>>>>
>>>> Is it OK to limit this information only for SRIOV VF on Tonga and Vega whose format is known? It can help use to identify if MCBP is working correctly or not.
>>>>
>>>>
>>>> —
>>>> Sincerely Yours,
>>>> Pixel
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On 13/10/2017, 4:34 PM, "Christian König" <ckoenig.leichtzumerken@gmail.com> wrote:
>>>>
>>>>> Am 13.10.2017 um 10:26 schrieb Pixel Ding:
>>>>>> From: pding <Pixel.Ding@amd.com>
>>>>>>
>>>>>> Only report fence for GFX ring. This can help checking MCBP feature.
>>>>>>
>>>>>> Signed-off-by: pding <Pixel.Ding@amd.com>
>>>>>> ---
>>>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 7 +++++++
>>>>>>     1 file changed, 7 insertions(+)
>>>>>>
>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
>>>>>> index 09d5a5c..2044758 100644
>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
>>>>>> @@ -645,6 +645,13 @@ static int amdgpu_debugfs_fence_info(struct seq_file *m, void *data)
>>>>>>     			   atomic_read(&ring->fence_drv.last_seq));
>>>>>>     		seq_printf(m, "Last emitted        0x%08x\n",
>>>>>>     			   ring->fence_drv.sync_seq);
>>>>>> +
>>>>>> +		if (ring->funcs->type != AMDGPU_RING_TYPE_GFX)
>>>>>> +			break;
>>>>> That should probably be "continue" instead of break, or otherwise you
>>>>> don't print the other fences any more.
>>>>>
>>>>>> +
>>>>>> +		seq_printf(m, "Last preempted      0x%08x\n",
>>>>>> +			   le32_to_cpu(*(ring->fence_drv.cpu_addr + 2)));
>>>>> Is the code to put the preemption fence there already upstream?
>>>>>
>>>>> If yes do we really do this like that for all supported generations?
>>>>>
>>>>> Regards,
>>>>> Christian.
>>>>>
>>>>>> +
>>>>>>     	}
>>>>>>     	return 0;
>>>>>>     }
>
>

[-- Attachment #2: default[RtGN].png --]
[-- Type: image/png, Size: 86 bytes --]

[-- Attachment #3: default[LOie].png --]
[-- Type: image/png, Size: 86 bytes --]

[-- Attachment #4: Type: text/plain, Size: 154 bytes --]

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 32+ messages in thread

* RE: [PATCH 3/4] drm/amdgpu: report preemption fence via amdgpu_fence_info
       [not found]                             ` <FE29B287-93F7-427A-8A3E-8DEE50BC888C-5C7GfCeVMHo@public.gmane.org>
@ 2017-10-13  9:19                               ` Liu, Monk
       [not found]                                 ` <BLUPR12MB04491225CA802A960F4C752784480-7LeqcoF/hwpTIQvHjXdJlwdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
  2017-10-13  9:25                               ` Liu, Monk
  1 sibling, 1 reply; 32+ messages in thread
From: Liu, Monk @ 2017-10-13  9:19 UTC (permalink / raw)
  To: Ding, Pixel, Koenig, Christian, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
  Cc: Li, Bingley

Pixel,

On drm-next, we always allocate 8DW for all WB, you can check the wb_get routine on detail

BR Monk

-----Original Message-----
From: Ding, Pixel 
Sent: 2017年10月13日 17:19
To: Koenig, Christian <Christian.Koenig@amd.com>; amd-gfx@lists.freedesktop.org; Liu, Monk <Monk.Liu@amd.com>
Cc: Li, Bingley <Bingley.Li@amd.com>
Subject: Re: [PATCH 3/4] drm/amdgpu: report preemption fence via amdgpu_fence_info

OK I get it…
when we use the fence_offs to submit fence to HW, in fact it’s a 8 DW fence not a 2 DW.

The format is:
Completed Fence      0x0 Fence written here if frame completed normally
Preempted Fence      0x2 Bit set in CP_VMID_PREEMPT and preemption occurred
Reset Fence          0x4 Bit is set in CP_VMID_RESET and reset occurred
Preempted then Reset 0x6 Both preemption and reset occurred

					
				
			
		
	
Then I notice that the 8DW wb allocation patch is missed on staging driver.


Hi Monk,

Can you take a look? I thought the 8DW WB patch is already upstream.

— 
Sincerely Yours,
Pixel







On 13/10/2017, 5:10 PM, "Koenig, Christian" <Christian.Koenig@amd.com> wrote:

>Hi Pixel,
>
>> My understanding is that CP will write seqno back to preempted fence offset when preemption occurred.
>That is correct.
>
>But my question is why you want to print a different value here:
>> +		seq_printf(m, "Last preempted      0x%08x\n",
>> +			   le32_to_cpu(*(ring->fence_drv.cpu_addr + 2)));
>
>That code prints two dw after the address where the CPU writes the 
>seqno. Where is the code which setups the CP to do this?
>
>As far as I can see that line should always print just zero (or some 
>random number if the memory is actually not initialized to anything).
>
>Regards,
>Christian.
>
>Am 13.10.2017 um 11:03 schrieb Ding, Pixel:
>> My understanding is that CP will write seqno back to preempted fence offset when preemption occurred. Then if there is a value here we can generally know packet with which fence is preempted. Should driver handle any other thing for this?
>> 					
>> 				
>> 			
>> 		
>> 	
>>
>>
>> —
>> Sincerely Yours,
>> Pixel
>>
>>
>>
>>
>>
>>
>>
>> On 13/10/2017, 4:51 PM, "Koenig, Christian" <Christian.Koenig@amd.com> wrote:
>>
>>>> Is it OK to limit this information only for SRIOV VF on Tonga and Vega whose format is known? It can help use to identify if MCBP is working correctly or not.
>>> The question is where is this code for Tonga and Vega? I can't find a
>>> reference to fence_offs in the GFX8 nor GFX9 code we have in
>>> amd-staging-drm-next.
>>>
>>> And if it doesn't depend on the fence_offs in the ring printing it like
>>> this is just pure speculation.
>>>
>>> Regards,
>>> Christian.
>>>
>>> Am 13.10.2017 um 10:41 schrieb Ding, Pixel:
>>>> Thanks Christian,
>>>>
>>>> I’m not sure if I get your point, but yes the preemption fence offset could be changed.
>>>>
>>>> Is it OK to limit this information only for SRIOV VF on Tonga and Vega whose format is known? It can help use to identify if MCBP is working correctly or not.
>>>>
>>>>
>>>> —
>>>> Sincerely Yours,
>>>> Pixel
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On 13/10/2017, 4:34 PM, "Christian König" <ckoenig.leichtzumerken@gmail.com> wrote:
>>>>
>>>>> Am 13.10.2017 um 10:26 schrieb Pixel Ding:
>>>>>> From: pding <Pixel.Ding@amd.com>
>>>>>>
>>>>>> Only report fence for GFX ring. This can help checking MCBP feature.
>>>>>>
>>>>>> Signed-off-by: pding <Pixel.Ding@amd.com>
>>>>>> ---
>>>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 7 +++++++
>>>>>>     1 file changed, 7 insertions(+)
>>>>>>
>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
>>>>>> index 09d5a5c..2044758 100644
>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
>>>>>> @@ -645,6 +645,13 @@ static int amdgpu_debugfs_fence_info(struct seq_file *m, void *data)
>>>>>>     			   atomic_read(&ring->fence_drv.last_seq));
>>>>>>     		seq_printf(m, "Last emitted        0x%08x\n",
>>>>>>     			   ring->fence_drv.sync_seq);
>>>>>> +
>>>>>> +		if (ring->funcs->type != AMDGPU_RING_TYPE_GFX)
>>>>>> +			break;
>>>>> That should probably be "continue" instead of break, or otherwise you
>>>>> don't print the other fences any more.
>>>>>
>>>>>> +
>>>>>> +		seq_printf(m, "Last preempted      0x%08x\n",
>>>>>> +			   le32_to_cpu(*(ring->fence_drv.cpu_addr + 2)));
>>>>> Is the code to put the preemption fence there already upstream?
>>>>>
>>>>> If yes do we really do this like that for all supported generations?
>>>>>
>>>>> Regards,
>>>>> Christian.
>>>>>
>>>>>> +
>>>>>>     	}
>>>>>>     	return 0;
>>>>>>     }
>
>
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 2/4] drm/amdgpu: workaround for VM fault caused by SDMA set_wptr
       [not found]         ` <536fdd74-43d8-1aaf-ca7c-b3d519b8486c-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2017-10-13  9:21           ` Ding, Pixel
  0 siblings, 0 replies; 32+ messages in thread
From: Ding, Pixel @ 2017-10-13  9:21 UTC (permalink / raw)
  To: Koenig, Christian, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW; +Cc: Li, Bingley

Yes I tried smp_mb but it doesn’t help…
We will follow up this issue continuously until fix the root cause.
— 
Sincerely Yours,
Pixel







On 13/10/2017, 5:17 PM, "Christian König" <ckoenig.leichtzumerken@gmail.com> wrote:

>Am 13.10.2017 um 10:26 schrieb Pixel Ding:
>> From: pding <Pixel.Ding@amd.com>
>>
>> The polling memory was standalone in VRAM before, so the HDP flush
>> introduced latency that hides a VM fault issue. Now polling memory
>> leverages the WB in system memory and HDP flush is not required, the
>> VM fault at same page happens.
>>
>> Add delay back to workaround until the root cause is found.
>>
>> Tests: VP1 or launch 40 instances of glxinfo at the same time.
>>
>> Signed-off-by: pding <Pixel.Ding@amd.com>
>> ---
>>   drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c | 3 +++
>>   1 file changed, 3 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c b/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c
>> index b1de44f..5c4bbe1 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c
>> @@ -381,6 +381,9 @@ static void sdma_v3_0_ring_set_wptr(struct amdgpu_ring *ring)
>>   	if (ring->use_doorbell) {
>>   		/* XXX check if swapping is necessary on BE */
>>   		adev->wb.wb[ring->wptr_offs] = lower_32_bits(ring->wptr) << 2;
>> +		/* workaround: VM fault always happen at page 2046 */
>> +		if (amdgpu_sriov_vf(adev))
>> +			udelay(500);
>
>Have you tried using a memory barrier here?
>
>That looks like it will have massive impact on performance.
>
>Regards,
>Christian.
>
>>   		WDOORBELL32(ring->doorbell_index, lower_32_bits(ring->wptr) << 2);
>>   	} else {
>>   		int me = (ring == &ring->adev->sdma.instance[0].ring) ? 0 : 1;
>
>
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 3/4] drm/amdgpu: report preemption fence via amdgpu_fence_info
       [not found]                                 ` <BLUPR12MB04491225CA802A960F4C752784480-7LeqcoF/hwpTIQvHjXdJlwdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
@ 2017-10-13  9:24                                   ` Ding, Pixel
  0 siblings, 0 replies; 32+ messages in thread
From: Ding, Pixel @ 2017-10-13  9:24 UTC (permalink / raw)
  To: Liu, Monk, Koenig, Christian, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
  Cc: Li, Bingley

Yes I see the “drm/amdgpu: use 256 bit buffers for all wb allocations (v2)”.

Hi Christian, 

So it seems all good, right?

— 
Sincerely Yours,
Pixel







On 13/10/2017, 5:19 PM, "Liu, Monk" <Monk.Liu@amd.com> wrote:

>Pixel,
>
>On drm-next, we always allocate 8DW for all WB, you can check the wb_get routine on detail
>
>BR Monk
>
>-----Original Message-----
>From: Ding, Pixel 
>Sent: 2017年10月13日 17:19
>To: Koenig, Christian <Christian.Koenig@amd.com>; amd-gfx@lists.freedesktop.org; Liu, Monk <Monk.Liu@amd.com>
>Cc: Li, Bingley <Bingley.Li@amd.com>
>Subject: Re: [PATCH 3/4] drm/amdgpu: report preemption fence via amdgpu_fence_info
>
>OK I get it…
>when we use the fence_offs to submit fence to HW, in fact it’s a 8 DW fence not a 2 DW.
>
>The format is:
>Completed Fence      0x0 Fence written here if frame completed normally
>Preempted Fence      0x2 Bit set in CP_VMID_PREEMPT and preemption occurred
>Reset Fence          0x4 Bit is set in CP_VMID_RESET and reset occurred
>Preempted then Reset 0x6 Both preemption and reset occurred
>
>					
>				
>			
>		
>	
>Then I notice that the 8DW wb allocation patch is missed on staging driver.
>
>
>Hi Monk,
>
>Can you take a look? I thought the 8DW WB patch is already upstream.
>
>— 
>Sincerely Yours,
>Pixel
>
>
>
>
>
>
>
>On 13/10/2017, 5:10 PM, "Koenig, Christian" <Christian.Koenig@amd.com> wrote:
>
>>Hi Pixel,
>>
>>> My understanding is that CP will write seqno back to preempted fence offset when preemption occurred.
>>That is correct.
>>
>>But my question is why you want to print a different value here:
>>> +		seq_printf(m, "Last preempted      0x%08x\n",
>>> +			   le32_to_cpu(*(ring->fence_drv.cpu_addr + 2)));
>>
>>That code prints two dw after the address where the CPU writes the 
>>seqno. Where is the code which setups the CP to do this?
>>
>>As far as I can see that line should always print just zero (or some 
>>random number if the memory is actually not initialized to anything).
>>
>>Regards,
>>Christian.
>>
>>Am 13.10.2017 um 11:03 schrieb Ding, Pixel:
>>> My understanding is that CP will write seqno back to preempted fence offset when preemption occurred. Then if there is a value here we can generally know packet with which fence is preempted. Should driver handle any other thing for this?
>>> 					
>>> 				
>>> 			
>>> 		
>>> 	
>>>
>>>
>>> —
>>> Sincerely Yours,
>>> Pixel
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> On 13/10/2017, 4:51 PM, "Koenig, Christian" <Christian.Koenig@amd.com> wrote:
>>>
>>>>> Is it OK to limit this information only for SRIOV VF on Tonga and Vega whose format is known? It can help use to identify if MCBP is working correctly or not.
>>>> The question is where is this code for Tonga and Vega? I can't find a
>>>> reference to fence_offs in the GFX8 nor GFX9 code we have in
>>>> amd-staging-drm-next.
>>>>
>>>> And if it doesn't depend on the fence_offs in the ring printing it like
>>>> this is just pure speculation.
>>>>
>>>> Regards,
>>>> Christian.
>>>>
>>>> Am 13.10.2017 um 10:41 schrieb Ding, Pixel:
>>>>> Thanks Christian,
>>>>>
>>>>> I’m not sure if I get your point, but yes the preemption fence offset could be changed.
>>>>>
>>>>> Is it OK to limit this information only for SRIOV VF on Tonga and Vega whose format is known? It can help use to identify if MCBP is working correctly or not.
>>>>>
>>>>>
>>>>> —
>>>>> Sincerely Yours,
>>>>> Pixel
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On 13/10/2017, 4:34 PM, "Christian König" <ckoenig.leichtzumerken@gmail.com> wrote:
>>>>>
>>>>>> Am 13.10.2017 um 10:26 schrieb Pixel Ding:
>>>>>>> From: pding <Pixel.Ding@amd.com>
>>>>>>>
>>>>>>> Only report fence for GFX ring. This can help checking MCBP feature.
>>>>>>>
>>>>>>> Signed-off-by: pding <Pixel.Ding@amd.com>
>>>>>>> ---
>>>>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 7 +++++++
>>>>>>>     1 file changed, 7 insertions(+)
>>>>>>>
>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
>>>>>>> index 09d5a5c..2044758 100644
>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
>>>>>>> @@ -645,6 +645,13 @@ static int amdgpu_debugfs_fence_info(struct seq_file *m, void *data)
>>>>>>>     			   atomic_read(&ring->fence_drv.last_seq));
>>>>>>>     		seq_printf(m, "Last emitted        0x%08x\n",
>>>>>>>     			   ring->fence_drv.sync_seq);
>>>>>>> +
>>>>>>> +		if (ring->funcs->type != AMDGPU_RING_TYPE_GFX)
>>>>>>> +			break;
>>>>>> That should probably be "continue" instead of break, or otherwise you
>>>>>> don't print the other fences any more.
>>>>>>
>>>>>>> +
>>>>>>> +		seq_printf(m, "Last preempted      0x%08x\n",
>>>>>>> +			   le32_to_cpu(*(ring->fence_drv.cpu_addr + 2)));
>>>>>> Is the code to put the preemption fence there already upstream?
>>>>>>
>>>>>> If yes do we really do this like that for all supported generations?
>>>>>>
>>>>>> Regards,
>>>>>> Christian.
>>>>>>
>>>>>>> +
>>>>>>>     	}
>>>>>>>     	return 0;
>>>>>>>     }
>>
>>
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 32+ messages in thread

* RE: [PATCH 3/4] drm/amdgpu: report preemption fence via amdgpu_fence_info
       [not found]                             ` <FE29B287-93F7-427A-8A3E-8DEE50BC888C-5C7GfCeVMHo@public.gmane.org>
  2017-10-13  9:19                               ` Liu, Monk
@ 2017-10-13  9:25                               ` Liu, Monk
  1 sibling, 0 replies; 32+ messages in thread
From: Liu, Monk @ 2017-10-13  9:25 UTC (permalink / raw)
  To: Ding, Pixel, Koenig, Christian, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
  Cc: Li, Bingley

Yes, 

CP hw assume the fence is 64bit length, so there are 4 type of fence and need 256bits totally

The preempted fence is at offset 64bit, 

And I assume that we print all of those fence type when under SRIOV mode

BR Monk

-----Original Message-----
From: Ding, Pixel 
Sent: 2017年10月13日 17:19
To: Koenig, Christian <Christian.Koenig@amd.com>; amd-gfx@lists.freedesktop.org; Liu, Monk <Monk.Liu@amd.com>
Cc: Li, Bingley <Bingley.Li@amd.com>
Subject: Re: [PATCH 3/4] drm/amdgpu: report preemption fence via amdgpu_fence_info

OK I get it…
when we use the fence_offs to submit fence to HW, in fact it’s a 8 DW fence not a 2 DW.

The format is:
Completed Fence      0x0 Fence written here if frame completed normally
Preempted Fence      0x2 Bit set in CP_VMID_PREEMPT and preemption occurred
Reset Fence          0x4 Bit is set in CP_VMID_RESET and reset occurred
Preempted then Reset 0x6 Both preemption and reset occurred

					
				
			
		
	
Then I notice that the 8DW wb allocation patch is missed on staging driver.


Hi Monk,

Can you take a look? I thought the 8DW WB patch is already upstream.

— 
Sincerely Yours,
Pixel







On 13/10/2017, 5:10 PM, "Koenig, Christian" <Christian.Koenig@amd.com> wrote:

>Hi Pixel,
>
>> My understanding is that CP will write seqno back to preempted fence offset when preemption occurred.
>That is correct.
>
>But my question is why you want to print a different value here:
>> +		seq_printf(m, "Last preempted      0x%08x\n",
>> +			   le32_to_cpu(*(ring->fence_drv.cpu_addr + 2)));
>
>That code prints two dw after the address where the CPU writes the 
>seqno. Where is the code which setups the CP to do this?
>
>As far as I can see that line should always print just zero (or some 
>random number if the memory is actually not initialized to anything).
>
>Regards,
>Christian.
>
>Am 13.10.2017 um 11:03 schrieb Ding, Pixel:
>> My understanding is that CP will write seqno back to preempted fence offset when preemption occurred. Then if there is a value here we can generally know packet with which fence is preempted. Should driver handle any other thing for this?
>> 					
>> 				
>> 			
>> 		
>> 	
>>
>>
>> —
>> Sincerely Yours,
>> Pixel
>>
>>
>>
>>
>>
>>
>>
>> On 13/10/2017, 4:51 PM, "Koenig, Christian" <Christian.Koenig@amd.com> wrote:
>>
>>>> Is it OK to limit this information only for SRIOV VF on Tonga and Vega whose format is known? It can help use to identify if MCBP is working correctly or not.
>>> The question is where is this code for Tonga and Vega? I can't find a
>>> reference to fence_offs in the GFX8 nor GFX9 code we have in
>>> amd-staging-drm-next.
>>>
>>> And if it doesn't depend on the fence_offs in the ring printing it like
>>> this is just pure speculation.
>>>
>>> Regards,
>>> Christian.
>>>
>>> Am 13.10.2017 um 10:41 schrieb Ding, Pixel:
>>>> Thanks Christian,
>>>>
>>>> I’m not sure if I get your point, but yes the preemption fence offset could be changed.
>>>>
>>>> Is it OK to limit this information only for SRIOV VF on Tonga and Vega whose format is known? It can help use to identify if MCBP is working correctly or not.
>>>>
>>>>
>>>> —
>>>> Sincerely Yours,
>>>> Pixel
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On 13/10/2017, 4:34 PM, "Christian König" <ckoenig.leichtzumerken@gmail.com> wrote:
>>>>
>>>>> Am 13.10.2017 um 10:26 schrieb Pixel Ding:
>>>>>> From: pding <Pixel.Ding@amd.com>
>>>>>>
>>>>>> Only report fence for GFX ring. This can help checking MCBP feature.
>>>>>>
>>>>>> Signed-off-by: pding <Pixel.Ding@amd.com>
>>>>>> ---
>>>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 7 +++++++
>>>>>>     1 file changed, 7 insertions(+)
>>>>>>
>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
>>>>>> index 09d5a5c..2044758 100644
>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
>>>>>> @@ -645,6 +645,13 @@ static int amdgpu_debugfs_fence_info(struct seq_file *m, void *data)
>>>>>>     			   atomic_read(&ring->fence_drv.last_seq));
>>>>>>     		seq_printf(m, "Last emitted        0x%08x\n",
>>>>>>     			   ring->fence_drv.sync_seq);
>>>>>> +
>>>>>> +		if (ring->funcs->type != AMDGPU_RING_TYPE_GFX)
>>>>>> +			break;
>>>>> That should probably be "continue" instead of break, or otherwise you
>>>>> don't print the other fences any more.
>>>>>
>>>>>> +
>>>>>> +		seq_printf(m, "Last preempted      0x%08x\n",
>>>>>> +			   le32_to_cpu(*(ring->fence_drv.cpu_addr + 2)));
>>>>> Is the code to put the preemption fence there already upstream?
>>>>>
>>>>> If yes do we really do this like that for all supported generations?
>>>>>
>>>>> Regards,
>>>>> Christian.
>>>>>
>>>>>> +
>>>>>>     	}
>>>>>>     	return 0;
>>>>>>     }
>
>
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 32+ messages in thread

* RE: [PATCH 3/4] drm/amdgpu: report preemption fence via amdgpu_fence_info
       [not found]                         ` <75ae098d-234b-fe3a-dfea-18de99b2aa3c-5C7GfCeVMHo@public.gmane.org>
  2017-10-13  9:18                           ` Ding, Pixel
@ 2017-10-13  9:26                           ` Liu, Monk
  2017-10-13  9:28                           ` Liu, Monk
  2 siblings, 0 replies; 32+ messages in thread
From: Liu, Monk @ 2017-10-13  9:26 UTC (permalink / raw)
  To: Koenig, Christian, Ding, Pixel, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
  Cc: Li, Bingley

WB _init should clear this buffer in very early stage, so it should output 0 if for BARE-METAL or no preemption occurred 

-----Original Message-----
From: amd-gfx [mailto:amd-gfx-bounces@lists.freedesktop.org] On Behalf Of Christian K?nig
Sent: 2017年10月13日 17:10
To: Ding, Pixel <Pixel.Ding@amd.com>; amd-gfx@lists.freedesktop.org
Cc: Li, Bingley <Bingley.Li@amd.com>
Subject: Re: [PATCH 3/4] drm/amdgpu: report preemption fence via amdgpu_fence_info

Hi Pixel,

> My understanding is that CP will write seqno back to preempted fence offset when preemption occurred.
That is correct.

But my question is why you want to print a different value here:
> +		seq_printf(m, "Last preempted      0x%08x\n",
> +			   le32_to_cpu(*(ring->fence_drv.cpu_addr + 2)));

That code prints two dw after the address where the CPU writes the seqno. Where is the code which setups the CP to do this?

As far as I can see that line should always print just zero (or some random number if the memory is actually not initialized to anything).

Regards,
Christian.

Am 13.10.2017 um 11:03 schrieb Ding, Pixel:
> My understanding is that CP will write seqno back to preempted fence offset when preemption occurred. Then if there is a value here we can generally know packet with which fence is preempted. Should driver handle any other thing for this?
> 					
> 				
> 			
> 		
> 	
>
>
> —
> Sincerely Yours,
> Pixel
>
>
>
>
>
>
>
> On 13/10/2017, 4:51 PM, "Koenig, Christian" <Christian.Koenig@amd.com> wrote:
>
>>> Is it OK to limit this information only for SRIOV VF on Tonga and Vega whose format is known? It can help use to identify if MCBP is working correctly or not.
>> The question is where is this code for Tonga and Vega? I can't find a 
>> reference to fence_offs in the GFX8 nor GFX9 code we have in 
>> amd-staging-drm-next.
>>
>> And if it doesn't depend on the fence_offs in the ring printing it 
>> like this is just pure speculation.
>>
>> Regards,
>> Christian.
>>
>> Am 13.10.2017 um 10:41 schrieb Ding, Pixel:
>>> Thanks Christian,
>>>
>>> I’m not sure if I get your point, but yes the preemption fence offset could be changed.
>>>
>>> Is it OK to limit this information only for SRIOV VF on Tonga and Vega whose format is known? It can help use to identify if MCBP is working correctly or not.
>>>
>>>
>>> —
>>> Sincerely Yours,
>>> Pixel
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> On 13/10/2017, 4:34 PM, "Christian König" <ckoenig.leichtzumerken@gmail.com> wrote:
>>>
>>>> Am 13.10.2017 um 10:26 schrieb Pixel Ding:
>>>>> From: pding <Pixel.Ding@amd.com>
>>>>>
>>>>> Only report fence for GFX ring. This can help checking MCBP feature.
>>>>>
>>>>> Signed-off-by: pding <Pixel.Ding@amd.com>
>>>>> ---
>>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 7 +++++++
>>>>>     1 file changed, 7 insertions(+)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c 
>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
>>>>> index 09d5a5c..2044758 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
>>>>> @@ -645,6 +645,13 @@ static int amdgpu_debugfs_fence_info(struct seq_file *m, void *data)
>>>>>     			   atomic_read(&ring->fence_drv.last_seq));
>>>>>     		seq_printf(m, "Last emitted        0x%08x\n",
>>>>>     			   ring->fence_drv.sync_seq);
>>>>> +
>>>>> +		if (ring->funcs->type != AMDGPU_RING_TYPE_GFX)
>>>>> +			break;
>>>> That should probably be "continue" instead of break, or otherwise 
>>>> you don't print the other fences any more.
>>>>
>>>>> +
>>>>> +		seq_printf(m, "Last preempted      0x%08x\n",
>>>>> +			   le32_to_cpu(*(ring->fence_drv.cpu_addr + 2)));
>>>> Is the code to put the preemption fence there already upstream?
>>>>
>>>> If yes do we really do this like that for all supported generations?
>>>>
>>>> Regards,
>>>> Christian.
>>>>
>>>>> +
>>>>>     	}
>>>>>     	return 0;
>>>>>     }


_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 32+ messages in thread

* RE: [PATCH 3/4] drm/amdgpu: report preemption fence via amdgpu_fence_info
       [not found]                         ` <75ae098d-234b-fe3a-dfea-18de99b2aa3c-5C7GfCeVMHo@public.gmane.org>
  2017-10-13  9:18                           ` Ding, Pixel
  2017-10-13  9:26                           ` Liu, Monk
@ 2017-10-13  9:28                           ` Liu, Monk
       [not found]                             ` <BLUPR12MB0449C09F747B3C0294345DC484480-7LeqcoF/hwpTIQvHjXdJlwdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
  2 siblings, 1 reply; 32+ messages in thread
From: Liu, Monk @ 2017-10-13  9:28 UTC (permalink / raw)
  To: Koenig, Christian, Ding, Pixel, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
  Cc: Li, Bingley

@Ding, Pixel

+		seq_printf(m, "Last preempted      0x%08x\n",
+			   le32_to_cpu(*(ring->fence_drv.cpu_addr + 2)));

Please handle other type fence as well:

Preempted fence
Reset fence
Reset and preempted fence


BR Monk
-----Original Message-----
From: amd-gfx [mailto:amd-gfx-bounces@lists.freedesktop.org] On Behalf Of Christian K?nig
Sent: 2017年10月13日 17:10
To: Ding, Pixel <Pixel.Ding@amd.com>; amd-gfx@lists.freedesktop.org
Cc: Li, Bingley <Bingley.Li@amd.com>
Subject: Re: [PATCH 3/4] drm/amdgpu: report preemption fence via amdgpu_fence_info

Hi Pixel,

> My understanding is that CP will write seqno back to preempted fence offset when preemption occurred.
That is correct.

But my question is why you want to print a different value here:
> +		seq_printf(m, "Last preempted      0x%08x\n",
> +			   le32_to_cpu(*(ring->fence_drv.cpu_addr + 2)));

That code prints two dw after the address where the CPU writes the seqno. Where is the code which setups the CP to do this?

As far as I can see that line should always print just zero (or some random number if the memory is actually not initialized to anything).

Regards,
Christian.

Am 13.10.2017 um 11:03 schrieb Ding, Pixel:
> My understanding is that CP will write seqno back to preempted fence offset when preemption occurred. Then if there is a value here we can generally know packet with which fence is preempted. Should driver handle any other thing for this?
> 					
> 				
> 			
> 		
> 	
>
>
> —
> Sincerely Yours,
> Pixel
>
>
>
>
>
>
>
> On 13/10/2017, 4:51 PM, "Koenig, Christian" <Christian.Koenig@amd.com> wrote:
>
>>> Is it OK to limit this information only for SRIOV VF on Tonga and Vega whose format is known? It can help use to identify if MCBP is working correctly or not.
>> The question is where is this code for Tonga and Vega? I can't find a 
>> reference to fence_offs in the GFX8 nor GFX9 code we have in 
>> amd-staging-drm-next.
>>
>> And if it doesn't depend on the fence_offs in the ring printing it 
>> like this is just pure speculation.
>>
>> Regards,
>> Christian.
>>
>> Am 13.10.2017 um 10:41 schrieb Ding, Pixel:
>>> Thanks Christian,
>>>
>>> I’m not sure if I get your point, but yes the preemption fence offset could be changed.
>>>
>>> Is it OK to limit this information only for SRIOV VF on Tonga and Vega whose format is known? It can help use to identify if MCBP is working correctly or not.
>>>
>>>
>>> —
>>> Sincerely Yours,
>>> Pixel
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> On 13/10/2017, 4:34 PM, "Christian König" <ckoenig.leichtzumerken@gmail.com> wrote:
>>>
>>>> Am 13.10.2017 um 10:26 schrieb Pixel Ding:
>>>>> From: pding <Pixel.Ding@amd.com>
>>>>>
>>>>> Only report fence for GFX ring. This can help checking MCBP feature.
>>>>>
>>>>> Signed-off-by: pding <Pixel.Ding@amd.com>
>>>>> ---
>>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 7 +++++++
>>>>>     1 file changed, 7 insertions(+)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c 
>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
>>>>> index 09d5a5c..2044758 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
>>>>> @@ -645,6 +645,13 @@ static int amdgpu_debugfs_fence_info(struct seq_file *m, void *data)
>>>>>     			   atomic_read(&ring->fence_drv.last_seq));
>>>>>     		seq_printf(m, "Last emitted        0x%08x\n",
>>>>>     			   ring->fence_drv.sync_seq);
>>>>> +
>>>>> +		if (ring->funcs->type != AMDGPU_RING_TYPE_GFX)
>>>>> +			break;
>>>> That should probably be "continue" instead of break, or otherwise 
>>>> you don't print the other fences any more.
>>>>
>>>>> +
>>>>> +		seq_printf(m, "Last preempted      0x%08x\n",
>>>>> +			   le32_to_cpu(*(ring->fence_drv.cpu_addr + 2)));
>>>> Is the code to put the preemption fence there already upstream?
>>>>
>>>> If yes do we really do this like that for all supported generations?
>>>>
>>>> Regards,
>>>> Christian.
>>>>
>>>>> +
>>>>>     	}
>>>>>     	return 0;
>>>>>     }


_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 32+ messages in thread

* RE: [PATCH 4/4] drm/amdgpu: busywait KIQ register accessing
       [not found]     ` <1507883180-5626-5-git-send-email-Pixel.Ding-5C7GfCeVMHo@public.gmane.org>
  2017-10-13  9:16       ` Christian König
@ 2017-10-13  9:34       ` Liu, Monk
       [not found]         ` <BLUPR12MB0449383454665A6D4916951884480-7LeqcoF/hwpTIQvHjXdJlwdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
  1 sibling, 1 reply; 32+ messages in thread
From: Liu, Monk @ 2017-10-13  9:34 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW; +Cc: Ding, Pixel, Li, Bingley

Pixel

I don't think this will work well, my suggestion is you add a new function like:
amdgpu_wreg_kiq_busy(),

which will write registers through KIQ and use polling/busy wait, and the original amdgu_wreg_no_kiq() can be still there.

When you need to disable sleep like in IRQ CONTEXT, you can call wreg_kiq_busy() or wreg_no_kiq(),

But don't just change the original function 

BR Monk

-----Original Message-----
From: amd-gfx [mailto:amd-gfx-bounces@lists.freedesktop.org] On Behalf Of Pixel Ding
Sent: 2017年10月13日 16:26
To: amd-gfx@lists.freedesktop.org
Cc: Ding, Pixel <Pixel.Ding@amd.com>; Li, Bingley <Bingley.Li@amd.com>
Subject: [PATCH 4/4] drm/amdgpu: busywait KIQ register accessing

From: pding <Pixel.Ding@amd.com>

Register accessing is performed when IRQ is disabled. Never sleep in this function.

Known issue: dead sleep in many use cases of index/data registers.

Signed-off-by: pding <Pixel.Ding@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h        |  1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  8 ++---  drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c  |  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c    |  1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h   |  1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c   | 55 ++++++++++++++++++------------
 6 files changed, 40 insertions(+), 28 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index ca212ef..f9de756 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -886,6 +886,7 @@ struct amdgpu_kiq {
 	u64			eop_gpu_addr;
 	struct amdgpu_bo	*eop_obj;
 	struct mutex		ring_mutex;
+	spinlock_t              ring_lock;
 	struct amdgpu_ring	ring;
 	struct amdgpu_irq_src	irq;
 };
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index ab8f0d6..1197274 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -109,10 +109,8 @@ uint32_t amdgpu_mm_rreg(struct amdgpu_device *adev, uint32_t reg,  {
 	uint32_t ret;
 
-	if (!(acc_flags & AMDGPU_REGS_NO_KIQ) && amdgpu_sriov_runtime(adev)) {
-		BUG_ON(in_interrupt());
+	if (!(acc_flags & AMDGPU_REGS_NO_KIQ) && amdgpu_sriov_runtime(adev))
 		return amdgpu_virt_kiq_rreg(adev, reg);
-	}
 
 	if ((reg * 4) < adev->rmmio_size && !(acc_flags & AMDGPU_REGS_IDX))
 		ret = readl(((void __iomem *)adev->rmmio) + (reg * 4)); @@ -137,10 +135,8 @@ void amdgpu_mm_wreg(struct amdgpu_device *adev, uint32_t reg, uint32_t v,
 		adev->last_mm_index = v;
 	}
 
-	if (!(acc_flags & AMDGPU_REGS_NO_KIQ) && amdgpu_sriov_runtime(adev)) {
-		BUG_ON(in_interrupt());
+	if (!(acc_flags & AMDGPU_REGS_NO_KIQ) && amdgpu_sriov_runtime(adev))
 		return amdgpu_virt_kiq_wreg(adev, reg, v);
-	}
 
 	if ((reg * 4) < adev->rmmio_size && !(acc_flags & AMDGPU_REGS_IDX))
 		writel(v, ((void __iomem *)adev->rmmio) + (reg * 4)); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
index 2044758..c6c7bf3 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
@@ -109,7 +109,7 @@ static void amdgpu_fence_write(struct amdgpu_ring *ring, u32 seq)
  * Reads a fence value from memory (all asics).
  * Returns the value of the fence read from memory.
  */
-static u32 amdgpu_fence_read(struct amdgpu_ring *ring)
+u32 amdgpu_fence_read(struct amdgpu_ring *ring)
 {
 	struct amdgpu_fence_driver *drv = &ring->fence_drv;
 	u32 seq = 0;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
index 4f6c68f..46fa88c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
@@ -186,6 +186,7 @@ int amdgpu_gfx_kiq_init_ring(struct amdgpu_device *adev,
 	int r = 0;
 
 	mutex_init(&kiq->ring_mutex);
+	spin_lock_init(&kiq->ring_lock);
 
 	r = amdgpu_wb_get(adev, &adev->virt.reg_val_offs);
 	if (r)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
index af8e544..a4fa923 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
@@ -89,6 +89,7 @@ int amdgpu_fence_driver_start_ring(struct amdgpu_ring *ring,  void amdgpu_fence_driver_suspend(struct amdgpu_device *adev);  void amdgpu_fence_driver_resume(struct amdgpu_device *adev);  int amdgpu_fence_emit(struct amdgpu_ring *ring, struct dma_fence **fence);
+u32 amdgpu_fence_read(struct amdgpu_ring *ring);
 void amdgpu_fence_process(struct amdgpu_ring *ring);  int amdgpu_fence_wait_empty(struct amdgpu_ring *ring);  unsigned amdgpu_fence_count_emitted(struct amdgpu_ring *ring); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
index ab05121..9757df1 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
@@ -22,7 +22,7 @@
  */
 
 #include "amdgpu.h"
-#define MAX_KIQ_REG_WAIT	100000
+#define MAX_KIQ_REG_WAIT	100000000 /* in usecs */
 
 int amdgpu_allocate_static_csa(struct amdgpu_device *adev)  { @@ -113,28 +113,32 @@ void amdgpu_virt_init_setting(struct amdgpu_device *adev)
 
 uint32_t amdgpu_virt_kiq_rreg(struct amdgpu_device *adev, uint32_t reg)  {
-	signed long r;
-	uint32_t val;
-	struct dma_fence *f;
+	signed long timeout = usecs_to_jiffies(MAX_KIQ_REG_WAIT);
+	unsigned long flags;
+	uint32_t val, seq, wait_seq;
 	struct amdgpu_kiq *kiq = &adev->gfx.kiq;
 	struct amdgpu_ring *ring = &kiq->ring;
 
 	BUG_ON(!ring->funcs->emit_rreg);
 
-	mutex_lock(&kiq->ring_mutex);
+	spin_lock_irqsave(&kiq->ring_lock, flags);
 	amdgpu_ring_alloc(ring, 32);
 	amdgpu_ring_emit_rreg(ring, reg);
-	amdgpu_fence_emit(ring, &f);
+	wait_seq = ++ring->fence_drv.sync_seq;
+	amdgpu_ring_emit_fence(ring, ring->fence_drv.gpu_addr,
+			       wait_seq, AMDGPU_FENCE_FLAG_INT);
 	amdgpu_ring_commit(ring);
-	mutex_unlock(&kiq->ring_mutex);
-
-	r = dma_fence_wait_timeout(f, false, msecs_to_jiffies(MAX_KIQ_REG_WAIT));
-	dma_fence_put(f);
-	if (r < 1) {
-		DRM_ERROR("wait for kiq fence error: %ld.\n", r);
+	spin_unlock_irqrestore(&kiq->ring_lock, flags);
+	do {
+		seq = amdgpu_fence_read(ring);
+		udelay(5);
+		timeout -= 5;
+	} while (seq < wait_seq && timeout > 0);
+
+	if (timeout < 1) {
+		DRM_ERROR("wait for kiq fence error: %ld\n", timeout);
 		return ~0;
 	}
-
 	val = adev->wb.wb[adev->virt.reg_val_offs];
 
 	return val;
@@ -142,24 +146,33 @@ uint32_t amdgpu_virt_kiq_rreg(struct amdgpu_device *adev, uint32_t reg)
 
 void amdgpu_virt_kiq_wreg(struct amdgpu_device *adev, uint32_t reg, uint32_t v)  {
-	signed long r;
+	signed long timeout = usecs_to_jiffies(MAX_KIQ_REG_WAIT);
+	unsigned long flags;
+	uint32_t seq, wait_seq;
 	struct dma_fence *f;
 	struct amdgpu_kiq *kiq = &adev->gfx.kiq;
 	struct amdgpu_ring *ring = &kiq->ring;
 
 	BUG_ON(!ring->funcs->emit_wreg);
 
-	mutex_lock(&kiq->ring_mutex);
+	spin_lock_irqsave(&kiq->ring_lock, flags);
 	amdgpu_ring_alloc(ring, 32);
 	amdgpu_ring_emit_wreg(ring, reg, v);
-	amdgpu_fence_emit(ring, &f);
+	/* amdgpu_fence_emit(ring, &f); */
+	wait_seq = ++ring->fence_drv.sync_seq;
+	amdgpu_ring_emit_fence(ring, ring->fence_drv.gpu_addr,
+			       wait_seq, AMDGPU_FENCE_FLAG_INT);
 	amdgpu_ring_commit(ring);
-	mutex_unlock(&kiq->ring_mutex);
+	spin_unlock_irqrestore(&kiq->ring_lock, flags);
+
+	do {
+		seq = amdgpu_fence_read(ring);
+		udelay(5);
+		timeout -= 5;
+	} while (seq < wait_seq && timeout > 0);
 
-	r = dma_fence_wait_timeout(f, false, msecs_to_jiffies(MAX_KIQ_REG_WAIT));
-	if (r < 1)
-		DRM_ERROR("wait for kiq fence error: %ld.\n", r);
-	dma_fence_put(f);
+	if (timeout < 1)
+		DRM_ERROR("wait for kiq fence error: %ld\n", timeout);
 }
 
 /**
--
2.9.5

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* Re: [PATCH 4/4] drm/amdgpu: busywait KIQ register accessing
       [not found]         ` <d800f449-8155-34a5-b9c3-c17ba0114ca1-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2017-10-13  9:35           ` Ding, Pixel
       [not found]             ` <67E0D85B-6C27-4481-BFB9-6DF9E66A7E03-5C7GfCeVMHo@public.gmane.org>
  0 siblings, 1 reply; 32+ messages in thread
From: Ding, Pixel @ 2017-10-13  9:35 UTC (permalink / raw)
  To: Koenig, Christian, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW; +Cc: Li, Bingley

On 13/10/2017, 5:16 PM, "Christian König" <ckoenig.leichtzumerken@gmail.com> wrote:



>Am 13.10.2017 um 10:26 schrieb Pixel Ding:
>> From: pding <Pixel.Ding@amd.com>
>>
>> Register accessing is performed when IRQ is disabled. Never sleep in
>> this function.
>>
>> Known issue: dead sleep in many use cases of index/data registers.
>>
>> Signed-off-by: pding <Pixel.Ding@amd.com>
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu.h        |  1 +
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  8 ++---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c  |  2 +-
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c    |  1 +
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h   |  1 +
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c   | 55 ++++++++++++++++++------------
>>   6 files changed, 40 insertions(+), 28 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>> index ca212ef..f9de756 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>> @@ -886,6 +886,7 @@ struct amdgpu_kiq {
>>   	u64			eop_gpu_addr;
>>   	struct amdgpu_bo	*eop_obj;
>>   	struct mutex		ring_mutex;
>
>You can remove the ring_mutex if you don't use it any more.
[Pixel] KFD still use the mutex, I didn’t change it also to spin lock, should I?
>
>> +	spinlock_t              ring_lock;
>>   	struct amdgpu_ring	ring;
>>   	struct amdgpu_irq_src	irq;
>>   };
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> index ab8f0d6..1197274 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> @@ -109,10 +109,8 @@ uint32_t amdgpu_mm_rreg(struct amdgpu_device *adev, uint32_t reg,
>>   {
>>   	uint32_t ret;
>>   
>> -	if (!(acc_flags & AMDGPU_REGS_NO_KIQ) && amdgpu_sriov_runtime(adev)) {
>> -		BUG_ON(in_interrupt());
>> +	if (!(acc_flags & AMDGPU_REGS_NO_KIQ) && amdgpu_sriov_runtime(adev))
>>   		return amdgpu_virt_kiq_rreg(adev, reg);
>> -	}
>>   
>>   	if ((reg * 4) < adev->rmmio_size && !(acc_flags & AMDGPU_REGS_IDX))
>>   		ret = readl(((void __iomem *)adev->rmmio) + (reg * 4));
>> @@ -137,10 +135,8 @@ void amdgpu_mm_wreg(struct amdgpu_device *adev, uint32_t reg, uint32_t v,
>>   		adev->last_mm_index = v;
>>   	}
>>   
>> -	if (!(acc_flags & AMDGPU_REGS_NO_KIQ) && amdgpu_sriov_runtime(adev)) {
>> -		BUG_ON(in_interrupt());
>> +	if (!(acc_flags & AMDGPU_REGS_NO_KIQ) && amdgpu_sriov_runtime(adev))
>>   		return amdgpu_virt_kiq_wreg(adev, reg, v);
>> -	}
>>   
>>   	if ((reg * 4) < adev->rmmio_size && !(acc_flags & AMDGPU_REGS_IDX))
>>   		writel(v, ((void __iomem *)adev->rmmio) + (reg * 4));
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
>> index 2044758..c6c7bf3 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
>> @@ -109,7 +109,7 @@ static void amdgpu_fence_write(struct amdgpu_ring *ring, u32 seq)
>>    * Reads a fence value from memory (all asics).
>>    * Returns the value of the fence read from memory.
>>    */
>> -static u32 amdgpu_fence_read(struct amdgpu_ring *ring)
>> +u32 amdgpu_fence_read(struct amdgpu_ring *ring)
>>   {
>>   	struct amdgpu_fence_driver *drv = &ring->fence_drv;
>>   	u32 seq = 0;
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
>> index 4f6c68f..46fa88c 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
>> @@ -186,6 +186,7 @@ int amdgpu_gfx_kiq_init_ring(struct amdgpu_device *adev,
>>   	int r = 0;
>>   
>>   	mutex_init(&kiq->ring_mutex);
>> +	spin_lock_init(&kiq->ring_lock);
>>   
>>   	r = amdgpu_wb_get(adev, &adev->virt.reg_val_offs);
>>   	if (r)
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>> index af8e544..a4fa923 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>> @@ -89,6 +89,7 @@ int amdgpu_fence_driver_start_ring(struct amdgpu_ring *ring,
>>   void amdgpu_fence_driver_suspend(struct amdgpu_device *adev);
>>   void amdgpu_fence_driver_resume(struct amdgpu_device *adev);
>>   int amdgpu_fence_emit(struct amdgpu_ring *ring, struct dma_fence **fence);
>> +u32 amdgpu_fence_read(struct amdgpu_ring *ring);
>>   void amdgpu_fence_process(struct amdgpu_ring *ring);
>>   int amdgpu_fence_wait_empty(struct amdgpu_ring *ring);
>>   unsigned amdgpu_fence_count_emitted(struct amdgpu_ring *ring);
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
>> index ab05121..9757df1 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
>> @@ -22,7 +22,7 @@
>>    */
>>   
>>   #include "amdgpu.h"
>> -#define MAX_KIQ_REG_WAIT	100000
>> +#define MAX_KIQ_REG_WAIT	100000000 /* in usecs */
>You want to busy wait 100 seconds for an result?
>
>Forget it that is way to long you will certainly run into problems with 
>the NMI much earlier.
>[Pixel] thanks, would change it to 10s.
>
>>   
>>   int amdgpu_allocate_static_csa(struct amdgpu_device *adev)
>>   {
>> @@ -113,28 +113,32 @@ void amdgpu_virt_init_setting(struct amdgpu_device *adev)
>>   
>>   uint32_t amdgpu_virt_kiq_rreg(struct amdgpu_device *adev, uint32_t reg)
>>   {
>> -	signed long r;
>> -	uint32_t val;
>> -	struct dma_fence *f;
>> +	signed long timeout = usecs_to_jiffies(MAX_KIQ_REG_WAIT);
>> +	unsigned long flags;
>> +	uint32_t val, seq, wait_seq;
>>   	struct amdgpu_kiq *kiq = &adev->gfx.kiq;
>>   	struct amdgpu_ring *ring = &kiq->ring;
>>   
>>   	BUG_ON(!ring->funcs->emit_rreg);
>>   
>> -	mutex_lock(&kiq->ring_mutex);
>> +	spin_lock_irqsave(&kiq->ring_lock, flags);
>>   	amdgpu_ring_alloc(ring, 32);
>>   	amdgpu_ring_emit_rreg(ring, reg);
>> -	amdgpu_fence_emit(ring, &f);
>> +	wait_seq = ++ring->fence_drv.sync_seq;
>> +	amdgpu_ring_emit_fence(ring, ring->fence_drv.gpu_addr,
>> +			       wait_seq, AMDGPU_FENCE_FLAG_INT);
>
>Do you still need to send an interrupt when you busy wait? I don't think so.
[Pixel]I want to let the IRQ handler to update the last_seq in case of that KIQ is used by other functionality with fence. Is it necessary?
>
>>   	amdgpu_ring_commit(ring);
>> -	mutex_unlock(&kiq->ring_mutex);
>> -
>> -	r = dma_fence_wait_timeout(f, false, msecs_to_jiffies(MAX_KIQ_REG_WAIT));
>> -	dma_fence_put(f);
>> -	if (r < 1) {
>> -		DRM_ERROR("wait for kiq fence error: %ld.\n", r);
>> +	spin_unlock_irqrestore(&kiq->ring_lock, flags);
>> +	do {
>> +		seq = amdgpu_fence_read(ring);
>> +		udelay(5);
>> +		timeout -= 5;
>> +	} while (seq < wait_seq && timeout > 0);
>
>You need to correctly handle sequence wrap around here.
[Pixel] yes thanks, but I didn’t see similar case handling in normal fence handling codepath, Is there? So I assumed 32bit fence seq could not overflow for years.
>
>> +
>> +	if (timeout < 1) {
>> +		DRM_ERROR("wait for kiq fence error: %ld\n", timeout);
>>   		return ~0;
>>   	}
>> -
>>   	val = adev->wb.wb[adev->virt.reg_val_offs];
>>   
>>   	return val;
>> @@ -142,24 +146,33 @@ uint32_t amdgpu_virt_kiq_rreg(struct amdgpu_device *adev, uint32_t reg)
>>   
>>   void amdgpu_virt_kiq_wreg(struct amdgpu_device *adev, uint32_t reg, uint32_t v)
>>   {
>> -	signed long r;
>> +	signed long timeout = usecs_to_jiffies(MAX_KIQ_REG_WAIT);
>> +	unsigned long flags;
>> +	uint32_t seq, wait_seq;
>>   	struct dma_fence *f;
>>   	struct amdgpu_kiq *kiq = &adev->gfx.kiq;
>>   	struct amdgpu_ring *ring = &kiq->ring;
>>   
>>   	BUG_ON(!ring->funcs->emit_wreg);
>>   
>> -	mutex_lock(&kiq->ring_mutex);
>> +	spin_lock_irqsave(&kiq->ring_lock, flags);
>>   	amdgpu_ring_alloc(ring, 32);
>>   	amdgpu_ring_emit_wreg(ring, reg, v);
>> -	amdgpu_fence_emit(ring, &f);
>> +	/* amdgpu_fence_emit(ring, &f); */
>> +	wait_seq = ++ring->fence_drv.sync_seq;
>> +	amdgpu_ring_emit_fence(ring, ring->fence_drv.gpu_addr,
>> +			       wait_seq, AMDGPU_FENCE_FLAG_INT);
>>   	amdgpu_ring_commit(ring);
>> -	mutex_unlock(&kiq->ring_mutex);
>> +	spin_unlock_irqrestore(&kiq->ring_lock, flags);
>> +
>> +	do {
>> +		seq = amdgpu_fence_read(ring);
>> +		udelay(5);
>> +		timeout -= 5;
>> +	} while (seq < wait_seq && timeout > 0);
>
>Same here. It would probably be better to have a function for this in 
>amdgpu_fence.c
[Pixel] get it, will add a wrapper.
>
>Regards,
>Christian.
>
>>   
>> -	r = dma_fence_wait_timeout(f, false, msecs_to_jiffies(MAX_KIQ_REG_WAIT));
>> -	if (r < 1)
>> -		DRM_ERROR("wait for kiq fence error: %ld.\n", r);
>> -	dma_fence_put(f);
>> +	if (timeout < 1)
>> +		DRM_ERROR("wait for kiq fence error: %ld\n", timeout);
>>   }
>>   
>>   /**
>
>
>— 
>Sincerely Yours,
>Pixel
>
>
>
>
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 3/4] drm/amdgpu: report preemption fence via amdgpu_fence_info
       [not found]                             ` <BLUPR12MB0449C09F747B3C0294345DC484480-7LeqcoF/hwpTIQvHjXdJlwdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
@ 2017-10-13  9:36                               ` Ding, Pixel
       [not found]                                 ` <14A1E84C-7C37-4901-AAAC-7082CD0BECD2-5C7GfCeVMHo@public.gmane.org>
  0 siblings, 1 reply; 32+ messages in thread
From: Ding, Pixel @ 2017-10-13  9:36 UTC (permalink / raw)
  To: Liu, Monk, Koenig, Christian, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
  Cc: Li, Bingley

Get it.
— 
Sincerely Yours,
Pixel








On 13/10/2017, 5:28 PM, "Liu, Monk" <Monk.Liu@amd.com> wrote:

>@Ding, Pixel
>
>+		seq_printf(m, "Last preempted      0x%08x\n",
>+			   le32_to_cpu(*(ring->fence_drv.cpu_addr + 2)));
>
>Please handle other type fence as well:
>
>Preempted fence
>Reset fence
>Reset and preempted fence
>
>
>BR Monk
>-----Original Message-----
>From: amd-gfx [mailto:amd-gfx-bounces@lists.freedesktop.org] On Behalf Of Christian K?nig
>Sent: 2017年10月13日 17:10
>To: Ding, Pixel <Pixel.Ding@amd.com>; amd-gfx@lists.freedesktop.org
>Cc: Li, Bingley <Bingley.Li@amd.com>
>Subject: Re: [PATCH 3/4] drm/amdgpu: report preemption fence via amdgpu_fence_info
>
>Hi Pixel,
>
>> My understanding is that CP will write seqno back to preempted fence offset when preemption occurred.
>That is correct.
>
>But my question is why you want to print a different value here:
>> +		seq_printf(m, "Last preempted      0x%08x\n",
>> +			   le32_to_cpu(*(ring->fence_drv.cpu_addr + 2)));
>
>That code prints two dw after the address where the CPU writes the seqno. Where is the code which setups the CP to do this?
>
>As far as I can see that line should always print just zero (or some random number if the memory is actually not initialized to anything).
>
>Regards,
>Christian.
>
>Am 13.10.2017 um 11:03 schrieb Ding, Pixel:
>> My understanding is that CP will write seqno back to preempted fence offset when preemption occurred. Then if there is a value here we can generally know packet with which fence is preempted. Should driver handle any other thing for this?
>> 					
>> 				
>> 			
>> 		
>> 	
>>
>>
>> —
>> Sincerely Yours,
>> Pixel
>>
>>
>>
>>
>>
>>
>>
>> On 13/10/2017, 4:51 PM, "Koenig, Christian" <Christian.Koenig@amd.com> wrote:
>>
>>>> Is it OK to limit this information only for SRIOV VF on Tonga and Vega whose format is known? It can help use to identify if MCBP is working correctly or not.
>>> The question is where is this code for Tonga and Vega? I can't find a 
>>> reference to fence_offs in the GFX8 nor GFX9 code we have in 
>>> amd-staging-drm-next.
>>>
>>> And if it doesn't depend on the fence_offs in the ring printing it 
>>> like this is just pure speculation.
>>>
>>> Regards,
>>> Christian.
>>>
>>> Am 13.10.2017 um 10:41 schrieb Ding, Pixel:
>>>> Thanks Christian,
>>>>
>>>> I’m not sure if I get your point, but yes the preemption fence offset could be changed.
>>>>
>>>> Is it OK to limit this information only for SRIOV VF on Tonga and Vega whose format is known? It can help use to identify if MCBP is working correctly or not.
>>>>
>>>>
>>>> —
>>>> Sincerely Yours,
>>>> Pixel
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On 13/10/2017, 4:34 PM, "Christian König" <ckoenig.leichtzumerken@gmail.com> wrote:
>>>>
>>>>> Am 13.10.2017 um 10:26 schrieb Pixel Ding:
>>>>>> From: pding <Pixel.Ding@amd.com>
>>>>>>
>>>>>> Only report fence for GFX ring. This can help checking MCBP feature.
>>>>>>
>>>>>> Signed-off-by: pding <Pixel.Ding@amd.com>
>>>>>> ---
>>>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 7 +++++++
>>>>>>     1 file changed, 7 insertions(+)
>>>>>>
>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c 
>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
>>>>>> index 09d5a5c..2044758 100644
>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
>>>>>> @@ -645,6 +645,13 @@ static int amdgpu_debugfs_fence_info(struct seq_file *m, void *data)
>>>>>>     			   atomic_read(&ring->fence_drv.last_seq));
>>>>>>     		seq_printf(m, "Last emitted        0x%08x\n",
>>>>>>     			   ring->fence_drv.sync_seq);
>>>>>> +
>>>>>> +		if (ring->funcs->type != AMDGPU_RING_TYPE_GFX)
>>>>>> +			break;
>>>>> That should probably be "continue" instead of break, or otherwise 
>>>>> you don't print the other fences any more.
>>>>>
>>>>>> +
>>>>>> +		seq_printf(m, "Last preempted      0x%08x\n",
>>>>>> +			   le32_to_cpu(*(ring->fence_drv.cpu_addr + 2)));
>>>>> Is the code to put the preemption fence there already upstream?
>>>>>
>>>>> If yes do we really do this like that for all supported generations?
>>>>>
>>>>> Regards,
>>>>> Christian.
>>>>>
>>>>>> +
>>>>>>     	}
>>>>>>     	return 0;
>>>>>>     }
>
>
>_______________________________________________
>amd-gfx mailing list
>amd-gfx@lists.freedesktop.org
>https://lists.freedesktop.org/mailman/listinfo/amd-gfx
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 4/4] drm/amdgpu: busywait KIQ register accessing
       [not found]         ` <BLUPR12MB0449383454665A6D4916951884480-7LeqcoF/hwpTIQvHjXdJlwdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
@ 2017-10-13  9:38           ` Ding, Pixel
       [not found]             ` <ADFC5FB7-0904-434D-A950-A699B98B2921-5C7GfCeVMHo@public.gmane.org>
  0 siblings, 1 reply; 32+ messages in thread
From: Ding, Pixel @ 2017-10-13  9:38 UTC (permalink / raw)
  To: Liu, Monk, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW; +Cc: Li, Bingley

I’m afraid there’s racing issue if polling and IRQ use cases are mixed at the same time.

The original implementation is as your suggested. Is there any benefit to keep to sleepy version?
— 
Sincerely Yours,
Pixel








On 13/10/2017, 5:34 PM, "Liu, Monk" <Monk.Liu@amd.com> wrote:

>Pixel
>
>I don't think this will work well, my suggestion is you add a new function like:
>amdgpu_wreg_kiq_busy(),
>
>which will write registers through KIQ and use polling/busy wait, and the original amdgu_wreg_no_kiq() can be still there.
>
>When you need to disable sleep like in IRQ CONTEXT, you can call wreg_kiq_busy() or wreg_no_kiq(),
>
>But don't just change the original function 
>
>BR Monk
>
>-----Original Message-----
>From: amd-gfx [mailto:amd-gfx-bounces@lists.freedesktop.org] On Behalf Of Pixel Ding
>Sent: 2017年10月13日 16:26
>To: amd-gfx@lists.freedesktop.org
>Cc: Ding, Pixel <Pixel.Ding@amd.com>; Li, Bingley <Bingley.Li@amd.com>
>Subject: [PATCH 4/4] drm/amdgpu: busywait KIQ register accessing
>
>From: pding <Pixel.Ding@amd.com>
>
>Register accessing is performed when IRQ is disabled. Never sleep in this function.
>
>Known issue: dead sleep in many use cases of index/data registers.
>
>Signed-off-by: pding <Pixel.Ding@amd.com>
>---
> drivers/gpu/drm/amd/amdgpu/amdgpu.h        |  1 +
> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  8 ++---  drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c  |  2 +-
> drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c    |  1 +
> drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h   |  1 +
> drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c   | 55 ++++++++++++++++++------------
> 6 files changed, 40 insertions(+), 28 deletions(-)
>
>diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>index ca212ef..f9de756 100644
>--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>@@ -886,6 +886,7 @@ struct amdgpu_kiq {
> 	u64			eop_gpu_addr;
> 	struct amdgpu_bo	*eop_obj;
> 	struct mutex		ring_mutex;
>+	spinlock_t              ring_lock;
> 	struct amdgpu_ring	ring;
> 	struct amdgpu_irq_src	irq;
> };
>diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>index ab8f0d6..1197274 100644
>--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>@@ -109,10 +109,8 @@ uint32_t amdgpu_mm_rreg(struct amdgpu_device *adev, uint32_t reg,  {
> 	uint32_t ret;
> 
>-	if (!(acc_flags & AMDGPU_REGS_NO_KIQ) && amdgpu_sriov_runtime(adev)) {
>-		BUG_ON(in_interrupt());
>+	if (!(acc_flags & AMDGPU_REGS_NO_KIQ) && amdgpu_sriov_runtime(adev))
> 		return amdgpu_virt_kiq_rreg(adev, reg);
>-	}
> 
> 	if ((reg * 4) < adev->rmmio_size && !(acc_flags & AMDGPU_REGS_IDX))
> 		ret = readl(((void __iomem *)adev->rmmio) + (reg * 4)); @@ -137,10 +135,8 @@ void amdgpu_mm_wreg(struct amdgpu_device *adev, uint32_t reg, uint32_t v,
> 		adev->last_mm_index = v;
> 	}
> 
>-	if (!(acc_flags & AMDGPU_REGS_NO_KIQ) && amdgpu_sriov_runtime(adev)) {
>-		BUG_ON(in_interrupt());
>+	if (!(acc_flags & AMDGPU_REGS_NO_KIQ) && amdgpu_sriov_runtime(adev))
> 		return amdgpu_virt_kiq_wreg(adev, reg, v);
>-	}
> 
> 	if ((reg * 4) < adev->rmmio_size && !(acc_flags & AMDGPU_REGS_IDX))
> 		writel(v, ((void __iomem *)adev->rmmio) + (reg * 4)); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
>index 2044758..c6c7bf3 100644
>--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
>+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
>@@ -109,7 +109,7 @@ static void amdgpu_fence_write(struct amdgpu_ring *ring, u32 seq)
>  * Reads a fence value from memory (all asics).
>  * Returns the value of the fence read from memory.
>  */
>-static u32 amdgpu_fence_read(struct amdgpu_ring *ring)
>+u32 amdgpu_fence_read(struct amdgpu_ring *ring)
> {
> 	struct amdgpu_fence_driver *drv = &ring->fence_drv;
> 	u32 seq = 0;
>diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
>index 4f6c68f..46fa88c 100644
>--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
>+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
>@@ -186,6 +186,7 @@ int amdgpu_gfx_kiq_init_ring(struct amdgpu_device *adev,
> 	int r = 0;
> 
> 	mutex_init(&kiq->ring_mutex);
>+	spin_lock_init(&kiq->ring_lock);
> 
> 	r = amdgpu_wb_get(adev, &adev->virt.reg_val_offs);
> 	if (r)
>diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>index af8e544..a4fa923 100644
>--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>@@ -89,6 +89,7 @@ int amdgpu_fence_driver_start_ring(struct amdgpu_ring *ring,  void amdgpu_fence_driver_suspend(struct amdgpu_device *adev);  void amdgpu_fence_driver_resume(struct amdgpu_device *adev);  int amdgpu_fence_emit(struct amdgpu_ring *ring, struct dma_fence **fence);
>+u32 amdgpu_fence_read(struct amdgpu_ring *ring);
> void amdgpu_fence_process(struct amdgpu_ring *ring);  int amdgpu_fence_wait_empty(struct amdgpu_ring *ring);  unsigned amdgpu_fence_count_emitted(struct amdgpu_ring *ring); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
>index ab05121..9757df1 100644
>--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
>+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
>@@ -22,7 +22,7 @@
>  */
> 
> #include "amdgpu.h"
>-#define MAX_KIQ_REG_WAIT	100000
>+#define MAX_KIQ_REG_WAIT	100000000 /* in usecs */
> 
> int amdgpu_allocate_static_csa(struct amdgpu_device *adev)  { @@ -113,28 +113,32 @@ void amdgpu_virt_init_setting(struct amdgpu_device *adev)
> 
> uint32_t amdgpu_virt_kiq_rreg(struct amdgpu_device *adev, uint32_t reg)  {
>-	signed long r;
>-	uint32_t val;
>-	struct dma_fence *f;
>+	signed long timeout = usecs_to_jiffies(MAX_KIQ_REG_WAIT);
>+	unsigned long flags;
>+	uint32_t val, seq, wait_seq;
> 	struct amdgpu_kiq *kiq = &adev->gfx.kiq;
> 	struct amdgpu_ring *ring = &kiq->ring;
> 
> 	BUG_ON(!ring->funcs->emit_rreg);
> 
>-	mutex_lock(&kiq->ring_mutex);
>+	spin_lock_irqsave(&kiq->ring_lock, flags);
> 	amdgpu_ring_alloc(ring, 32);
> 	amdgpu_ring_emit_rreg(ring, reg);
>-	amdgpu_fence_emit(ring, &f);
>+	wait_seq = ++ring->fence_drv.sync_seq;
>+	amdgpu_ring_emit_fence(ring, ring->fence_drv.gpu_addr,
>+			       wait_seq, AMDGPU_FENCE_FLAG_INT);
> 	amdgpu_ring_commit(ring);
>-	mutex_unlock(&kiq->ring_mutex);
>-
>-	r = dma_fence_wait_timeout(f, false, msecs_to_jiffies(MAX_KIQ_REG_WAIT));
>-	dma_fence_put(f);
>-	if (r < 1) {
>-		DRM_ERROR("wait for kiq fence error: %ld.\n", r);
>+	spin_unlock_irqrestore(&kiq->ring_lock, flags);
>+	do {
>+		seq = amdgpu_fence_read(ring);
>+		udelay(5);
>+		timeout -= 5;
>+	} while (seq < wait_seq && timeout > 0);
>+
>+	if (timeout < 1) {
>+		DRM_ERROR("wait for kiq fence error: %ld\n", timeout);
> 		return ~0;
> 	}
>-
> 	val = adev->wb.wb[adev->virt.reg_val_offs];
> 
> 	return val;
>@@ -142,24 +146,33 @@ uint32_t amdgpu_virt_kiq_rreg(struct amdgpu_device *adev, uint32_t reg)
> 
> void amdgpu_virt_kiq_wreg(struct amdgpu_device *adev, uint32_t reg, uint32_t v)  {
>-	signed long r;
>+	signed long timeout = usecs_to_jiffies(MAX_KIQ_REG_WAIT);
>+	unsigned long flags;
>+	uint32_t seq, wait_seq;
> 	struct dma_fence *f;
> 	struct amdgpu_kiq *kiq = &adev->gfx.kiq;
> 	struct amdgpu_ring *ring = &kiq->ring;
> 
> 	BUG_ON(!ring->funcs->emit_wreg);
> 
>-	mutex_lock(&kiq->ring_mutex);
>+	spin_lock_irqsave(&kiq->ring_lock, flags);
> 	amdgpu_ring_alloc(ring, 32);
> 	amdgpu_ring_emit_wreg(ring, reg, v);
>-	amdgpu_fence_emit(ring, &f);
>+	/* amdgpu_fence_emit(ring, &f); */
>+	wait_seq = ++ring->fence_drv.sync_seq;
>+	amdgpu_ring_emit_fence(ring, ring->fence_drv.gpu_addr,
>+			       wait_seq, AMDGPU_FENCE_FLAG_INT);
> 	amdgpu_ring_commit(ring);
>-	mutex_unlock(&kiq->ring_mutex);
>+	spin_unlock_irqrestore(&kiq->ring_lock, flags);
>+
>+	do {
>+		seq = amdgpu_fence_read(ring);
>+		udelay(5);
>+		timeout -= 5;
>+	} while (seq < wait_seq && timeout > 0);
> 
>-	r = dma_fence_wait_timeout(f, false, msecs_to_jiffies(MAX_KIQ_REG_WAIT));
>-	if (r < 1)
>-		DRM_ERROR("wait for kiq fence error: %ld.\n", r);
>-	dma_fence_put(f);
>+	if (timeout < 1)
>+		DRM_ERROR("wait for kiq fence error: %ld\n", timeout);
> }
> 
> /**
>--
>2.9.5
>
>_______________________________________________
>amd-gfx mailing list
>amd-gfx@lists.freedesktop.org
>https://lists.freedesktop.org/mailman/listinfo/amd-gfx
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 32+ messages in thread

* RE: [PATCH 4/4] drm/amdgpu: busywait KIQ register accessing
       [not found]             ` <ADFC5FB7-0904-434D-A950-A699B98B2921-5C7GfCeVMHo@public.gmane.org>
@ 2017-10-13 10:02               ` Liu, Monk
  2017-10-13 10:04               ` Liu, Monk
  1 sibling, 0 replies; 32+ messages in thread
From: Liu, Monk @ 2017-10-13 10:02 UTC (permalink / raw)
  To: Ding, Pixel, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW; +Cc: Li, Bingley

The busy waiting method somehow make the performance worse considering 16 VF and vf0 need to wait 15*time_slice when it try to access registers

BR

-----Original Message-----
From: Ding, Pixel 
Sent: 2017年10月13日 17:39
To: Liu, Monk <Monk.Liu@amd.com>; amd-gfx@lists.freedesktop.org
Cc: Li, Bingley <Bingley.Li@amd.com>
Subject: Re: [PATCH 4/4] drm/amdgpu: busywait KIQ register accessing

I’m afraid there’s racing issue if polling and IRQ use cases are mixed at the same time.

The original implementation is as your suggested. Is there any benefit to keep to sleepy version?
— 
Sincerely Yours,
Pixel








On 13/10/2017, 5:34 PM, "Liu, Monk" <Monk.Liu@amd.com> wrote:

>Pixel
>
>I don't think this will work well, my suggestion is you add a new function like:
>amdgpu_wreg_kiq_busy(),
>
>which will write registers through KIQ and use polling/busy wait, and the original amdgu_wreg_no_kiq() can be still there.
>
>When you need to disable sleep like in IRQ CONTEXT, you can call wreg_kiq_busy() or wreg_no_kiq(),
>
>But don't just change the original function 
>
>BR Monk
>
>-----Original Message-----
>From: amd-gfx [mailto:amd-gfx-bounces@lists.freedesktop.org] On Behalf Of Pixel Ding
>Sent: 2017年10月13日 16:26
>To: amd-gfx@lists.freedesktop.org
>Cc: Ding, Pixel <Pixel.Ding@amd.com>; Li, Bingley <Bingley.Li@amd.com>
>Subject: [PATCH 4/4] drm/amdgpu: busywait KIQ register accessing
>
>From: pding <Pixel.Ding@amd.com>
>
>Register accessing is performed when IRQ is disabled. Never sleep in this function.
>
>Known issue: dead sleep in many use cases of index/data registers.
>
>Signed-off-by: pding <Pixel.Ding@amd.com>
>---
> drivers/gpu/drm/amd/amdgpu/amdgpu.h        |  1 +
> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  8 ++---  drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c  |  2 +-
> drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c    |  1 +
> drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h   |  1 +
> drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c   | 55 ++++++++++++++++++------------
> 6 files changed, 40 insertions(+), 28 deletions(-)
>
>diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>index ca212ef..f9de756 100644
>--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>@@ -886,6 +886,7 @@ struct amdgpu_kiq {
> 	u64			eop_gpu_addr;
> 	struct amdgpu_bo	*eop_obj;
> 	struct mutex		ring_mutex;
>+	spinlock_t              ring_lock;
> 	struct amdgpu_ring	ring;
> 	struct amdgpu_irq_src	irq;
> };
>diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>index ab8f0d6..1197274 100644
>--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>@@ -109,10 +109,8 @@ uint32_t amdgpu_mm_rreg(struct amdgpu_device *adev, uint32_t reg,  {
> 	uint32_t ret;
> 
>-	if (!(acc_flags & AMDGPU_REGS_NO_KIQ) && amdgpu_sriov_runtime(adev)) {
>-		BUG_ON(in_interrupt());
>+	if (!(acc_flags & AMDGPU_REGS_NO_KIQ) && amdgpu_sriov_runtime(adev))
> 		return amdgpu_virt_kiq_rreg(adev, reg);
>-	}
> 
> 	if ((reg * 4) < adev->rmmio_size && !(acc_flags & AMDGPU_REGS_IDX))
> 		ret = readl(((void __iomem *)adev->rmmio) + (reg * 4)); @@ -137,10 +135,8 @@ void amdgpu_mm_wreg(struct amdgpu_device *adev, uint32_t reg, uint32_t v,
> 		adev->last_mm_index = v;
> 	}
> 
>-	if (!(acc_flags & AMDGPU_REGS_NO_KIQ) && amdgpu_sriov_runtime(adev)) {
>-		BUG_ON(in_interrupt());
>+	if (!(acc_flags & AMDGPU_REGS_NO_KIQ) && amdgpu_sriov_runtime(adev))
> 		return amdgpu_virt_kiq_wreg(adev, reg, v);
>-	}
> 
> 	if ((reg * 4) < adev->rmmio_size && !(acc_flags & AMDGPU_REGS_IDX))
> 		writel(v, ((void __iomem *)adev->rmmio) + (reg * 4)); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
>index 2044758..c6c7bf3 100644
>--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
>+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
>@@ -109,7 +109,7 @@ static void amdgpu_fence_write(struct amdgpu_ring *ring, u32 seq)
>  * Reads a fence value from memory (all asics).
>  * Returns the value of the fence read from memory.
>  */
>-static u32 amdgpu_fence_read(struct amdgpu_ring *ring)
>+u32 amdgpu_fence_read(struct amdgpu_ring *ring)
> {
> 	struct amdgpu_fence_driver *drv = &ring->fence_drv;
> 	u32 seq = 0;
>diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
>index 4f6c68f..46fa88c 100644
>--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
>+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
>@@ -186,6 +186,7 @@ int amdgpu_gfx_kiq_init_ring(struct amdgpu_device *adev,
> 	int r = 0;
> 
> 	mutex_init(&kiq->ring_mutex);
>+	spin_lock_init(&kiq->ring_lock);
> 
> 	r = amdgpu_wb_get(adev, &adev->virt.reg_val_offs);
> 	if (r)
>diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>index af8e544..a4fa923 100644
>--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>@@ -89,6 +89,7 @@ int amdgpu_fence_driver_start_ring(struct amdgpu_ring *ring,  void amdgpu_fence_driver_suspend(struct amdgpu_device *adev);  void amdgpu_fence_driver_resume(struct amdgpu_device *adev);  int amdgpu_fence_emit(struct amdgpu_ring *ring, struct dma_fence **fence);
>+u32 amdgpu_fence_read(struct amdgpu_ring *ring);
> void amdgpu_fence_process(struct amdgpu_ring *ring);  int amdgpu_fence_wait_empty(struct amdgpu_ring *ring);  unsigned amdgpu_fence_count_emitted(struct amdgpu_ring *ring); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
>index ab05121..9757df1 100644
>--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
>+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
>@@ -22,7 +22,7 @@
>  */
> 
> #include "amdgpu.h"
>-#define MAX_KIQ_REG_WAIT	100000
>+#define MAX_KIQ_REG_WAIT	100000000 /* in usecs */
> 
> int amdgpu_allocate_static_csa(struct amdgpu_device *adev)  { @@ -113,28 +113,32 @@ void amdgpu_virt_init_setting(struct amdgpu_device *adev)
> 
> uint32_t amdgpu_virt_kiq_rreg(struct amdgpu_device *adev, uint32_t reg)  {
>-	signed long r;
>-	uint32_t val;
>-	struct dma_fence *f;
>+	signed long timeout = usecs_to_jiffies(MAX_KIQ_REG_WAIT);
>+	unsigned long flags;
>+	uint32_t val, seq, wait_seq;
> 	struct amdgpu_kiq *kiq = &adev->gfx.kiq;
> 	struct amdgpu_ring *ring = &kiq->ring;
> 
> 	BUG_ON(!ring->funcs->emit_rreg);
> 
>-	mutex_lock(&kiq->ring_mutex);
>+	spin_lock_irqsave(&kiq->ring_lock, flags);
> 	amdgpu_ring_alloc(ring, 32);
> 	amdgpu_ring_emit_rreg(ring, reg);
>-	amdgpu_fence_emit(ring, &f);
>+	wait_seq = ++ring->fence_drv.sync_seq;
>+	amdgpu_ring_emit_fence(ring, ring->fence_drv.gpu_addr,
>+			       wait_seq, AMDGPU_FENCE_FLAG_INT);
> 	amdgpu_ring_commit(ring);
>-	mutex_unlock(&kiq->ring_mutex);
>-
>-	r = dma_fence_wait_timeout(f, false, msecs_to_jiffies(MAX_KIQ_REG_WAIT));
>-	dma_fence_put(f);
>-	if (r < 1) {
>-		DRM_ERROR("wait for kiq fence error: %ld.\n", r);
>+	spin_unlock_irqrestore(&kiq->ring_lock, flags);
>+	do {
>+		seq = amdgpu_fence_read(ring);
>+		udelay(5);
>+		timeout -= 5;
>+	} while (seq < wait_seq && timeout > 0);
>+
>+	if (timeout < 1) {
>+		DRM_ERROR("wait for kiq fence error: %ld\n", timeout);
> 		return ~0;
> 	}
>-
> 	val = adev->wb.wb[adev->virt.reg_val_offs];
> 
> 	return val;
>@@ -142,24 +146,33 @@ uint32_t amdgpu_virt_kiq_rreg(struct amdgpu_device *adev, uint32_t reg)
> 
> void amdgpu_virt_kiq_wreg(struct amdgpu_device *adev, uint32_t reg, uint32_t v)  {
>-	signed long r;
>+	signed long timeout = usecs_to_jiffies(MAX_KIQ_REG_WAIT);
>+	unsigned long flags;
>+	uint32_t seq, wait_seq;
> 	struct dma_fence *f;
> 	struct amdgpu_kiq *kiq = &adev->gfx.kiq;
> 	struct amdgpu_ring *ring = &kiq->ring;
> 
> 	BUG_ON(!ring->funcs->emit_wreg);
> 
>-	mutex_lock(&kiq->ring_mutex);
>+	spin_lock_irqsave(&kiq->ring_lock, flags);
> 	amdgpu_ring_alloc(ring, 32);
> 	amdgpu_ring_emit_wreg(ring, reg, v);
>-	amdgpu_fence_emit(ring, &f);
>+	/* amdgpu_fence_emit(ring, &f); */
>+	wait_seq = ++ring->fence_drv.sync_seq;
>+	amdgpu_ring_emit_fence(ring, ring->fence_drv.gpu_addr,
>+			       wait_seq, AMDGPU_FENCE_FLAG_INT);
> 	amdgpu_ring_commit(ring);
>-	mutex_unlock(&kiq->ring_mutex);
>+	spin_unlock_irqrestore(&kiq->ring_lock, flags);
>+
>+	do {
>+		seq = amdgpu_fence_read(ring);
>+		udelay(5);
>+		timeout -= 5;
>+	} while (seq < wait_seq && timeout > 0);
> 
>-	r = dma_fence_wait_timeout(f, false, msecs_to_jiffies(MAX_KIQ_REG_WAIT));
>-	if (r < 1)
>-		DRM_ERROR("wait for kiq fence error: %ld.\n", r);
>-	dma_fence_put(f);
>+	if (timeout < 1)
>+		DRM_ERROR("wait for kiq fence error: %ld\n", timeout);
> }
> 
> /**
>--
>2.9.5
>
>_______________________________________________
>amd-gfx mailing list
>amd-gfx@lists.freedesktop.org
>https://lists.freedesktop.org/mailman/listinfo/amd-gfx
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 32+ messages in thread

* RE: [PATCH 4/4] drm/amdgpu: busywait KIQ register accessing
       [not found]             ` <ADFC5FB7-0904-434D-A950-A699B98B2921-5C7GfCeVMHo@public.gmane.org>
  2017-10-13 10:02               ` Liu, Monk
@ 2017-10-13 10:04               ` Liu, Monk
       [not found]                 ` <BLUPR12MB0449E6C72D64821D235E3F1184480-7LeqcoF/hwpTIQvHjXdJlwdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
  1 sibling, 1 reply; 32+ messages in thread
From: Liu, Monk @ 2017-10-13 10:04 UTC (permalink / raw)
  To: Ding, Pixel, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW; +Cc: Li, Bingley

Why there will be racing issue ?

Polling or sleep wait only have different result for the caller, not the job scheduled to KIQ 

The sleep waiting is synchroniz sleep, it just release CPU resource to other process/thread, so the order is guaranteed 

BR Monk

-----Original Message-----
From: Ding, Pixel 
Sent: 2017年10月13日 17:39
To: Liu, Monk <Monk.Liu@amd.com>; amd-gfx@lists.freedesktop.org
Cc: Li, Bingley <Bingley.Li@amd.com>
Subject: Re: [PATCH 4/4] drm/amdgpu: busywait KIQ register accessing

I’m afraid there’s racing issue if polling and IRQ use cases are mixed at the same time.

The original implementation is as your suggested. Is there any benefit to keep to sleepy version?
— 
Sincerely Yours,
Pixel








On 13/10/2017, 5:34 PM, "Liu, Monk" <Monk.Liu@amd.com> wrote:

>Pixel
>
>I don't think this will work well, my suggestion is you add a new function like:
>amdgpu_wreg_kiq_busy(),
>
>which will write registers through KIQ and use polling/busy wait, and the original amdgu_wreg_no_kiq() can be still there.
>
>When you need to disable sleep like in IRQ CONTEXT, you can call wreg_kiq_busy() or wreg_no_kiq(),
>
>But don't just change the original function 
>
>BR Monk
>
>-----Original Message-----
>From: amd-gfx [mailto:amd-gfx-bounces@lists.freedesktop.org] On Behalf Of Pixel Ding
>Sent: 2017年10月13日 16:26
>To: amd-gfx@lists.freedesktop.org
>Cc: Ding, Pixel <Pixel.Ding@amd.com>; Li, Bingley <Bingley.Li@amd.com>
>Subject: [PATCH 4/4] drm/amdgpu: busywait KIQ register accessing
>
>From: pding <Pixel.Ding@amd.com>
>
>Register accessing is performed when IRQ is disabled. Never sleep in this function.
>
>Known issue: dead sleep in many use cases of index/data registers.
>
>Signed-off-by: pding <Pixel.Ding@amd.com>
>---
> drivers/gpu/drm/amd/amdgpu/amdgpu.h        |  1 +
> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  8 ++---  drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c  |  2 +-
> drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c    |  1 +
> drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h   |  1 +
> drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c   | 55 ++++++++++++++++++------------
> 6 files changed, 40 insertions(+), 28 deletions(-)
>
>diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>index ca212ef..f9de756 100644
>--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>@@ -886,6 +886,7 @@ struct amdgpu_kiq {
> 	u64			eop_gpu_addr;
> 	struct amdgpu_bo	*eop_obj;
> 	struct mutex		ring_mutex;
>+	spinlock_t              ring_lock;
> 	struct amdgpu_ring	ring;
> 	struct amdgpu_irq_src	irq;
> };
>diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>index ab8f0d6..1197274 100644
>--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>@@ -109,10 +109,8 @@ uint32_t amdgpu_mm_rreg(struct amdgpu_device *adev, uint32_t reg,  {
> 	uint32_t ret;
> 
>-	if (!(acc_flags & AMDGPU_REGS_NO_KIQ) && amdgpu_sriov_runtime(adev)) {
>-		BUG_ON(in_interrupt());
>+	if (!(acc_flags & AMDGPU_REGS_NO_KIQ) && amdgpu_sriov_runtime(adev))
> 		return amdgpu_virt_kiq_rreg(adev, reg);
>-	}
> 
> 	if ((reg * 4) < adev->rmmio_size && !(acc_flags & AMDGPU_REGS_IDX))
> 		ret = readl(((void __iomem *)adev->rmmio) + (reg * 4)); @@ -137,10 +135,8 @@ void amdgpu_mm_wreg(struct amdgpu_device *adev, uint32_t reg, uint32_t v,
> 		adev->last_mm_index = v;
> 	}
> 
>-	if (!(acc_flags & AMDGPU_REGS_NO_KIQ) && amdgpu_sriov_runtime(adev)) {
>-		BUG_ON(in_interrupt());
>+	if (!(acc_flags & AMDGPU_REGS_NO_KIQ) && amdgpu_sriov_runtime(adev))
> 		return amdgpu_virt_kiq_wreg(adev, reg, v);
>-	}
> 
> 	if ((reg * 4) < adev->rmmio_size && !(acc_flags & AMDGPU_REGS_IDX))
> 		writel(v, ((void __iomem *)adev->rmmio) + (reg * 4)); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
>index 2044758..c6c7bf3 100644
>--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
>+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
>@@ -109,7 +109,7 @@ static void amdgpu_fence_write(struct amdgpu_ring *ring, u32 seq)
>  * Reads a fence value from memory (all asics).
>  * Returns the value of the fence read from memory.
>  */
>-static u32 amdgpu_fence_read(struct amdgpu_ring *ring)
>+u32 amdgpu_fence_read(struct amdgpu_ring *ring)
> {
> 	struct amdgpu_fence_driver *drv = &ring->fence_drv;
> 	u32 seq = 0;
>diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
>index 4f6c68f..46fa88c 100644
>--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
>+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
>@@ -186,6 +186,7 @@ int amdgpu_gfx_kiq_init_ring(struct amdgpu_device *adev,
> 	int r = 0;
> 
> 	mutex_init(&kiq->ring_mutex);
>+	spin_lock_init(&kiq->ring_lock);
> 
> 	r = amdgpu_wb_get(adev, &adev->virt.reg_val_offs);
> 	if (r)
>diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>index af8e544..a4fa923 100644
>--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>@@ -89,6 +89,7 @@ int amdgpu_fence_driver_start_ring(struct amdgpu_ring *ring,  void amdgpu_fence_driver_suspend(struct amdgpu_device *adev);  void amdgpu_fence_driver_resume(struct amdgpu_device *adev);  int amdgpu_fence_emit(struct amdgpu_ring *ring, struct dma_fence **fence);
>+u32 amdgpu_fence_read(struct amdgpu_ring *ring);
> void amdgpu_fence_process(struct amdgpu_ring *ring);  int amdgpu_fence_wait_empty(struct amdgpu_ring *ring);  unsigned amdgpu_fence_count_emitted(struct amdgpu_ring *ring); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
>index ab05121..9757df1 100644
>--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
>+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
>@@ -22,7 +22,7 @@
>  */
> 
> #include "amdgpu.h"
>-#define MAX_KIQ_REG_WAIT	100000
>+#define MAX_KIQ_REG_WAIT	100000000 /* in usecs */
> 
> int amdgpu_allocate_static_csa(struct amdgpu_device *adev)  { @@ -113,28 +113,32 @@ void amdgpu_virt_init_setting(struct amdgpu_device *adev)
> 
> uint32_t amdgpu_virt_kiq_rreg(struct amdgpu_device *adev, uint32_t reg)  {
>-	signed long r;
>-	uint32_t val;
>-	struct dma_fence *f;
>+	signed long timeout = usecs_to_jiffies(MAX_KIQ_REG_WAIT);
>+	unsigned long flags;
>+	uint32_t val, seq, wait_seq;
> 	struct amdgpu_kiq *kiq = &adev->gfx.kiq;
> 	struct amdgpu_ring *ring = &kiq->ring;
> 
> 	BUG_ON(!ring->funcs->emit_rreg);
> 
>-	mutex_lock(&kiq->ring_mutex);
>+	spin_lock_irqsave(&kiq->ring_lock, flags);
> 	amdgpu_ring_alloc(ring, 32);
> 	amdgpu_ring_emit_rreg(ring, reg);
>-	amdgpu_fence_emit(ring, &f);
>+	wait_seq = ++ring->fence_drv.sync_seq;
>+	amdgpu_ring_emit_fence(ring, ring->fence_drv.gpu_addr,
>+			       wait_seq, AMDGPU_FENCE_FLAG_INT);
> 	amdgpu_ring_commit(ring);
>-	mutex_unlock(&kiq->ring_mutex);
>-
>-	r = dma_fence_wait_timeout(f, false, msecs_to_jiffies(MAX_KIQ_REG_WAIT));
>-	dma_fence_put(f);
>-	if (r < 1) {
>-		DRM_ERROR("wait for kiq fence error: %ld.\n", r);
>+	spin_unlock_irqrestore(&kiq->ring_lock, flags);
>+	do {
>+		seq = amdgpu_fence_read(ring);
>+		udelay(5);
>+		timeout -= 5;
>+	} while (seq < wait_seq && timeout > 0);
>+
>+	if (timeout < 1) {
>+		DRM_ERROR("wait for kiq fence error: %ld\n", timeout);
> 		return ~0;
> 	}
>-
> 	val = adev->wb.wb[adev->virt.reg_val_offs];
> 
> 	return val;
>@@ -142,24 +146,33 @@ uint32_t amdgpu_virt_kiq_rreg(struct amdgpu_device *adev, uint32_t reg)
> 
> void amdgpu_virt_kiq_wreg(struct amdgpu_device *adev, uint32_t reg, uint32_t v)  {
>-	signed long r;
>+	signed long timeout = usecs_to_jiffies(MAX_KIQ_REG_WAIT);
>+	unsigned long flags;
>+	uint32_t seq, wait_seq;
> 	struct dma_fence *f;
> 	struct amdgpu_kiq *kiq = &adev->gfx.kiq;
> 	struct amdgpu_ring *ring = &kiq->ring;
> 
> 	BUG_ON(!ring->funcs->emit_wreg);
> 
>-	mutex_lock(&kiq->ring_mutex);
>+	spin_lock_irqsave(&kiq->ring_lock, flags);
> 	amdgpu_ring_alloc(ring, 32);
> 	amdgpu_ring_emit_wreg(ring, reg, v);
>-	amdgpu_fence_emit(ring, &f);
>+	/* amdgpu_fence_emit(ring, &f); */
>+	wait_seq = ++ring->fence_drv.sync_seq;
>+	amdgpu_ring_emit_fence(ring, ring->fence_drv.gpu_addr,
>+			       wait_seq, AMDGPU_FENCE_FLAG_INT);
> 	amdgpu_ring_commit(ring);
>-	mutex_unlock(&kiq->ring_mutex);
>+	spin_unlock_irqrestore(&kiq->ring_lock, flags);
>+
>+	do {
>+		seq = amdgpu_fence_read(ring);
>+		udelay(5);
>+		timeout -= 5;
>+	} while (seq < wait_seq && timeout > 0);
> 
>-	r = dma_fence_wait_timeout(f, false, msecs_to_jiffies(MAX_KIQ_REG_WAIT));
>-	if (r < 1)
>-		DRM_ERROR("wait for kiq fence error: %ld.\n", r);
>-	dma_fence_put(f);
>+	if (timeout < 1)
>+		DRM_ERROR("wait for kiq fence error: %ld\n", timeout);
> }
> 
> /**
>--
>2.9.5
>
>_______________________________________________
>amd-gfx mailing list
>amd-gfx@lists.freedesktop.org
>https://lists.freedesktop.org/mailman/listinfo/amd-gfx
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 3/4] drm/amdgpu: report preemption fence via amdgpu_fence_info
       [not found]                                 ` <14A1E84C-7C37-4901-AAAC-7082CD0BECD2-5C7GfCeVMHo@public.gmane.org>
@ 2017-10-13 10:14                                   ` Christian König
       [not found]                                     ` <930af7f8-9684-8dc6-cdeb-e5ffff5f811d-5C7GfCeVMHo@public.gmane.org>
  0 siblings, 1 reply; 32+ messages in thread
From: Christian König @ 2017-10-13 10:14 UTC (permalink / raw)
  To: Ding, Pixel, Liu, Monk, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
  Cc: Li, Bingley

Sounds logical to me as well.

Please also add a document describing what the CP does here.

BTW: Is that limited to the GFX pipeline or do the Compute pipes the same?

Regards,
Christian.

Am 13.10.2017 um 11:36 schrieb Ding, Pixel:
> Get it.
> —
> Sincerely Yours,
> Pixel
>
>
>
>
>
>
>
>
> On 13/10/2017, 5:28 PM, "Liu, Monk" <Monk.Liu@amd.com> wrote:
>
>> @Ding, Pixel
>>
>> +		seq_printf(m, "Last preempted      0x%08x\n",
>> +			   le32_to_cpu(*(ring->fence_drv.cpu_addr + 2)));
>>
>> Please handle other type fence as well:
>>
>> Preempted fence
>> Reset fence
>> Reset and preempted fence
>>
>>
>> BR Monk
>> -----Original Message-----
>> From: amd-gfx [mailto:amd-gfx-bounces@lists.freedesktop.org] On Behalf Of Christian K?nig
>> Sent: 2017年10月13日 17:10
>> To: Ding, Pixel <Pixel.Ding@amd.com>; amd-gfx@lists.freedesktop.org
>> Cc: Li, Bingley <Bingley.Li@amd.com>
>> Subject: Re: [PATCH 3/4] drm/amdgpu: report preemption fence via amdgpu_fence_info
>>
>> Hi Pixel,
>>
>>> My understanding is that CP will write seqno back to preempted fence offset when preemption occurred.
>> That is correct.
>>
>> But my question is why you want to print a different value here:
>>> +		seq_printf(m, "Last preempted      0x%08x\n",
>>> +			   le32_to_cpu(*(ring->fence_drv.cpu_addr + 2)));
>> That code prints two dw after the address where the CPU writes the seqno. Where is the code which setups the CP to do this?
>>
>> As far as I can see that line should always print just zero (or some random number if the memory is actually not initialized to anything).
>>
>> Regards,
>> Christian.
>>
>> Am 13.10.2017 um 11:03 schrieb Ding, Pixel:
>>> My understanding is that CP will write seqno back to preempted fence offset when preemption occurred. Then if there is a value here we can generally know packet with which fence is preempted. Should driver handle any other thing for this?
>>> 					
>>> 				
>>> 			
>>> 		
>>> 	
>>>
>>>
>>> —
>>> Sincerely Yours,
>>> Pixel
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> On 13/10/2017, 4:51 PM, "Koenig, Christian" <Christian.Koenig@amd.com> wrote:
>>>
>>>>> Is it OK to limit this information only for SRIOV VF on Tonga and Vega whose format is known? It can help use to identify if MCBP is working correctly or not.
>>>> The question is where is this code for Tonga and Vega? I can't find a
>>>> reference to fence_offs in the GFX8 nor GFX9 code we have in
>>>> amd-staging-drm-next.
>>>>
>>>> And if it doesn't depend on the fence_offs in the ring printing it
>>>> like this is just pure speculation.
>>>>
>>>> Regards,
>>>> Christian.
>>>>
>>>> Am 13.10.2017 um 10:41 schrieb Ding, Pixel:
>>>>> Thanks Christian,
>>>>>
>>>>> I’m not sure if I get your point, but yes the preemption fence offset could be changed.
>>>>>
>>>>> Is it OK to limit this information only for SRIOV VF on Tonga and Vega whose format is known? It can help use to identify if MCBP is working correctly or not.
>>>>>
>>>>>
>>>>> —
>>>>> Sincerely Yours,
>>>>> Pixel
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On 13/10/2017, 4:34 PM, "Christian König" <ckoenig.leichtzumerken@gmail.com> wrote:
>>>>>
>>>>>> Am 13.10.2017 um 10:26 schrieb Pixel Ding:
>>>>>>> From: pding <Pixel.Ding@amd.com>
>>>>>>>
>>>>>>> Only report fence for GFX ring. This can help checking MCBP feature.
>>>>>>>
>>>>>>> Signed-off-by: pding <Pixel.Ding@amd.com>
>>>>>>> ---
>>>>>>>      drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 7 +++++++
>>>>>>>      1 file changed, 7 insertions(+)
>>>>>>>
>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
>>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
>>>>>>> index 09d5a5c..2044758 100644
>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
>>>>>>> @@ -645,6 +645,13 @@ static int amdgpu_debugfs_fence_info(struct seq_file *m, void *data)
>>>>>>>      			   atomic_read(&ring->fence_drv.last_seq));
>>>>>>>      		seq_printf(m, "Last emitted        0x%08x\n",
>>>>>>>      			   ring->fence_drv.sync_seq);
>>>>>>> +
>>>>>>> +		if (ring->funcs->type != AMDGPU_RING_TYPE_GFX)
>>>>>>> +			break;
>>>>>> That should probably be "continue" instead of break, or otherwise
>>>>>> you don't print the other fences any more.
>>>>>>
>>>>>>> +
>>>>>>> +		seq_printf(m, "Last preempted      0x%08x\n",
>>>>>>> +			   le32_to_cpu(*(ring->fence_drv.cpu_addr + 2)));
>>>>>> Is the code to put the preemption fence there already upstream?
>>>>>>
>>>>>> If yes do we really do this like that for all supported generations?
>>>>>>
>>>>>> Regards,
>>>>>> Christian.
>>>>>>
>>>>>>> +
>>>>>>>      	}
>>>>>>>      	return 0;
>>>>>>>      }
>>
>> _______________________________________________
>> amd-gfx mailing list
>> amd-gfx@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx


_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 4/4] drm/amdgpu: busywait KIQ register accessing
       [not found]             ` <67E0D85B-6C27-4481-BFB9-6DF9E66A7E03-5C7GfCeVMHo@public.gmane.org>
@ 2017-10-13 10:21               ` Christian König
  0 siblings, 0 replies; 32+ messages in thread
From: Christian König @ 2017-10-13 10:21 UTC (permalink / raw)
  To: Ding, Pixel, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW; +Cc: Li, Bingley

Am 13.10.2017 um 11:35 schrieb Ding, Pixel:
> On 13/10/2017, 5:16 PM, "Christian König" <ckoenig.leichtzumerken@gmail.com> wrote:
>
>
>
>> Am 13.10.2017 um 10:26 schrieb Pixel Ding:
>>> From: pding <Pixel.Ding@amd.com>
>>>
>>> Register accessing is performed when IRQ is disabled. Never sleep in
>>> this function.
>>>
>>> Known issue: dead sleep in many use cases of index/data registers.
>>>
>>> Signed-off-by: pding <Pixel.Ding@amd.com>
>>> ---
>>>    drivers/gpu/drm/amd/amdgpu/amdgpu.h        |  1 +
>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  8 ++---
>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c  |  2 +-
>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c    |  1 +
>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h   |  1 +
>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c   | 55 ++++++++++++++++++------------
>>>    6 files changed, 40 insertions(+), 28 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>> index ca212ef..f9de756 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>> @@ -886,6 +886,7 @@ struct amdgpu_kiq {
>>>    	u64			eop_gpu_addr;
>>>    	struct amdgpu_bo	*eop_obj;
>>>    	struct mutex		ring_mutex;
>> You can remove the ring_mutex if you don't use it any more.
> [Pixel] KFD still use the mutex, I didn’t change it also to spin lock, should I?

Yes, when they use the mutex as well we need to change that to.

Otherwise we run into bad trouble when both try to access the same ring.

BTW: Do they also use the fence? That would be rather problematic.

>>> +	spinlock_t              ring_lock;
>>>    	struct amdgpu_ring	ring;
>>>    	struct amdgpu_irq_src	irq;
>>>    };
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>> index ab8f0d6..1197274 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>> @@ -109,10 +109,8 @@ uint32_t amdgpu_mm_rreg(struct amdgpu_device *adev, uint32_t reg,
>>>    {
>>>    	uint32_t ret;
>>>    
>>> -	if (!(acc_flags & AMDGPU_REGS_NO_KIQ) && amdgpu_sriov_runtime(adev)) {
>>> -		BUG_ON(in_interrupt());
>>> +	if (!(acc_flags & AMDGPU_REGS_NO_KIQ) && amdgpu_sriov_runtime(adev))
>>>    		return amdgpu_virt_kiq_rreg(adev, reg);
>>> -	}
>>>    
>>>    	if ((reg * 4) < adev->rmmio_size && !(acc_flags & AMDGPU_REGS_IDX))
>>>    		ret = readl(((void __iomem *)adev->rmmio) + (reg * 4));
>>> @@ -137,10 +135,8 @@ void amdgpu_mm_wreg(struct amdgpu_device *adev, uint32_t reg, uint32_t v,
>>>    		adev->last_mm_index = v;
>>>    	}
>>>    
>>> -	if (!(acc_flags & AMDGPU_REGS_NO_KIQ) && amdgpu_sriov_runtime(adev)) {
>>> -		BUG_ON(in_interrupt());
>>> +	if (!(acc_flags & AMDGPU_REGS_NO_KIQ) && amdgpu_sriov_runtime(adev))
>>>    		return amdgpu_virt_kiq_wreg(adev, reg, v);
>>> -	}
>>>    
>>>    	if ((reg * 4) < adev->rmmio_size && !(acc_flags & AMDGPU_REGS_IDX))
>>>    		writel(v, ((void __iomem *)adev->rmmio) + (reg * 4));
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
>>> index 2044758..c6c7bf3 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
>>> @@ -109,7 +109,7 @@ static void amdgpu_fence_write(struct amdgpu_ring *ring, u32 seq)
>>>     * Reads a fence value from memory (all asics).
>>>     * Returns the value of the fence read from memory.
>>>     */
>>> -static u32 amdgpu_fence_read(struct amdgpu_ring *ring)
>>> +u32 amdgpu_fence_read(struct amdgpu_ring *ring)
>>>    {
>>>    	struct amdgpu_fence_driver *drv = &ring->fence_drv;
>>>    	u32 seq = 0;
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
>>> index 4f6c68f..46fa88c 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
>>> @@ -186,6 +186,7 @@ int amdgpu_gfx_kiq_init_ring(struct amdgpu_device *adev,
>>>    	int r = 0;
>>>    
>>>    	mutex_init(&kiq->ring_mutex);
>>> +	spin_lock_init(&kiq->ring_lock);
>>>    
>>>    	r = amdgpu_wb_get(adev, &adev->virt.reg_val_offs);
>>>    	if (r)
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>>> index af8e544..a4fa923 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>>> @@ -89,6 +89,7 @@ int amdgpu_fence_driver_start_ring(struct amdgpu_ring *ring,
>>>    void amdgpu_fence_driver_suspend(struct amdgpu_device *adev);
>>>    void amdgpu_fence_driver_resume(struct amdgpu_device *adev);
>>>    int amdgpu_fence_emit(struct amdgpu_ring *ring, struct dma_fence **fence);
>>> +u32 amdgpu_fence_read(struct amdgpu_ring *ring);
>>>    void amdgpu_fence_process(struct amdgpu_ring *ring);
>>>    int amdgpu_fence_wait_empty(struct amdgpu_ring *ring);
>>>    unsigned amdgpu_fence_count_emitted(struct amdgpu_ring *ring);
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
>>> index ab05121..9757df1 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
>>> @@ -22,7 +22,7 @@
>>>     */
>>>    
>>>    #include "amdgpu.h"
>>> -#define MAX_KIQ_REG_WAIT	100000
>>> +#define MAX_KIQ_REG_WAIT	100000000 /* in usecs */
>> You want to busy wait 100 seconds for an result?
>>
>> Forget it that is way to long you will certainly run into problems with
>> the NMI much earlier.
>> [Pixel] thanks, would change it to 10s.

Still way to long. I think the maximum we can have here is 1s, maybe 
even only 100ms.

>>
>>>    
>>>    int amdgpu_allocate_static_csa(struct amdgpu_device *adev)
>>>    {
>>> @@ -113,28 +113,32 @@ void amdgpu_virt_init_setting(struct amdgpu_device *adev)
>>>    
>>>    uint32_t amdgpu_virt_kiq_rreg(struct amdgpu_device *adev, uint32_t reg)
>>>    {
>>> -	signed long r;
>>> -	uint32_t val;
>>> -	struct dma_fence *f;
>>> +	signed long timeout = usecs_to_jiffies(MAX_KIQ_REG_WAIT);
>>> +	unsigned long flags;
>>> +	uint32_t val, seq, wait_seq;
>>>    	struct amdgpu_kiq *kiq = &adev->gfx.kiq;
>>>    	struct amdgpu_ring *ring = &kiq->ring;
>>>    
>>>    	BUG_ON(!ring->funcs->emit_rreg);
>>>    
>>> -	mutex_lock(&kiq->ring_mutex);
>>> +	spin_lock_irqsave(&kiq->ring_lock, flags);
>>>    	amdgpu_ring_alloc(ring, 32);
>>>    	amdgpu_ring_emit_rreg(ring, reg);
>>> -	amdgpu_fence_emit(ring, &f);
>>> +	wait_seq = ++ring->fence_drv.sync_seq;
>>> +	amdgpu_ring_emit_fence(ring, ring->fence_drv.gpu_addr,
>>> +			       wait_seq, AMDGPU_FENCE_FLAG_INT);
>> Do you still need to send an interrupt when you busy wait? I don't think so.
> [Pixel]I want to let the IRQ handler to update the last_seq in case of that KIQ is used by other functionality with fence. Is it necessary?

Good point. In this case just keep it for now.

Might as well be that the KFD still needs the fence code.

Please double check that, cause that would make things much more 
complicated.

>>>    	amdgpu_ring_commit(ring);
>>> -	mutex_unlock(&kiq->ring_mutex);
>>> -
>>> -	r = dma_fence_wait_timeout(f, false, msecs_to_jiffies(MAX_KIQ_REG_WAIT));
>>> -	dma_fence_put(f);
>>> -	if (r < 1) {
>>> -		DRM_ERROR("wait for kiq fence error: %ld.\n", r);
>>> +	spin_unlock_irqrestore(&kiq->ring_lock, flags);
>>> +	do {
>>> +		seq = amdgpu_fence_read(ring);
>>> +		udelay(5);
>>> +		timeout -= 5;
>>> +	} while (seq < wait_seq && timeout > 0);
>> You need to correctly handle sequence wrap around here.
> [Pixel] yes thanks, but I didn’t see similar case handling in normal fence handling codepath, Is there? So I assumed 32bit fence seq could not overflow for years.

The normal fence handling code uses a circular buffer, so overflow is 
actually welcome there and doesn't need special handling.

But 2^32 register writes sounds like something we might actually hit 
sooner or later.

Regards,
Christian.

>>> +
>>> +	if (timeout < 1) {
>>> +		DRM_ERROR("wait for kiq fence error: %ld\n", timeout);
>>>    		return ~0;
>>>    	}
>>> -
>>>    	val = adev->wb.wb[adev->virt.reg_val_offs];
>>>    
>>>    	return val;
>>> @@ -142,24 +146,33 @@ uint32_t amdgpu_virt_kiq_rreg(struct amdgpu_device *adev, uint32_t reg)
>>>    
>>>    void amdgpu_virt_kiq_wreg(struct amdgpu_device *adev, uint32_t reg, uint32_t v)
>>>    {
>>> -	signed long r;
>>> +	signed long timeout = usecs_to_jiffies(MAX_KIQ_REG_WAIT);
>>> +	unsigned long flags;
>>> +	uint32_t seq, wait_seq;
>>>    	struct dma_fence *f;
>>>    	struct amdgpu_kiq *kiq = &adev->gfx.kiq;
>>>    	struct amdgpu_ring *ring = &kiq->ring;
>>>    
>>>    	BUG_ON(!ring->funcs->emit_wreg);
>>>    
>>> -	mutex_lock(&kiq->ring_mutex);
>>> +	spin_lock_irqsave(&kiq->ring_lock, flags);
>>>    	amdgpu_ring_alloc(ring, 32);
>>>    	amdgpu_ring_emit_wreg(ring, reg, v);
>>> -	amdgpu_fence_emit(ring, &f);
>>> +	/* amdgpu_fence_emit(ring, &f); */
>>> +	wait_seq = ++ring->fence_drv.sync_seq;
>>> +	amdgpu_ring_emit_fence(ring, ring->fence_drv.gpu_addr,
>>> +			       wait_seq, AMDGPU_FENCE_FLAG_INT);
>>>    	amdgpu_ring_commit(ring);
>>> -	mutex_unlock(&kiq->ring_mutex);
>>> +	spin_unlock_irqrestore(&kiq->ring_lock, flags);
>>> +
>>> +	do {
>>> +		seq = amdgpu_fence_read(ring);
>>> +		udelay(5);
>>> +		timeout -= 5;
>>> +	} while (seq < wait_seq && timeout > 0);
>> Same here. It would probably be better to have a function for this in
>> amdgpu_fence.c
> [Pixel] get it, will add a wrapper.
>> Regards,
>> Christian.
>>
>>>    
>>> -	r = dma_fence_wait_timeout(f, false, msecs_to_jiffies(MAX_KIQ_REG_WAIT));
>>> -	if (r < 1)
>>> -		DRM_ERROR("wait for kiq fence error: %ld.\n", r);
>>> -	dma_fence_put(f);
>>> +	if (timeout < 1)
>>> +		DRM_ERROR("wait for kiq fence error: %ld\n", timeout);
>>>    }
>>>    
>>>    /**
>>
>> —
>> Sincerely Yours,
>> Pixel
>>
>>
>>
>>

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 32+ messages in thread

* RE: [PATCH 3/4] drm/amdgpu: report preemption fence via amdgpu_fence_info
       [not found]                                     ` <930af7f8-9684-8dc6-cdeb-e5ffff5f811d-5C7GfCeVMHo@public.gmane.org>
@ 2017-10-13 11:14                                       ` Liu, Monk
  0 siblings, 0 replies; 32+ messages in thread
From: Liu, Monk @ 2017-10-13 11:14 UTC (permalink / raw)
  To: Koenig, Christian, Ding, Pixel, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
  Cc: Li, Bingley

Only the gfx engine has such preemption fence mechanism in CP 

-----Original Message-----
From: Koenig, Christian 
Sent: 2017年10月13日 18:15
To: Ding, Pixel <Pixel.Ding@amd.com>; Liu, Monk <Monk.Liu@amd.com>; amd-gfx@lists.freedesktop.org
Cc: Li, Bingley <Bingley.Li@amd.com>
Subject: Re: [PATCH 3/4] drm/amdgpu: report preemption fence via amdgpu_fence_info

Sounds logical to me as well.

Please also add a document describing what the CP does here.

BTW: Is that limited to the GFX pipeline or do the Compute pipes the same?

Regards,
Christian.

Am 13.10.2017 um 11:36 schrieb Ding, Pixel:
> Get it.
> —
> Sincerely Yours,
> Pixel
>
>
>
>
>
>
>
>
> On 13/10/2017, 5:28 PM, "Liu, Monk" <Monk.Liu@amd.com> wrote:
>
>> @Ding, Pixel
>>
>> +		seq_printf(m, "Last preempted      0x%08x\n",
>> +			   le32_to_cpu(*(ring->fence_drv.cpu_addr + 2)));
>>
>> Please handle other type fence as well:
>>
>> Preempted fence
>> Reset fence
>> Reset and preempted fence
>>
>>
>> BR Monk
>> -----Original Message-----
>> From: amd-gfx [mailto:amd-gfx-bounces@lists.freedesktop.org] On 
>> Behalf Of Christian K?nig
>> Sent: 2017年10月13日 17:10
>> To: Ding, Pixel <Pixel.Ding@amd.com>; amd-gfx@lists.freedesktop.org
>> Cc: Li, Bingley <Bingley.Li@amd.com>
>> Subject: Re: [PATCH 3/4] drm/amdgpu: report preemption fence via 
>> amdgpu_fence_info
>>
>> Hi Pixel,
>>
>>> My understanding is that CP will write seqno back to preempted fence offset when preemption occurred.
>> That is correct.
>>
>> But my question is why you want to print a different value here:
>>> +		seq_printf(m, "Last preempted      0x%08x\n",
>>> +			   le32_to_cpu(*(ring->fence_drv.cpu_addr + 2)));
>> That code prints two dw after the address where the CPU writes the seqno. Where is the code which setups the CP to do this?
>>
>> As far as I can see that line should always print just zero (or some random number if the memory is actually not initialized to anything).
>>
>> Regards,
>> Christian.
>>
>> Am 13.10.2017 um 11:03 schrieb Ding, Pixel:
>>> My understanding is that CP will write seqno back to preempted fence offset when preemption occurred. Then if there is a value here we can generally know packet with which fence is preempted. Should driver handle any other thing for this?
>>> 					
>>> 				
>>> 			
>>> 		
>>> 	
>>>
>>>
>>> —
>>> Sincerely Yours,
>>> Pixel
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> On 13/10/2017, 4:51 PM, "Koenig, Christian" <Christian.Koenig@amd.com> wrote:
>>>
>>>>> Is it OK to limit this information only for SRIOV VF on Tonga and Vega whose format is known? It can help use to identify if MCBP is working correctly or not.
>>>> The question is where is this code for Tonga and Vega? I can't find 
>>>> a reference to fence_offs in the GFX8 nor GFX9 code we have in 
>>>> amd-staging-drm-next.
>>>>
>>>> And if it doesn't depend on the fence_offs in the ring printing it 
>>>> like this is just pure speculation.
>>>>
>>>> Regards,
>>>> Christian.
>>>>
>>>> Am 13.10.2017 um 10:41 schrieb Ding, Pixel:
>>>>> Thanks Christian,
>>>>>
>>>>> I’m not sure if I get your point, but yes the preemption fence offset could be changed.
>>>>>
>>>>> Is it OK to limit this information only for SRIOV VF on Tonga and Vega whose format is known? It can help use to identify if MCBP is working correctly or not.
>>>>>
>>>>>
>>>>> —
>>>>> Sincerely Yours,
>>>>> Pixel
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On 13/10/2017, 4:34 PM, "Christian König" <ckoenig.leichtzumerken@gmail.com> wrote:
>>>>>
>>>>>> Am 13.10.2017 um 10:26 schrieb Pixel Ding:
>>>>>>> From: pding <Pixel.Ding@amd.com>
>>>>>>>
>>>>>>> Only report fence for GFX ring. This can help checking MCBP feature.
>>>>>>>
>>>>>>> Signed-off-by: pding <Pixel.Ding@amd.com>
>>>>>>> ---
>>>>>>>      drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 7 +++++++
>>>>>>>      1 file changed, 7 insertions(+)
>>>>>>>
>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
>>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
>>>>>>> index 09d5a5c..2044758 100644
>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
>>>>>>> @@ -645,6 +645,13 @@ static int amdgpu_debugfs_fence_info(struct seq_file *m, void *data)
>>>>>>>      			   atomic_read(&ring->fence_drv.last_seq));
>>>>>>>      		seq_printf(m, "Last emitted        0x%08x\n",
>>>>>>>      			   ring->fence_drv.sync_seq);
>>>>>>> +
>>>>>>> +		if (ring->funcs->type != AMDGPU_RING_TYPE_GFX)
>>>>>>> +			break;
>>>>>> That should probably be "continue" instead of break, or otherwise 
>>>>>> you don't print the other fences any more.
>>>>>>
>>>>>>> +
>>>>>>> +		seq_printf(m, "Last preempted      0x%08x\n",
>>>>>>> +			   le32_to_cpu(*(ring->fence_drv.cpu_addr + 2)));
>>>>>> Is the code to put the preemption fence there already upstream?
>>>>>>
>>>>>> If yes do we really do this like that for all supported generations?
>>>>>>
>>>>>> Regards,
>>>>>> Christian.
>>>>>>
>>>>>>> +
>>>>>>>      	}
>>>>>>>      	return 0;
>>>>>>>      }
>>
>> _______________________________________________
>> amd-gfx mailing list
>> amd-gfx@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx


_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 4/4] drm/amdgpu: busywait KIQ register accessing
       [not found]                 ` <BLUPR12MB0449E6C72D64821D235E3F1184480-7LeqcoF/hwpTIQvHjXdJlwdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
@ 2017-10-16  1:25                   ` Ding, Pixel
       [not found]                     ` <FCE99400-0F48-4FA9-B714-D6B318718975-5C7GfCeVMHo@public.gmane.org>
  0 siblings, 1 reply; 32+ messages in thread
From: Ding, Pixel @ 2017-10-16  1:25 UTC (permalink / raw)
  To: Liu, Monk, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW; +Cc: Li, Bingley

OK, I would keep both methods working together.
— 
Sincerely Yours,
Pixel







On 13/10/2017, 6:04 PM, "Liu, Monk" <Monk.Liu@amd.com> wrote:

>Why there will be racing issue ?
>
>Polling or sleep wait only have different result for the caller, not the job scheduled to KIQ 
>
>The sleep waiting is synchroniz sleep, it just release CPU resource to other process/thread, so the order is guaranteed 
>
>BR Monk
>
>-----Original Message-----
>From: Ding, Pixel 
>Sent: 2017年10月13日 17:39
>To: Liu, Monk <Monk.Liu@amd.com>; amd-gfx@lists.freedesktop.org
>Cc: Li, Bingley <Bingley.Li@amd.com>
>Subject: Re: [PATCH 4/4] drm/amdgpu: busywait KIQ register accessing
>
>I’m afraid there’s racing issue if polling and IRQ use cases are mixed at the same time.
>
>The original implementation is as your suggested. Is there any benefit to keep to sleepy version?
>— 
>Sincerely Yours,
>Pixel
>
>
>
>
>
>
>
>
>On 13/10/2017, 5:34 PM, "Liu, Monk" <Monk.Liu@amd.com> wrote:
>
>>Pixel
>>
>>I don't think this will work well, my suggestion is you add a new function like:
>>amdgpu_wreg_kiq_busy(),
>>
>>which will write registers through KIQ and use polling/busy wait, and the original amdgu_wreg_no_kiq() can be still there.
>>
>>When you need to disable sleep like in IRQ CONTEXT, you can call wreg_kiq_busy() or wreg_no_kiq(),
>>
>>But don't just change the original function 
>>
>>BR Monk
>>
>>-----Original Message-----
>>From: amd-gfx [mailto:amd-gfx-bounces@lists.freedesktop.org] On Behalf Of Pixel Ding
>>Sent: 2017年10月13日 16:26
>>To: amd-gfx@lists.freedesktop.org
>>Cc: Ding, Pixel <Pixel.Ding@amd.com>; Li, Bingley <Bingley.Li@amd.com>
>>Subject: [PATCH 4/4] drm/amdgpu: busywait KIQ register accessing
>>
>>From: pding <Pixel.Ding@amd.com>
>>
>>Register accessing is performed when IRQ is disabled. Never sleep in this function.
>>
>>Known issue: dead sleep in many use cases of index/data registers.
>>
>>Signed-off-by: pding <Pixel.Ding@amd.com>
>>---
>> drivers/gpu/drm/amd/amdgpu/amdgpu.h        |  1 +
>> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  8 ++---  drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c  |  2 +-
>> drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c    |  1 +
>> drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h   |  1 +
>> drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c   | 55 ++++++++++++++++++------------
>> 6 files changed, 40 insertions(+), 28 deletions(-)
>>
>>diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>index ca212ef..f9de756 100644
>>--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>@@ -886,6 +886,7 @@ struct amdgpu_kiq {
>> 	u64			eop_gpu_addr;
>> 	struct amdgpu_bo	*eop_obj;
>> 	struct mutex		ring_mutex;
>>+	spinlock_t              ring_lock;
>> 	struct amdgpu_ring	ring;
>> 	struct amdgpu_irq_src	irq;
>> };
>>diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>index ab8f0d6..1197274 100644
>>--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>@@ -109,10 +109,8 @@ uint32_t amdgpu_mm_rreg(struct amdgpu_device *adev, uint32_t reg,  {
>> 	uint32_t ret;
>> 
>>-	if (!(acc_flags & AMDGPU_REGS_NO_KIQ) && amdgpu_sriov_runtime(adev)) {
>>-		BUG_ON(in_interrupt());
>>+	if (!(acc_flags & AMDGPU_REGS_NO_KIQ) && amdgpu_sriov_runtime(adev))
>> 		return amdgpu_virt_kiq_rreg(adev, reg);
>>-	}
>> 
>> 	if ((reg * 4) < adev->rmmio_size && !(acc_flags & AMDGPU_REGS_IDX))
>> 		ret = readl(((void __iomem *)adev->rmmio) + (reg * 4)); @@ -137,10 +135,8 @@ void amdgpu_mm_wreg(struct amdgpu_device *adev, uint32_t reg, uint32_t v,
>> 		adev->last_mm_index = v;
>> 	}
>> 
>>-	if (!(acc_flags & AMDGPU_REGS_NO_KIQ) && amdgpu_sriov_runtime(adev)) {
>>-		BUG_ON(in_interrupt());
>>+	if (!(acc_flags & AMDGPU_REGS_NO_KIQ) && amdgpu_sriov_runtime(adev))
>> 		return amdgpu_virt_kiq_wreg(adev, reg, v);
>>-	}
>> 
>> 	if ((reg * 4) < adev->rmmio_size && !(acc_flags & AMDGPU_REGS_IDX))
>> 		writel(v, ((void __iomem *)adev->rmmio) + (reg * 4)); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
>>index 2044758..c6c7bf3 100644
>>--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
>>+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
>>@@ -109,7 +109,7 @@ static void amdgpu_fence_write(struct amdgpu_ring *ring, u32 seq)
>>  * Reads a fence value from memory (all asics).
>>  * Returns the value of the fence read from memory.
>>  */
>>-static u32 amdgpu_fence_read(struct amdgpu_ring *ring)
>>+u32 amdgpu_fence_read(struct amdgpu_ring *ring)
>> {
>> 	struct amdgpu_fence_driver *drv = &ring->fence_drv;
>> 	u32 seq = 0;
>>diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
>>index 4f6c68f..46fa88c 100644
>>--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
>>+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
>>@@ -186,6 +186,7 @@ int amdgpu_gfx_kiq_init_ring(struct amdgpu_device *adev,
>> 	int r = 0;
>> 
>> 	mutex_init(&kiq->ring_mutex);
>>+	spin_lock_init(&kiq->ring_lock);
>> 
>> 	r = amdgpu_wb_get(adev, &adev->virt.reg_val_offs);
>> 	if (r)
>>diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>>index af8e544..a4fa923 100644
>>--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>>+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>>@@ -89,6 +89,7 @@ int amdgpu_fence_driver_start_ring(struct amdgpu_ring *ring,  void amdgpu_fence_driver_suspend(struct amdgpu_device *adev);  void amdgpu_fence_driver_resume(struct amdgpu_device *adev);  int amdgpu_fence_emit(struct amdgpu_ring *ring, struct dma_fence **fence);
>>+u32 amdgpu_fence_read(struct amdgpu_ring *ring);
>> void amdgpu_fence_process(struct amdgpu_ring *ring);  int amdgpu_fence_wait_empty(struct amdgpu_ring *ring);  unsigned amdgpu_fence_count_emitted(struct amdgpu_ring *ring); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
>>index ab05121..9757df1 100644
>>--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
>>+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
>>@@ -22,7 +22,7 @@
>>  */
>> 
>> #include "amdgpu.h"
>>-#define MAX_KIQ_REG_WAIT	100000
>>+#define MAX_KIQ_REG_WAIT	100000000 /* in usecs */
>> 
>> int amdgpu_allocate_static_csa(struct amdgpu_device *adev)  { @@ -113,28 +113,32 @@ void amdgpu_virt_init_setting(struct amdgpu_device *adev)
>> 
>> uint32_t amdgpu_virt_kiq_rreg(struct amdgpu_device *adev, uint32_t reg)  {
>>-	signed long r;
>>-	uint32_t val;
>>-	struct dma_fence *f;
>>+	signed long timeout = usecs_to_jiffies(MAX_KIQ_REG_WAIT);
>>+	unsigned long flags;
>>+	uint32_t val, seq, wait_seq;
>> 	struct amdgpu_kiq *kiq = &adev->gfx.kiq;
>> 	struct amdgpu_ring *ring = &kiq->ring;
>> 
>> 	BUG_ON(!ring->funcs->emit_rreg);
>> 
>>-	mutex_lock(&kiq->ring_mutex);
>>+	spin_lock_irqsave(&kiq->ring_lock, flags);
>> 	amdgpu_ring_alloc(ring, 32);
>> 	amdgpu_ring_emit_rreg(ring, reg);
>>-	amdgpu_fence_emit(ring, &f);
>>+	wait_seq = ++ring->fence_drv.sync_seq;
>>+	amdgpu_ring_emit_fence(ring, ring->fence_drv.gpu_addr,
>>+			       wait_seq, AMDGPU_FENCE_FLAG_INT);
>> 	amdgpu_ring_commit(ring);
>>-	mutex_unlock(&kiq->ring_mutex);
>>-
>>-	r = dma_fence_wait_timeout(f, false, msecs_to_jiffies(MAX_KIQ_REG_WAIT));
>>-	dma_fence_put(f);
>>-	if (r < 1) {
>>-		DRM_ERROR("wait for kiq fence error: %ld.\n", r);
>>+	spin_unlock_irqrestore(&kiq->ring_lock, flags);
>>+	do {
>>+		seq = amdgpu_fence_read(ring);
>>+		udelay(5);
>>+		timeout -= 5;
>>+	} while (seq < wait_seq && timeout > 0);
>>+
>>+	if (timeout < 1) {
>>+		DRM_ERROR("wait for kiq fence error: %ld\n", timeout);
>> 		return ~0;
>> 	}
>>-
>> 	val = adev->wb.wb[adev->virt.reg_val_offs];
>> 
>> 	return val;
>>@@ -142,24 +146,33 @@ uint32_t amdgpu_virt_kiq_rreg(struct amdgpu_device *adev, uint32_t reg)
>> 
>> void amdgpu_virt_kiq_wreg(struct amdgpu_device *adev, uint32_t reg, uint32_t v)  {
>>-	signed long r;
>>+	signed long timeout = usecs_to_jiffies(MAX_KIQ_REG_WAIT);
>>+	unsigned long flags;
>>+	uint32_t seq, wait_seq;
>> 	struct dma_fence *f;
>> 	struct amdgpu_kiq *kiq = &adev->gfx.kiq;
>> 	struct amdgpu_ring *ring = &kiq->ring;
>> 
>> 	BUG_ON(!ring->funcs->emit_wreg);
>> 
>>-	mutex_lock(&kiq->ring_mutex);
>>+	spin_lock_irqsave(&kiq->ring_lock, flags);
>> 	amdgpu_ring_alloc(ring, 32);
>> 	amdgpu_ring_emit_wreg(ring, reg, v);
>>-	amdgpu_fence_emit(ring, &f);
>>+	/* amdgpu_fence_emit(ring, &f); */
>>+	wait_seq = ++ring->fence_drv.sync_seq;
>>+	amdgpu_ring_emit_fence(ring, ring->fence_drv.gpu_addr,
>>+			       wait_seq, AMDGPU_FENCE_FLAG_INT);
>> 	amdgpu_ring_commit(ring);
>>-	mutex_unlock(&kiq->ring_mutex);
>>+	spin_unlock_irqrestore(&kiq->ring_lock, flags);
>>+
>>+	do {
>>+		seq = amdgpu_fence_read(ring);
>>+		udelay(5);
>>+		timeout -= 5;
>>+	} while (seq < wait_seq && timeout > 0);
>> 
>>-	r = dma_fence_wait_timeout(f, false, msecs_to_jiffies(MAX_KIQ_REG_WAIT));
>>-	if (r < 1)
>>-		DRM_ERROR("wait for kiq fence error: %ld.\n", r);
>>-	dma_fence_put(f);
>>+	if (timeout < 1)
>>+		DRM_ERROR("wait for kiq fence error: %ld\n", timeout);
>> }
>> 
>> /**
>>--
>>2.9.5
>>
>>_______________________________________________
>>amd-gfx mailing list
>>amd-gfx@lists.freedesktop.org
>>https://lists.freedesktop.org/mailman/listinfo/amd-gfx
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 4/4] drm/amdgpu: busywait KIQ register accessing
       [not found]                     ` <FCE99400-0F48-4FA9-B714-D6B318718975-5C7GfCeVMHo@public.gmane.org>
@ 2017-10-16  8:55                       ` Ding, Pixel
       [not found]                         ` <74F3DEB1-590C-4E6A-9B56-94D19EDC6F47-5C7GfCeVMHo@public.gmane.org>
  0 siblings, 1 reply; 32+ messages in thread
From: Ding, Pixel @ 2017-10-16  8:55 UTC (permalink / raw)
  To: Liu, Monk; +Cc: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Li, Bingley

Hi Monk,

So far I notice that it’s almost impossible to have both methods working together.

For example, if the num_fence_mask is 2, and there’re a series of sequence number like “fence seq 1, busy wait seq 2, busy wait seq 3, fence seq 4”. Then if IRQ handler for seq 1 catch the HW seq number as 2, the fence slot for seq 4 will be wrongly waken up.

If only dma_fence slots are used, the emit function ensures that each slot only has one valid seq number by waiting for the old one. But things go complicated if there’s busy waiting which registers seq number at the same time.


— 
Sincerely Yours,
Pixel







On 16/10/2017, 9:25 AM, "Ding, Pixel" <Pixel.Ding@amd.com> wrote:

>OK, I would keep both methods working together.
>— 
>Sincerely Yours,
>Pixel
>
>
>
>
>
>
>
>On 13/10/2017, 6:04 PM, "Liu, Monk" <Monk.Liu@amd.com> wrote:
>
>>Why there will be racing issue ?
>>
>>Polling or sleep wait only have different result for the caller, not the job scheduled to KIQ 
>>
>>The sleep waiting is synchroniz sleep, it just release CPU resource to other process/thread, so the order is guaranteed 
>>
>>BR Monk
>>
>>-----Original Message-----
>>From: Ding, Pixel 
>>Sent: 2017年10月13日 17:39
>>To: Liu, Monk <Monk.Liu@amd.com>; amd-gfx@lists.freedesktop.org
>>Cc: Li, Bingley <Bingley.Li@amd.com>
>>Subject: Re: [PATCH 4/4] drm/amdgpu: busywait KIQ register accessing
>>
>>I’m afraid there’s racing issue if polling and IRQ use cases are mixed at the same time.
>>
>>The original implementation is as your suggested. Is there any benefit to keep to sleepy version?
>>— 
>>Sincerely Yours,
>>Pixel
>>
>>
>>
>>
>>
>>
>>
>>
>>On 13/10/2017, 5:34 PM, "Liu, Monk" <Monk.Liu@amd.com> wrote:
>>
>>>Pixel
>>>
>>>I don't think this will work well, my suggestion is you add a new function like:
>>>amdgpu_wreg_kiq_busy(),
>>>
>>>which will write registers through KIQ and use polling/busy wait, and the original amdgu_wreg_no_kiq() can be still there.
>>>
>>>When you need to disable sleep like in IRQ CONTEXT, you can call wreg_kiq_busy() or wreg_no_kiq(),
>>>
>>>But don't just change the original function 
>>>
>>>BR Monk
>>>
>>>-----Original Message-----
>>>From: amd-gfx [mailto:amd-gfx-bounces@lists.freedesktop.org] On Behalf Of Pixel Ding
>>>Sent: 2017年10月13日 16:26
>>>To: amd-gfx@lists.freedesktop.org
>>>Cc: Ding, Pixel <Pixel.Ding@amd.com>; Li, Bingley <Bingley.Li@amd.com>
>>>Subject: [PATCH 4/4] drm/amdgpu: busywait KIQ register accessing
>>>
>>>From: pding <Pixel.Ding@amd.com>
>>>
>>>Register accessing is performed when IRQ is disabled. Never sleep in this function.
>>>
>>>Known issue: dead sleep in many use cases of index/data registers.
>>>
>>>Signed-off-by: pding <Pixel.Ding@amd.com>
>>>---
>>> drivers/gpu/drm/amd/amdgpu/amdgpu.h        |  1 +
>>> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  8 ++---  drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c  |  2 +-
>>> drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c    |  1 +
>>> drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h   |  1 +
>>> drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c   | 55 ++++++++++++++++++------------
>>> 6 files changed, 40 insertions(+), 28 deletions(-)
>>>
>>>diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>index ca212ef..f9de756 100644
>>>--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>@@ -886,6 +886,7 @@ struct amdgpu_kiq {
>>> 	u64			eop_gpu_addr;
>>> 	struct amdgpu_bo	*eop_obj;
>>> 	struct mutex		ring_mutex;
>>>+	spinlock_t              ring_lock;
>>> 	struct amdgpu_ring	ring;
>>> 	struct amdgpu_irq_src	irq;
>>> };
>>>diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>index ab8f0d6..1197274 100644
>>>--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>@@ -109,10 +109,8 @@ uint32_t amdgpu_mm_rreg(struct amdgpu_device *adev, uint32_t reg,  {
>>> 	uint32_t ret;
>>> 
>>>-	if (!(acc_flags & AMDGPU_REGS_NO_KIQ) && amdgpu_sriov_runtime(adev)) {
>>>-		BUG_ON(in_interrupt());
>>>+	if (!(acc_flags & AMDGPU_REGS_NO_KIQ) && amdgpu_sriov_runtime(adev))
>>> 		return amdgpu_virt_kiq_rreg(adev, reg);
>>>-	}
>>> 
>>> 	if ((reg * 4) < adev->rmmio_size && !(acc_flags & AMDGPU_REGS_IDX))
>>> 		ret = readl(((void __iomem *)adev->rmmio) + (reg * 4)); @@ -137,10 +135,8 @@ void amdgpu_mm_wreg(struct amdgpu_device *adev, uint32_t reg, uint32_t v,
>>> 		adev->last_mm_index = v;
>>> 	}
>>> 
>>>-	if (!(acc_flags & AMDGPU_REGS_NO_KIQ) && amdgpu_sriov_runtime(adev)) {
>>>-		BUG_ON(in_interrupt());
>>>+	if (!(acc_flags & AMDGPU_REGS_NO_KIQ) && amdgpu_sriov_runtime(adev))
>>> 		return amdgpu_virt_kiq_wreg(adev, reg, v);
>>>-	}
>>> 
>>> 	if ((reg * 4) < adev->rmmio_size && !(acc_flags & AMDGPU_REGS_IDX))
>>> 		writel(v, ((void __iomem *)adev->rmmio) + (reg * 4)); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
>>>index 2044758..c6c7bf3 100644
>>>--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
>>>+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
>>>@@ -109,7 +109,7 @@ static void amdgpu_fence_write(struct amdgpu_ring *ring, u32 seq)
>>>  * Reads a fence value from memory (all asics).
>>>  * Returns the value of the fence read from memory.
>>>  */
>>>-static u32 amdgpu_fence_read(struct amdgpu_ring *ring)
>>>+u32 amdgpu_fence_read(struct amdgpu_ring *ring)
>>> {
>>> 	struct amdgpu_fence_driver *drv = &ring->fence_drv;
>>> 	u32 seq = 0;
>>>diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
>>>index 4f6c68f..46fa88c 100644
>>>--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
>>>+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
>>>@@ -186,6 +186,7 @@ int amdgpu_gfx_kiq_init_ring(struct amdgpu_device *adev,
>>> 	int r = 0;
>>> 
>>> 	mutex_init(&kiq->ring_mutex);
>>>+	spin_lock_init(&kiq->ring_lock);
>>> 
>>> 	r = amdgpu_wb_get(adev, &adev->virt.reg_val_offs);
>>> 	if (r)
>>>diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>>>index af8e544..a4fa923 100644
>>>--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>>>+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>>>@@ -89,6 +89,7 @@ int amdgpu_fence_driver_start_ring(struct amdgpu_ring *ring,  void amdgpu_fence_driver_suspend(struct amdgpu_device *adev);  void amdgpu_fence_driver_resume(struct amdgpu_device *adev);  int amdgpu_fence_emit(struct amdgpu_ring *ring, struct dma_fence **fence);
>>>+u32 amdgpu_fence_read(struct amdgpu_ring *ring);
>>> void amdgpu_fence_process(struct amdgpu_ring *ring);  int amdgpu_fence_wait_empty(struct amdgpu_ring *ring);  unsigned amdgpu_fence_count_emitted(struct amdgpu_ring *ring); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
>>>index ab05121..9757df1 100644
>>>--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
>>>+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
>>>@@ -22,7 +22,7 @@
>>>  */
>>> 
>>> #include "amdgpu.h"
>>>-#define MAX_KIQ_REG_WAIT	100000
>>>+#define MAX_KIQ_REG_WAIT	100000000 /* in usecs */
>>> 
>>> int amdgpu_allocate_static_csa(struct amdgpu_device *adev)  { @@ -113,28 +113,32 @@ void amdgpu_virt_init_setting(struct amdgpu_device *adev)
>>> 
>>> uint32_t amdgpu_virt_kiq_rreg(struct amdgpu_device *adev, uint32_t reg)  {
>>>-	signed long r;
>>>-	uint32_t val;
>>>-	struct dma_fence *f;
>>>+	signed long timeout = usecs_to_jiffies(MAX_KIQ_REG_WAIT);
>>>+	unsigned long flags;
>>>+	uint32_t val, seq, wait_seq;
>>> 	struct amdgpu_kiq *kiq = &adev->gfx.kiq;
>>> 	struct amdgpu_ring *ring = &kiq->ring;
>>> 
>>> 	BUG_ON(!ring->funcs->emit_rreg);
>>> 
>>>-	mutex_lock(&kiq->ring_mutex);
>>>+	spin_lock_irqsave(&kiq->ring_lock, flags);
>>> 	amdgpu_ring_alloc(ring, 32);
>>> 	amdgpu_ring_emit_rreg(ring, reg);
>>>-	amdgpu_fence_emit(ring, &f);
>>>+	wait_seq = ++ring->fence_drv.sync_seq;
>>>+	amdgpu_ring_emit_fence(ring, ring->fence_drv.gpu_addr,
>>>+			       wait_seq, AMDGPU_FENCE_FLAG_INT);
>>> 	amdgpu_ring_commit(ring);
>>>-	mutex_unlock(&kiq->ring_mutex);
>>>-
>>>-	r = dma_fence_wait_timeout(f, false, msecs_to_jiffies(MAX_KIQ_REG_WAIT));
>>>-	dma_fence_put(f);
>>>-	if (r < 1) {
>>>-		DRM_ERROR("wait for kiq fence error: %ld.\n", r);
>>>+	spin_unlock_irqrestore(&kiq->ring_lock, flags);
>>>+	do {
>>>+		seq = amdgpu_fence_read(ring);
>>>+		udelay(5);
>>>+		timeout -= 5;
>>>+	} while (seq < wait_seq && timeout > 0);
>>>+
>>>+	if (timeout < 1) {
>>>+		DRM_ERROR("wait for kiq fence error: %ld\n", timeout);
>>> 		return ~0;
>>> 	}
>>>-
>>> 	val = adev->wb.wb[adev->virt.reg_val_offs];
>>> 
>>> 	return val;
>>>@@ -142,24 +146,33 @@ uint32_t amdgpu_virt_kiq_rreg(struct amdgpu_device *adev, uint32_t reg)
>>> 
>>> void amdgpu_virt_kiq_wreg(struct amdgpu_device *adev, uint32_t reg, uint32_t v)  {
>>>-	signed long r;
>>>+	signed long timeout = usecs_to_jiffies(MAX_KIQ_REG_WAIT);
>>>+	unsigned long flags;
>>>+	uint32_t seq, wait_seq;
>>> 	struct dma_fence *f;
>>> 	struct amdgpu_kiq *kiq = &adev->gfx.kiq;
>>> 	struct amdgpu_ring *ring = &kiq->ring;
>>> 
>>> 	BUG_ON(!ring->funcs->emit_wreg);
>>> 
>>>-	mutex_lock(&kiq->ring_mutex);
>>>+	spin_lock_irqsave(&kiq->ring_lock, flags);
>>> 	amdgpu_ring_alloc(ring, 32);
>>> 	amdgpu_ring_emit_wreg(ring, reg, v);
>>>-	amdgpu_fence_emit(ring, &f);
>>>+	/* amdgpu_fence_emit(ring, &f); */
>>>+	wait_seq = ++ring->fence_drv.sync_seq;
>>>+	amdgpu_ring_emit_fence(ring, ring->fence_drv.gpu_addr,
>>>+			       wait_seq, AMDGPU_FENCE_FLAG_INT);
>>> 	amdgpu_ring_commit(ring);
>>>-	mutex_unlock(&kiq->ring_mutex);
>>>+	spin_unlock_irqrestore(&kiq->ring_lock, flags);
>>>+
>>>+	do {
>>>+		seq = amdgpu_fence_read(ring);
>>>+		udelay(5);
>>>+		timeout -= 5;
>>>+	} while (seq < wait_seq && timeout > 0);
>>> 
>>>-	r = dma_fence_wait_timeout(f, false, msecs_to_jiffies(MAX_KIQ_REG_WAIT));
>>>-	if (r < 1)
>>>-		DRM_ERROR("wait for kiq fence error: %ld.\n", r);
>>>-	dma_fence_put(f);
>>>+	if (timeout < 1)
>>>+		DRM_ERROR("wait for kiq fence error: %ld\n", timeout);
>>> }
>>> 
>>> /**
>>>--
>>>2.9.5
>>>
>>>_______________________________________________
>>>amd-gfx mailing list
>>>amd-gfx@lists.freedesktop.org
>>>https://lists.freedesktop.org/mailman/listinfo/amd-gfx
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 32+ messages in thread

* RE: [PATCH 4/4] drm/amdgpu: busywait KIQ register accessing
       [not found]                         ` <74F3DEB1-590C-4E6A-9B56-94D19EDC6F47-5C7GfCeVMHo@public.gmane.org>
@ 2017-10-16 10:04                           ` Liu, Monk
  0 siblings, 0 replies; 32+ messages in thread
From: Liu, Monk @ 2017-10-16 10:04 UTC (permalink / raw)
  To: Ding, Pixel; +Cc: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Li, Bingley

alright, go for unify scheme then, use busy wait for all user.

-----Original Message-----
From: Ding, Pixel 
Sent: 2017年10月16日 16:55
To: Liu, Monk <Monk.Liu@amd.com>
Cc: Li, Bingley <Bingley.Li@amd.com>; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH 4/4] drm/amdgpu: busywait KIQ register accessing

Hi Monk,

So far I notice that it’s almost impossible to have both methods working together.

For example, if the num_fence_mask is 2, and there’re a series of sequence number like “fence seq 1, busy wait seq 2, busy wait seq 3, fence seq 4”. Then if IRQ handler for seq 1 catch the HW seq number as 2, the fence slot for seq 4 will be wrongly waken up.

If only dma_fence slots are used, the emit function ensures that each slot only has one valid seq number by waiting for the old one. But things go complicated if there’s busy waiting which registers seq number at the same time.


— 
Sincerely Yours,
Pixel







On 16/10/2017, 9:25 AM, "Ding, Pixel" <Pixel.Ding@amd.com> wrote:

>OK, I would keep both methods working together.
>— 
>Sincerely Yours,
>Pixel
>
>
>
>
>
>
>
>On 13/10/2017, 6:04 PM, "Liu, Monk" <Monk.Liu@amd.com> wrote:
>
>>Why there will be racing issue ?
>>
>>Polling or sleep wait only have different result for the caller, not the job scheduled to KIQ 
>>
>>The sleep waiting is synchroniz sleep, it just release CPU resource to other process/thread, so the order is guaranteed 
>>
>>BR Monk
>>
>>-----Original Message-----
>>From: Ding, Pixel 
>>Sent: 2017年10月13日 17:39
>>To: Liu, Monk <Monk.Liu@amd.com>; amd-gfx@lists.freedesktop.org
>>Cc: Li, Bingley <Bingley.Li@amd.com>
>>Subject: Re: [PATCH 4/4] drm/amdgpu: busywait KIQ register accessing
>>
>>I’m afraid there’s racing issue if polling and IRQ use cases are mixed at the same time.
>>
>>The original implementation is as your suggested. Is there any benefit to keep to sleepy version?
>>— 
>>Sincerely Yours,
>>Pixel
>>
>>
>>
>>
>>
>>
>>
>>
>>On 13/10/2017, 5:34 PM, "Liu, Monk" <Monk.Liu@amd.com> wrote:
>>
>>>Pixel
>>>
>>>I don't think this will work well, my suggestion is you add a new function like:
>>>amdgpu_wreg_kiq_busy(),
>>>
>>>which will write registers through KIQ and use polling/busy wait, and the original amdgu_wreg_no_kiq() can be still there.
>>>
>>>When you need to disable sleep like in IRQ CONTEXT, you can call wreg_kiq_busy() or wreg_no_kiq(),
>>>
>>>But don't just change the original function 
>>>
>>>BR Monk
>>>
>>>-----Original Message-----
>>>From: amd-gfx [mailto:amd-gfx-bounces@lists.freedesktop.org] On Behalf Of Pixel Ding
>>>Sent: 2017年10月13日 16:26
>>>To: amd-gfx@lists.freedesktop.org
>>>Cc: Ding, Pixel <Pixel.Ding@amd.com>; Li, Bingley <Bingley.Li@amd.com>
>>>Subject: [PATCH 4/4] drm/amdgpu: busywait KIQ register accessing
>>>
>>>From: pding <Pixel.Ding@amd.com>
>>>
>>>Register accessing is performed when IRQ is disabled. Never sleep in this function.
>>>
>>>Known issue: dead sleep in many use cases of index/data registers.
>>>
>>>Signed-off-by: pding <Pixel.Ding@amd.com>
>>>---
>>> drivers/gpu/drm/amd/amdgpu/amdgpu.h        |  1 +
>>> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  8 ++---  drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c  |  2 +-
>>> drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c    |  1 +
>>> drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h   |  1 +
>>> drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c   | 55 ++++++++++++++++++------------
>>> 6 files changed, 40 insertions(+), 28 deletions(-)
>>>
>>>diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>index ca212ef..f9de756 100644
>>>--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>@@ -886,6 +886,7 @@ struct amdgpu_kiq {
>>> 	u64			eop_gpu_addr;
>>> 	struct amdgpu_bo	*eop_obj;
>>> 	struct mutex		ring_mutex;
>>>+	spinlock_t              ring_lock;
>>> 	struct amdgpu_ring	ring;
>>> 	struct amdgpu_irq_src	irq;
>>> };
>>>diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>index ab8f0d6..1197274 100644
>>>--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>@@ -109,10 +109,8 @@ uint32_t amdgpu_mm_rreg(struct amdgpu_device *adev, uint32_t reg,  {
>>> 	uint32_t ret;
>>> 
>>>-	if (!(acc_flags & AMDGPU_REGS_NO_KIQ) && amdgpu_sriov_runtime(adev)) {
>>>-		BUG_ON(in_interrupt());
>>>+	if (!(acc_flags & AMDGPU_REGS_NO_KIQ) && amdgpu_sriov_runtime(adev))
>>> 		return amdgpu_virt_kiq_rreg(adev, reg);
>>>-	}
>>> 
>>> 	if ((reg * 4) < adev->rmmio_size && !(acc_flags & AMDGPU_REGS_IDX))
>>> 		ret = readl(((void __iomem *)adev->rmmio) + (reg * 4)); @@ -137,10 +135,8 @@ void amdgpu_mm_wreg(struct amdgpu_device *adev, uint32_t reg, uint32_t v,
>>> 		adev->last_mm_index = v;
>>> 	}
>>> 
>>>-	if (!(acc_flags & AMDGPU_REGS_NO_KIQ) && amdgpu_sriov_runtime(adev)) {
>>>-		BUG_ON(in_interrupt());
>>>+	if (!(acc_flags & AMDGPU_REGS_NO_KIQ) && amdgpu_sriov_runtime(adev))
>>> 		return amdgpu_virt_kiq_wreg(adev, reg, v);
>>>-	}
>>> 
>>> 	if ((reg * 4) < adev->rmmio_size && !(acc_flags & AMDGPU_REGS_IDX))
>>> 		writel(v, ((void __iomem *)adev->rmmio) + (reg * 4)); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
>>>index 2044758..c6c7bf3 100644
>>>--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
>>>+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
>>>@@ -109,7 +109,7 @@ static void amdgpu_fence_write(struct amdgpu_ring *ring, u32 seq)
>>>  * Reads a fence value from memory (all asics).
>>>  * Returns the value of the fence read from memory.
>>>  */
>>>-static u32 amdgpu_fence_read(struct amdgpu_ring *ring)
>>>+u32 amdgpu_fence_read(struct amdgpu_ring *ring)
>>> {
>>> 	struct amdgpu_fence_driver *drv = &ring->fence_drv;
>>> 	u32 seq = 0;
>>>diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
>>>index 4f6c68f..46fa88c 100644
>>>--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
>>>+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
>>>@@ -186,6 +186,7 @@ int amdgpu_gfx_kiq_init_ring(struct amdgpu_device *adev,
>>> 	int r = 0;
>>> 
>>> 	mutex_init(&kiq->ring_mutex);
>>>+	spin_lock_init(&kiq->ring_lock);
>>> 
>>> 	r = amdgpu_wb_get(adev, &adev->virt.reg_val_offs);
>>> 	if (r)
>>>diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>>>index af8e544..a4fa923 100644
>>>--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>>>+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>>>@@ -89,6 +89,7 @@ int amdgpu_fence_driver_start_ring(struct amdgpu_ring *ring,  void amdgpu_fence_driver_suspend(struct amdgpu_device *adev);  void amdgpu_fence_driver_resume(struct amdgpu_device *adev);  int amdgpu_fence_emit(struct amdgpu_ring *ring, struct dma_fence **fence);
>>>+u32 amdgpu_fence_read(struct amdgpu_ring *ring);
>>> void amdgpu_fence_process(struct amdgpu_ring *ring);  int amdgpu_fence_wait_empty(struct amdgpu_ring *ring);  unsigned amdgpu_fence_count_emitted(struct amdgpu_ring *ring); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
>>>index ab05121..9757df1 100644
>>>--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
>>>+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
>>>@@ -22,7 +22,7 @@
>>>  */
>>> 
>>> #include "amdgpu.h"
>>>-#define MAX_KIQ_REG_WAIT	100000
>>>+#define MAX_KIQ_REG_WAIT	100000000 /* in usecs */
>>> 
>>> int amdgpu_allocate_static_csa(struct amdgpu_device *adev)  { @@ -113,28 +113,32 @@ void amdgpu_virt_init_setting(struct amdgpu_device *adev)
>>> 
>>> uint32_t amdgpu_virt_kiq_rreg(struct amdgpu_device *adev, uint32_t reg)  {
>>>-	signed long r;
>>>-	uint32_t val;
>>>-	struct dma_fence *f;
>>>+	signed long timeout = usecs_to_jiffies(MAX_KIQ_REG_WAIT);
>>>+	unsigned long flags;
>>>+	uint32_t val, seq, wait_seq;
>>> 	struct amdgpu_kiq *kiq = &adev->gfx.kiq;
>>> 	struct amdgpu_ring *ring = &kiq->ring;
>>> 
>>> 	BUG_ON(!ring->funcs->emit_rreg);
>>> 
>>>-	mutex_lock(&kiq->ring_mutex);
>>>+	spin_lock_irqsave(&kiq->ring_lock, flags);
>>> 	amdgpu_ring_alloc(ring, 32);
>>> 	amdgpu_ring_emit_rreg(ring, reg);
>>>-	amdgpu_fence_emit(ring, &f);
>>>+	wait_seq = ++ring->fence_drv.sync_seq;
>>>+	amdgpu_ring_emit_fence(ring, ring->fence_drv.gpu_addr,
>>>+			       wait_seq, AMDGPU_FENCE_FLAG_INT);
>>> 	amdgpu_ring_commit(ring);
>>>-	mutex_unlock(&kiq->ring_mutex);
>>>-
>>>-	r = dma_fence_wait_timeout(f, false, msecs_to_jiffies(MAX_KIQ_REG_WAIT));
>>>-	dma_fence_put(f);
>>>-	if (r < 1) {
>>>-		DRM_ERROR("wait for kiq fence error: %ld.\n", r);
>>>+	spin_unlock_irqrestore(&kiq->ring_lock, flags);
>>>+	do {
>>>+		seq = amdgpu_fence_read(ring);
>>>+		udelay(5);
>>>+		timeout -= 5;
>>>+	} while (seq < wait_seq && timeout > 0);
>>>+
>>>+	if (timeout < 1) {
>>>+		DRM_ERROR("wait for kiq fence error: %ld\n", timeout);
>>> 		return ~0;
>>> 	}
>>>-
>>> 	val = adev->wb.wb[adev->virt.reg_val_offs];
>>> 
>>> 	return val;
>>>@@ -142,24 +146,33 @@ uint32_t amdgpu_virt_kiq_rreg(struct amdgpu_device *adev, uint32_t reg)
>>> 
>>> void amdgpu_virt_kiq_wreg(struct amdgpu_device *adev, uint32_t reg, uint32_t v)  {
>>>-	signed long r;
>>>+	signed long timeout = usecs_to_jiffies(MAX_KIQ_REG_WAIT);
>>>+	unsigned long flags;
>>>+	uint32_t seq, wait_seq;
>>> 	struct dma_fence *f;
>>> 	struct amdgpu_kiq *kiq = &adev->gfx.kiq;
>>> 	struct amdgpu_ring *ring = &kiq->ring;
>>> 
>>> 	BUG_ON(!ring->funcs->emit_wreg);
>>> 
>>>-	mutex_lock(&kiq->ring_mutex);
>>>+	spin_lock_irqsave(&kiq->ring_lock, flags);
>>> 	amdgpu_ring_alloc(ring, 32);
>>> 	amdgpu_ring_emit_wreg(ring, reg, v);
>>>-	amdgpu_fence_emit(ring, &f);
>>>+	/* amdgpu_fence_emit(ring, &f); */
>>>+	wait_seq = ++ring->fence_drv.sync_seq;
>>>+	amdgpu_ring_emit_fence(ring, ring->fence_drv.gpu_addr,
>>>+			       wait_seq, AMDGPU_FENCE_FLAG_INT);
>>> 	amdgpu_ring_commit(ring);
>>>-	mutex_unlock(&kiq->ring_mutex);
>>>+	spin_unlock_irqrestore(&kiq->ring_lock, flags);
>>>+
>>>+	do {
>>>+		seq = amdgpu_fence_read(ring);
>>>+		udelay(5);
>>>+		timeout -= 5;
>>>+	} while (seq < wait_seq && timeout > 0);
>>> 
>>>-	r = dma_fence_wait_timeout(f, false, msecs_to_jiffies(MAX_KIQ_REG_WAIT));
>>>-	if (r < 1)
>>>-		DRM_ERROR("wait for kiq fence error: %ld.\n", r);
>>>-	dma_fence_put(f);
>>>+	if (timeout < 1)
>>>+		DRM_ERROR("wait for kiq fence error: %ld\n", timeout);
>>> }
>>> 
>>> /**
>>>--
>>>2.9.5
>>>
>>>_______________________________________________
>>>amd-gfx mailing list
>>>amd-gfx@lists.freedesktop.org
>>>https://lists.freedesktop.org/mailman/listinfo/amd-gfx
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 32+ messages in thread

end of thread, other threads:[~2017-10-16 10:04 UTC | newest]

Thread overview: 32+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-10-13  8:26 Make staging driver stable for SRIOV VF (1) Pixel Ding
     [not found] ` <1507883180-5626-1-git-send-email-Pixel.Ding-5C7GfCeVMHo@public.gmane.org>
2017-10-13  8:26   ` [PATCH 1/4] drm/amdgpu: always consider virualised device for checking post Pixel Ding
2017-10-13  8:26   ` [PATCH 2/4] drm/amdgpu: workaround for VM fault caused by SDMA set_wptr Pixel Ding
     [not found]     ` <1507883180-5626-3-git-send-email-Pixel.Ding-5C7GfCeVMHo@public.gmane.org>
2017-10-13  9:17       ` Christian König
     [not found]         ` <536fdd74-43d8-1aaf-ca7c-b3d519b8486c-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2017-10-13  9:21           ` Ding, Pixel
2017-10-13  8:26   ` [PATCH 3/4] drm/amdgpu: report preemption fence via amdgpu_fence_info Pixel Ding
     [not found]     ` <1507883180-5626-4-git-send-email-Pixel.Ding-5C7GfCeVMHo@public.gmane.org>
2017-10-13  8:34       ` Christian König
     [not found]         ` <7be3d78b-7e68-9f1e-b72f-41d1f0a504c0-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2017-10-13  8:41           ` Ding, Pixel
     [not found]             ` <6CF4468F-BB11-4E36-91AB-756A6F4B61B3-5C7GfCeVMHo@public.gmane.org>
2017-10-13  8:51               ` Christian König
     [not found]                 ` <95665da9-4204-739c-8b69-ceec64958be0-5C7GfCeVMHo@public.gmane.org>
2017-10-13  9:03                   ` Ding, Pixel
     [not found]                     ` <CDFBCA9E-4A71-4E40-A41B-1986811903C6-5C7GfCeVMHo@public.gmane.org>
2017-10-13  9:09                       ` Ding, Pixel
2017-10-13  9:10                       ` Christian König
     [not found]                         ` <75ae098d-234b-fe3a-dfea-18de99b2aa3c-5C7GfCeVMHo@public.gmane.org>
2017-10-13  9:18                           ` Ding, Pixel
     [not found]                             ` <FE29B287-93F7-427A-8A3E-8DEE50BC888C-5C7GfCeVMHo@public.gmane.org>
2017-10-13  9:19                               ` Liu, Monk
     [not found]                                 ` <BLUPR12MB04491225CA802A960F4C752784480-7LeqcoF/hwpTIQvHjXdJlwdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
2017-10-13  9:24                                   ` Ding, Pixel
2017-10-13  9:25                               ` Liu, Monk
2017-10-13  9:26                           ` Liu, Monk
2017-10-13  9:28                           ` Liu, Monk
     [not found]                             ` <BLUPR12MB0449C09F747B3C0294345DC484480-7LeqcoF/hwpTIQvHjXdJlwdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
2017-10-13  9:36                               ` Ding, Pixel
     [not found]                                 ` <14A1E84C-7C37-4901-AAAC-7082CD0BECD2-5C7GfCeVMHo@public.gmane.org>
2017-10-13 10:14                                   ` Christian König
     [not found]                                     ` <930af7f8-9684-8dc6-cdeb-e5ffff5f811d-5C7GfCeVMHo@public.gmane.org>
2017-10-13 11:14                                       ` Liu, Monk
2017-10-13  8:26   ` [PATCH 4/4] drm/amdgpu: busywait KIQ register accessing Pixel Ding
     [not found]     ` <1507883180-5626-5-git-send-email-Pixel.Ding-5C7GfCeVMHo@public.gmane.org>
2017-10-13  9:16       ` Christian König
     [not found]         ` <d800f449-8155-34a5-b9c3-c17ba0114ca1-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2017-10-13  9:35           ` Ding, Pixel
     [not found]             ` <67E0D85B-6C27-4481-BFB9-6DF9E66A7E03-5C7GfCeVMHo@public.gmane.org>
2017-10-13 10:21               ` Christian König
2017-10-13  9:34       ` Liu, Monk
     [not found]         ` <BLUPR12MB0449383454665A6D4916951884480-7LeqcoF/hwpTIQvHjXdJlwdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
2017-10-13  9:38           ` Ding, Pixel
     [not found]             ` <ADFC5FB7-0904-434D-A950-A699B98B2921-5C7GfCeVMHo@public.gmane.org>
2017-10-13 10:02               ` Liu, Monk
2017-10-13 10:04               ` Liu, Monk
     [not found]                 ` <BLUPR12MB0449E6C72D64821D235E3F1184480-7LeqcoF/hwpTIQvHjXdJlwdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
2017-10-16  1:25                   ` Ding, Pixel
     [not found]                     ` <FCE99400-0F48-4FA9-B714-D6B318718975-5C7GfCeVMHo@public.gmane.org>
2017-10-16  8:55                       ` Ding, Pixel
     [not found]                         ` <74F3DEB1-590C-4E6A-9B56-94D19EDC6F47-5C7GfCeVMHo@public.gmane.org>
2017-10-16 10:04                           ` Liu, Monk

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.