From: "Zhang, Hawking" <Hawking.Zhang@amd.com>
To: "amd-gfx@lists.freedesktop.org" <amd-gfx@lists.freedesktop.org>
Cc: "Deucher, Alexander" <Alexander.Deucher@amd.com>,
	"Ma, Le" <Le.Ma@amd.com>,
	"Zhou1, Tao" <Tao.Zhou1@amd.com>,
	"Li, Dennis" <Dennis.Li@amd.com>,
	"Chen, Guchun" <Guchun.Chen@amd.com>
Subject: RE: [PATCH 01/10] drm/amdgpu: remove ras global recovery handling from ras_controller_int handler
Date: Thu, 28 Nov 2019 05:27:39 +0000
Message-ID: <DM5PR12MB1418A6DF4E81A0C5582330F5FC470@DM5PR12MB1418.namprd12.prod.outlook.com>
In-Reply-To: <1574846129-4826-1-git-send-email-le.ma@amd.com>

[AMD Official Use Only - Internal Distribution Only]

With the v2 versions of patches #6 and #7, and the fix in patch #5 to enable the doorbell interrupt after BACO exit, the series is

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>

Regards,
Hawking
-----Original Message-----
From: Le Ma <le.ma@amd.com> 
Sent: November 27, 2019 17:15
To: amd-gfx@lists.freedesktop.org
Cc: Zhang, Hawking <Hawking.Zhang@amd.com>; Chen, Guchun <Guchun.Chen@amd.com>; Zhou1, Tao <Tao.Zhou1@amd.com>; Li, Dennis <Dennis.Li@amd.com>; Deucher, Alexander <Alexander.Deucher@amd.com>; Ma, Le <Le.Ma@amd.com>
Subject: [PATCH 01/10] drm/amdgpu: remove ras global recovery handling from ras_controller_int handler

From: Le Ma <Le.Ma@amd.com>

v2: add a notification when the RAS controller interrupt is generated

Change-Id: Ic03e42e9d1c4dab1fa7f4817c191a16e485b48a9
Signed-off-by: Le Ma <Le.Ma@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/nbio_v7_4.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/nbio_v7_4.c b/drivers/gpu/drm/amd/amdgpu/nbio_v7_4.c
index 0db458f..25231d6 100644
--- a/drivers/gpu/drm/amd/amdgpu/nbio_v7_4.c
+++ b/drivers/gpu/drm/amd/amdgpu/nbio_v7_4.c
@@ -324,7 +324,12 @@ static void nbio_v7_4_handle_ras_controller_intr_no_bifring(struct amdgpu_device
 						RAS_CNTLR_INTERRUPT_CLEAR, 1);
 		WREG32_SOC15(NBIO, 0, mmBIF_DOORBELL_INT_CNTL, bif_doorbell_intr_cntl);
 
-		amdgpu_ras_global_ras_isr(adev);
+		DRM_WARN("RAS controller interrupt triggered by NBIF error\n");
+
+		/* ras_controller_int is dedicated for nbif ras error,
+		 * not the global interrupt for sync flood
+		 */
+		amdgpu_ras_reset_gpu(adev, true);
 	}
 }
 
-- 
2.7.4
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
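
For readers outside the driver, the control flow this patch ends up with can be sketched as a small standalone C model. This is only an illustration under stated assumptions, not the amdgpu code: the struct, the bit positions, and every fake_* name are hypothetical, the status check before the acknowledge is assumed (the hunk above only shows the CLEAR write), and the second argument of amdgpu_ras_reset_gpu() is left out because the patch does not explain it. The point it tries to show is that the NBIF-dedicated interrupt is acknowledged, logged, and then answered with a reset request for that device, rather than being routed to the global sync-flood handler.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical bit positions; the real field layout is defined in the
 * NBIO register headers and is not reproduced here. */
#define RAS_CNTLR_INTERRUPT_STATUS (1u << 0)
#define RAS_CNTLR_INTERRUPT_CLEAR  (1u << 1)

/* Hypothetical stand-in for the device state touched by the handler. */
struct fake_device {
	uint32_t doorbell_int_cntl; /* models mmBIF_DOORBELL_INT_CNTL */
	int local_resets;           /* number of reset requests for this device */
};

/* Models the reset request made by the patched handler. */
static void fake_ras_reset_gpu(struct fake_device *dev)
{
	dev->local_resets++;
}

/* Returns true if a RAS controller interrupt was pending and handled. */
static bool fake_handle_ras_controller_intr(struct fake_device *dev)
{
	uint32_t cntl = dev->doorbell_int_cntl;

	if (!(cntl & RAS_CNTLR_INTERRUPT_STATUS))
		return false; /* nothing pending (assumed check) */

	/* Acknowledge: set the CLEAR bit, then model the hardware
	 * dropping both STATUS and CLEAR once the write lands. */
	cntl |= RAS_CNTLR_INTERRUPT_CLEAR;
	dev->doorbell_int_cntl =
		cntl & ~(RAS_CNTLR_INTERRUPT_STATUS | RAS_CNTLR_INTERRUPT_CLEAR);

	fprintf(stderr, "RAS controller interrupt triggered by NBIF error\n");

	/* Dedicated NBIF RAS error: request a reset for this device
	 * instead of calling a global sync-flood ISR. */
	fake_ras_reset_gpu(dev);
	return true;
}
```

In this model, calling fake_handle_ras_controller_intr() on a device whose status bit is set requests exactly one reset and leaves the register cleared; a second call finds nothing pending and does nothing.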


