From: YiPeng Chai <YiPeng.Chai@amd.com>
To: <amd-gfx@lists.freedesktop.org>
Cc: <yipechai@amd.com>, <Hawking.Zhang@amd.com>, <Tao.Zhou1@amd.com>,
<Candice.Li@amd.com>, <KevinYang.Wang@amd.com>,
<Stanley.Yang@amd.com>, YiPeng Chai <YiPeng.Chai@amd.com>
Subject: [PATCH 09/15] drm/amdgpu: add condition check for amdgpu_umc_fill_error_record
Date: Thu, 18 Apr 2024 10:58:30 +0800
Message-ID: <20240418025836.170106-9-YiPeng.Chai@amd.com>
In-Reply-To: <20240418025836.170106-1-YiPeng.Chai@amd.com>
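Add condition checks to amdgpu_umc_fill_error_record(). The function now
returns -EINVAL when err_data or err_data->err_addr is NULL, or when
err_addr_cnt has reached err_addr_len. err_addr_len is a new field in
struct ras_err_data, set to adev->umc.max_ras_err_cnt_per_query where the
err_addr buffer is set up. The return type changes from void to int
accordingly.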
Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h | 1 +
drivers/gpu/drm/amd/amdgpu/amdgpu_umc.c | 20 +++++++++++++++++---
drivers/gpu/drm/amd/amdgpu/amdgpu_umc.h | 2 +-
3 files changed, 19 insertions(+), 4 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h
index cb5a0f31d201..c8980d5f6540 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h
@@ -579,6 +579,7 @@ struct ras_err_data {
unsigned long de_count;
unsigned long err_addr_cnt;
struct eeprom_table_record *err_addr;
+ unsigned long err_addr_len;
u32 err_list_count;
struct list_head err_node_list;
};
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_umc.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_umc.c
index 2bd88218c20e..dcda3d24bee3 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_umc.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_umc.c
@@ -66,6 +66,8 @@ int amdgpu_umc_page_retirement_mca(struct amdgpu_device *adev,
goto out_fini_err_data;
}
+ err_data.err_addr_len = adev->umc.max_ras_err_cnt_per_query;
+
/*
* Translate UMC channel address to Physical address
*/
@@ -121,6 +123,8 @@ void amdgpu_umc_handle_bad_pages(struct amdgpu_device *adev,
if(!err_data->err_addr)
dev_warn(adev->dev, "Failed to alloc memory for "
"umc error address record!\n");
+ else
+ err_data->err_addr_len = adev->umc.max_ras_err_cnt_per_query;
/* umc query_ras_error_address is also responsible for clearing
* error status
@@ -146,6 +150,8 @@ void amdgpu_umc_handle_bad_pages(struct amdgpu_device *adev,
if(!err_data->err_addr)
dev_warn(adev->dev, "Failed to alloc memory for "
"umc error address record!\n");
+ else
+ err_data->err_addr_len = adev->umc.max_ras_err_cnt_per_query;
/* umc query_ras_error_address is also responsible for clearing
* error status
@@ -389,14 +395,20 @@ int amdgpu_umc_process_ecc_irq(struct amdgpu_device *adev,
return 0;
}
-void amdgpu_umc_fill_error_record(struct ras_err_data *err_data,
+int amdgpu_umc_fill_error_record(struct ras_err_data *err_data,
uint64_t err_addr,
uint64_t retired_page,
uint32_t channel_index,
uint32_t umc_inst)
{
- struct eeprom_table_record *err_rec =
- &err_data->err_addr[err_data->err_addr_cnt];
+ struct eeprom_table_record *err_rec;
+
+ if (!err_data ||
+ !err_data->err_addr ||
+ (err_data->err_addr_cnt >= err_data->err_addr_len))
+ return -EINVAL;
+
+ err_rec = &err_data->err_addr[err_data->err_addr_cnt];
err_rec->address = err_addr;
/* page frame address is saved */
@@ -408,6 +420,8 @@ void amdgpu_umc_fill_error_record(struct ras_err_data *err_data,
err_rec->mcumc_id = umc_inst;
err_data->err_addr_cnt++;
+
+ return 0;
}
int amdgpu_umc_loop_channels(struct amdgpu_device *adev,
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_umc.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_umc.h
index 2d08d076f7c9..9e77e6d48e3b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_umc.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_umc.h
@@ -109,7 +109,7 @@ int amdgpu_umc_poison_handler(struct amdgpu_device *adev,
int amdgpu_umc_process_ecc_irq(struct amdgpu_device *adev,
struct amdgpu_irq_src *source,
struct amdgpu_iv_entry *entry);
-void amdgpu_umc_fill_error_record(struct ras_err_data *err_data,
+int amdgpu_umc_fill_error_record(struct ras_err_data *err_data,
uint64_t err_addr,
uint64_t retired_page,
uint32_t channel_index,
--
2.34.1
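
The check added to amdgpu_umc_fill_error_record() is a bounds-checked append: refuse to write when the buffer is missing or already holds err_addr_len records. Below is a minimal user-space sketch of that pattern, not the driver code. The names demo_record, demo_err_data, demo_fill_error_record and demo_fill_many are hypothetical, and the record fields are a trimmed stand-in for struct eeprom_table_record.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, trimmed stand-in for struct eeprom_table_record. */
struct demo_record {
	uint64_t address;
	uint64_t retired_page;
	uint32_t channel;
	uint32_t mcumc_id;
};

/* Hypothetical stand-in for the err_addr* fields of struct ras_err_data. */
struct demo_err_data {
	struct demo_record *err_addr;	/* record buffer, may be NULL */
	unsigned long err_addr_cnt;	/* records filled so far */
	unsigned long err_addr_len;	/* capacity of err_addr, in records */
};

/* Append one record; return -EINVAL if there is no buffer or it is full. */
int demo_fill_error_record(struct demo_err_data *err_data,
			   uint64_t err_addr, uint64_t retired_page,
			   uint32_t channel_index, uint32_t umc_inst)
{
	struct demo_record *err_rec;

	if (!err_data || !err_data->err_addr ||
	    err_data->err_addr_cnt >= err_data->err_addr_len)
		return -EINVAL;

	err_rec = &err_data->err_addr[err_data->err_addr_cnt];
	err_rec->address = err_addr;
	err_rec->retired_page = retired_page;
	err_rec->channel = channel_index;
	err_rec->mcumc_id = umc_inst;
	err_data->err_addr_cnt++;

	return 0;
}

/*
 * Try to append `attempts` records into a buffer of `capacity` records
 * and return how many were accepted. The capacity is recorded only when
 * the allocation succeeded, mirroring the else branches in the patch.
 */
unsigned long demo_fill_many(unsigned long capacity, unsigned long attempts)
{
	struct demo_err_data data = { 0 };
	unsigned long i, accepted = 0;

	data.err_addr = calloc(capacity, sizeof(*data.err_addr));
	if (data.err_addr)
		data.err_addr_len = capacity;

	for (i = 0; i < attempts; i++) {
		/* A non-zero return means the buffer is missing or full. */
		if (demo_fill_error_record(&data, 0x1000 * i, i, 0, 0))
			break;
		accepted++;
	}

	free(data.err_addr);
	return accepted;
}
```

In this sketch, demo_fill_many(4, 10) accepts only four records because the fifth call fails the capacity check, and passing a NULL err_data returns -EINVAL instead of dereferencing it. Whether callers in the driver should stop, log, or continue on a non-zero return is not settled by this patch and is left open here.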