* [PATCH 00/40] I2C fixes
@ 2021-06-08 21:39 Luben Tuikov
  2021-06-08 21:39 ` [PATCH 01/40] drm/amdgpu: add a mutex for the smu11 i2c bus (v2) Luben Tuikov
                   ` (39 more replies)
  0 siblings, 40 replies; 74+ messages in thread
From: Luben Tuikov @ 2021-06-08 21:39 UTC
  To: amd-gfx
  Cc: Andrey Grodzovsky, Xinhui Pan, Guchun Chen, Lijo Lazar,
	Luben Tuikov, Stanley Yang, Alexander Deucher, John Clements,
	Jean Delvare, Hawking Zhang

I2C fixes from various people. Some RAS touch-ups too.

A rebased tree can also be found here: 
https://gitlab.freedesktop.org/ltuikov/linux/-/commits/i2c-rework-luben

Aaron Rice (1):
  drm/amdgpu: rework smu11 i2c for generic operation

Alex Deucher (10):
  drm/amdgpu: add a mutex for the smu11 i2c bus (v2)
  drm/amdgpu/pm: rework i2c xfers on sienna cichlid (v3)
  drm/amdgpu/pm: rework i2c xfers on arcturus (v3)
  drm/amdgpu/pm: add smu i2c implementation for navi1x (v3)
  drm/amdgpu: add new helper for handling EEPROM i2c transfers
  drm/amdgpu/ras: switch ras eeprom handling to use generic helper
  drm/amdgpu/ras: switch fru eeprom handling to use generic helper (v2)
  drm/amdgpu: i2c subsystem uses 7 bit addresses
  drm/amdgpu: add I2C_CLASS_HWMON to SMU i2c buses
  drm/amdgpu: only set restart on first cmd of the smu i2c transaction

Andrey Grodzovsky (6):
  drm/amdgpu: Remember to wait 10ms for write buffer flush v2
  drm/amdgpu: Add RESTART handling also to smu_v11_0_i2c (VG20)
  drm/amdgpu: Drop i > 0 restriction for issuing RESTART
  drm/amdgpu: Send STOP for the last byte of msg only
  drm/amd/pm: SMU I2C: Return number of messages processed
  drm/amdgpu/pm: ADD I2C quirk adapter table

Luben Tuikov (23):
  drm/amdgpu: Fix Vega20 I2C to be agnostic (v2)
  drm/amdgpu: Fixes to the AMDGPU EEPROM driver
  drm/amdgpu: EEPROM respects I2C quirks
  drm/amdgpu: I2C EEPROM full memory addressing
  drm/amdgpu: RAS and FRU now use 19-bit I2C address
  drm/amdgpu: Fix wrap-around bugs in RAS
  drm/amdgpu: I2C class is HWMON
  drm/amdgpu: RAS: EEPROM --> RAS
  drm/amdgpu: Rename misspelled function
  drm/amdgpu: RAS xfer to read/write
  drm/amdgpu: EEPROM: add explicit read and write
  drm/amd/pm: Extend the I2C quirk table
  drm/amd/pm: Simplify managed I2C transfer functions
  drm/amdgpu: Fix width of I2C address
  drm/amdgpu: Return result fix in RAS
  drm/amd/pm: Fix a bug in i2c_xfer
  drm/amdgpu: Fix amdgpu_ras_eeprom_init()
  drm/amdgpu: Simplify RAS EEPROM checksum calculations
  drm/amdgpu: Use explicit cardinality for clarity
  drm/amdgpu: Optimizations to EEPROM RAS table I/O
  drm/amdgpu: RAS EEPROM table is now in debugfs
  drm/amdgpu: Fix koops when accessing RAS EEPROM
  drm/amdgpu: Use a single loop

 drivers/gpu/drm/amd/amdgpu/Makefile           |    3 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c       |    9 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.c    |  239 ++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.h    |   37 +
 .../gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c    |   32 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c       |  114 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h       |    1 +
 .../gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c    | 1253 +++++++++++------
 .../gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h    |   68 +-
 drivers/gpu/drm/amd/amdgpu/smu_v11_0_i2c.c    |  239 ++--
 drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h       |    1 +
 .../gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c |  238 +---
 .../gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c   |  118 ++
 .../amd/pm/swsmu/smu11/sienna_cichlid_ppt.c   |  241 +---
 14 files changed, 1620 insertions(+), 973 deletions(-)
 create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.c
 create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.h

Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Cc: Guchun Chen <guchun.chen@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Cc: Jean Delvare <jdelvare@suse.de>
Cc: John Clements <john.clements@amd.com>
Cc: Lijo Lazar <Lijo.Lazar@amd.com>
Cc: Stanley Yang <Stanley.Yang@amd.com>
Cc: Xinhui Pan <xinhui.pan@amd.com>

base-commit: 84c09f365aba25d3d1d7b36791987ce088294de0
-- 
2.32.0

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

* [PATCH 01/40] drm/amdgpu: add a mutex for the smu11 i2c bus (v2)
  2021-06-08 21:39 [PATCH 00/40] I2C fixes Luben Tuikov
@ 2021-06-08 21:39 ` Luben Tuikov
  2021-06-08 21:39 ` [PATCH 02/40] drm/amdgpu/pm: rework i2c xfers on sienna cichlid (v3) Luben Tuikov
                   ` (38 subsequent siblings)
  39 siblings, 0 replies; 74+ messages in thread
From: Luben Tuikov @ 2021-06-08 21:39 UTC
  To: amd-gfx; +Cc: Alex Deucher, Luben Tuikov

From: Alex Deucher <alexander.deucher@amd.com>

So we lock software as well as hardware access to the bus.

v2: fix mutex handling.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/smu_v11_0_i2c.c | 19 +++++++++----------
 drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h    |  1 +
 2 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/smu_v11_0_i2c.c b/drivers/gpu/drm/amd/amdgpu/smu_v11_0_i2c.c
index 5c7d769aee3fba..1d8f6d5180e099 100644
--- a/drivers/gpu/drm/amd/amdgpu/smu_v11_0_i2c.c
+++ b/drivers/gpu/drm/amd/amdgpu/smu_v11_0_i2c.c
@@ -584,12 +584,11 @@ static void lock_bus(struct i2c_adapter *i2c, unsigned int flags)
 {
 	struct amdgpu_device *adev = to_amdgpu_device(i2c);
 
-	if (!smu_v11_0_i2c_bus_lock(i2c)) {
+	mutex_lock(&adev->pm.smu_i2c_mutex);
+	if (!smu_v11_0_i2c_bus_lock(i2c))
 		DRM_ERROR("Failed to lock the bus from SMU");
-		return;
-	}
-
-	adev->pm.bus_locked = true;
+	else
+		adev->pm.bus_locked = true;
 }
 
 static int trylock_bus(struct i2c_adapter *i2c, unsigned int flags)
@@ -602,12 +601,11 @@ static void unlock_bus(struct i2c_adapter *i2c, unsigned int flags)
 {
 	struct amdgpu_device *adev = to_amdgpu_device(i2c);
 
-	if (!smu_v11_0_i2c_bus_unlock(i2c)) {
+	if (!smu_v11_0_i2c_bus_unlock(i2c))
 		DRM_ERROR("Failed to unlock the bus from SMU");
-		return;
-	}
-
-	adev->pm.bus_locked = false;
+	else
+		adev->pm.bus_locked = false;
+	mutex_unlock(&adev->pm.smu_i2c_mutex);
 }
 
 static const struct i2c_lock_operations smu_v11_0_i2c_i2c_lock_ops = {
@@ -665,6 +663,7 @@ int smu_v11_0_i2c_control_init(struct i2c_adapter *control)
 	struct amdgpu_device *adev = to_amdgpu_device(control);
 	int res;
 
+	mutex_init(&adev->pm.smu_i2c_mutex);
 	control->owner = THIS_MODULE;
 	control->class = I2C_CLASS_SPD;
 	control->dev.parent = &adev->pdev->dev;
diff --git a/drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h b/drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h
index f6e0e7d8a00771..d03e6fa2bf1adf 100644
--- a/drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h
+++ b/drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h
@@ -450,6 +450,7 @@ struct amdgpu_pm {
 
 	/* Used for I2C access to various EEPROMs on relevant ASICs */
 	struct i2c_adapter smu_i2c;
+	struct mutex		smu_i2c_mutex;
 	struct list_head	pm_attr_list;
 };
 
-- 
2.32.0
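
The locking order this patch introduces can be illustrated with a small userspace sketch. It is not the driver code: `struct fake_bus`, `fake_hw_lock()`, `fake_hw_unlock()` and the `hw_lock_fails` knob are hypothetical stand-ins for `adev->pm`, `smu_v11_0_i2c_bus_lock()` and `smu_v11_0_i2c_bus_unlock()`, and a pthread mutex plays the role of `smu_i2c_mutex`. The point it shows is that the software mutex is taken before the hardware lock and released after it, and that the mutex is released even when the hardware lock attempt failed.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for the state the patch touches in adev->pm. */
struct fake_bus {
	pthread_mutex_t sw_lock; /* plays the role of smu_i2c_mutex */
	bool hw_owned;           /* plays the role of the SMU-side bus lock */
	bool bus_locked;         /* mirrors adev->pm.bus_locked */
	bool hw_lock_fails;      /* knob: force the hardware lock to fail */
};

static void fake_bus_init(struct fake_bus *b)
{
	pthread_mutex_init(&b->sw_lock, NULL);
	b->hw_owned = false;
	b->bus_locked = false;
	b->hw_lock_fails = false;
}

/* Hypothetical hardware lock; returns true on success. */
static bool fake_hw_lock(struct fake_bus *b)
{
	if (b->hw_lock_fails)
		return false;
	b->hw_owned = true;
	return true;
}

/* Hypothetical hardware unlock; always succeeds in this sketch. */
static bool fake_hw_unlock(struct fake_bus *b)
{
	b->hw_owned = false;
	return true;
}

/* Software lock first, then the hardware lock. */
static void fake_lock_bus(struct fake_bus *b)
{
	pthread_mutex_lock(&b->sw_lock);
	if (fake_hw_lock(b))
		b->bus_locked = true;
}

/* Hardware unlock first, then the software lock, on every path. */
static void fake_unlock_bus(struct fake_bus *b)
{
	if (fake_hw_unlock(b))
		b->bus_locked = false;
	pthread_mutex_unlock(&b->sw_lock);
}
```

Keeping the mutex release unconditional is what makes the lock and unlock callbacks pair up even on the error paths, which is presumably what "fix mutex handling" in v2 refers to.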


* [PATCH 02/40] drm/amdgpu/pm: rework i2c xfers on sienna cichlid (v3)
  2021-06-08 21:39 [PATCH 00/40] I2C fixes Luben Tuikov
  2021-06-08 21:39 ` [PATCH 01/40] drm/amdgpu: add a mutex for the smu11 i2c bus (v2) Luben Tuikov
@ 2021-06-08 21:39 ` Luben Tuikov
  2021-06-08 21:39 ` [PATCH 03/40] drm/amdgpu/pm: rework i2c xfers on arcturus (v3) Luben Tuikov
                   ` (37 subsequent siblings)
  39 siblings, 0 replies; 74+ messages in thread
From: Luben Tuikov @ 2021-06-08 21:39 UTC
  To: amd-gfx; +Cc: Alex Deucher, Luben Tuikov

From: Alex Deucher <alexander.deucher@amd.com>

Make it generic so we can support more than just EEPROMs.

v2: fix restart handling between transactions.
v3: handle 7 to 8 bit addr conversion

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
---
 .../amd/pm/swsmu/smu11/sienna_cichlid_ppt.c   | 229 +++++-------------
 1 file changed, 58 insertions(+), 171 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
index f01e919e1f8988..499e1309d0f796 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
@@ -3389,197 +3389,84 @@ static void sienna_cichlid_dump_pptable(struct smu_context *smu)
 	dev_info(smu->adev->dev, "MmHubPadding[7] = 0x%x\n", pptable->MmHubPadding[7]);
 }
 
-static void sienna_cichlid_fill_i2c_req(SwI2cRequest_t  *req, bool write,
-				  uint8_t address, uint32_t numbytes,
-				  uint8_t *data)
-{
-	int i;
-
-	req->I2CcontrollerPort = 1;
-	req->I2CSpeed = 2;
-	req->SlaveAddress = address;
-	req->NumCmds = numbytes;
-
-	for (i = 0; i < numbytes; i++) {
-		SwI2cCmd_t *cmd =  &req->SwI2cCmds[i];
-
-		/* First 2 bytes are always write for lower 2b EEPROM address */
-		if (i < 2)
-			cmd->CmdConfig = CMDCONFIG_READWRITE_MASK;
-		else
-			cmd->CmdConfig = write ? CMDCONFIG_READWRITE_MASK : 0;
-
-
-		/* Add RESTART for read  after address filled */
-		cmd->CmdConfig |= (i == 2 && !write) ? CMDCONFIG_RESTART_MASK : 0;
-
-		/* Add STOP in the end */
-		cmd->CmdConfig |= (i == (numbytes - 1)) ? CMDCONFIG_STOP_MASK : 0;
-
-		/* Fill with data regardless if read or write to simplify code */
-		cmd->ReadWriteData = data[i];
-	}
-}
-
-static int sienna_cichlid_i2c_read_data(struct i2c_adapter *control,
-					       uint8_t address,
-					       uint8_t *data,
-					       uint32_t numbytes)
+static int sienna_cichlid_i2c_xfer(struct i2c_adapter *i2c_adap,
+				   struct i2c_msg *msgs, int num)
 {
-	uint32_t  i, ret = 0;
-	SwI2cRequest_t req;
-	struct amdgpu_device *adev = to_amdgpu_device(control);
+	struct amdgpu_device *adev = to_amdgpu_device(i2c_adap);
 	struct smu_table_context *smu_table = &adev->smu.smu_table;
 	struct smu_table *table = &smu_table->driver_table;
-
-	if (numbytes > MAX_SW_I2C_COMMANDS) {
-		dev_err(adev->dev, "numbytes requested %d is over max allowed %d\n",
-			numbytes, MAX_SW_I2C_COMMANDS);
-		return -EINVAL;
+	SwI2cRequest_t *req, *res = (SwI2cRequest_t *)table->cpu_addr;
+	u16 bytes_to_transfer = 0, remaining_bytes, msg_bytes;
+	u16 available_bytes = MAX_SW_I2C_COMMANDS;
+	int i, j, r, c;
+	u8 slave;
+
+	/* only support a single slave addr per transaction */
+	slave = msgs[0].addr;
+	for (i = 0; i < num; i++) {
+		if (slave != msgs[i].addr)
+			return -EINVAL;
+		bytes_to_transfer += min(msgs[i].len, available_bytes);
+		available_bytes -= min(msgs[i].len, available_bytes);
 	}
 
-	memset(&req, 0, sizeof(req));
-	sienna_cichlid_fill_i2c_req(&req, false, address, numbytes, data);
-
-	mutex_lock(&adev->smu.mutex);
-	/* Now read data starting with that address */
-	ret = smu_cmn_update_table(&adev->smu, SMU_TABLE_I2C_COMMANDS, 0, &req,
-					true);
-	mutex_unlock(&adev->smu.mutex);
-
-	if (!ret) {
-		SwI2cRequest_t *res = (SwI2cRequest_t *)table->cpu_addr;
-
-		/* Assume SMU  fills res.SwI2cCmds[i].Data with read bytes */
-		for (i = 0; i < numbytes; i++)
-			data[i] = res->SwI2cCmds[i].ReadWriteData;
-
-		dev_dbg(adev->dev, "sienna_cichlid_i2c_read_data, address = %x, bytes = %d, data :",
-				  (uint16_t)address, numbytes);
+	req = kzalloc(sizeof(*req), GFP_KERNEL);
+	if (!req)
+		return -ENOMEM;
 
-		print_hex_dump(KERN_DEBUG, "data: ", DUMP_PREFIX_NONE,
-			       8, 1, data, numbytes, false);
-	} else
-		dev_err(adev->dev, "sienna_cichlid_i2c_read_data - error occurred :%x", ret);
+	req->I2CcontrollerPort = 1;
+	req->I2CSpeed = I2C_SPEED_FAST_400K;
+	req->SlaveAddress = slave << 1; /* 8 bit addresses */
+	req->NumCmds = bytes_to_transfer;
 
-	return ret;
-}
+	remaining_bytes = bytes_to_transfer;
+	c = 0;
+	for (i = 0; i < num; i++) {
+		struct i2c_msg *msg = &msgs[i];
 
-static int sienna_cichlid_i2c_write_data(struct i2c_adapter *control,
-						uint8_t address,
-						uint8_t *data,
-						uint32_t numbytes)
-{
-	uint32_t ret;
-	SwI2cRequest_t req;
-	struct amdgpu_device *adev = to_amdgpu_device(control);
+		msg_bytes = min(msg->len, remaining_bytes);
+		for (j = 0; j < msg_bytes; j++) {
+			SwI2cCmd_t *cmd = &req->SwI2cCmds[c++];
 
-	if (numbytes > MAX_SW_I2C_COMMANDS) {
-		dev_err(adev->dev, "numbytes requested %d is over max allowed %d\n",
-			numbytes, MAX_SW_I2C_COMMANDS);
-		return -EINVAL;
+			remaining_bytes--;
+			if (!(msg->flags & I2C_M_RD)) {
+				/* write */
+				cmd->CmdConfig |= CMDCONFIG_READWRITE_MASK;
+				cmd->ReadWriteData = msg->buf[j];
+			}
+			if ((msg->flags & I2C_M_STOP) ||
+			    (!remaining_bytes))
+				cmd->CmdConfig |= CMDCONFIG_STOP_MASK;
+			if ((i > 0) && !(msg->flags & I2C_M_NOSTART))
+				cmd->CmdConfig |= CMDCONFIG_RESTART_MASK;
+		}
 	}
-
-	memset(&req, 0, sizeof(req));
-	sienna_cichlid_fill_i2c_req(&req, true, address, numbytes, data);
-
 	mutex_lock(&adev->smu.mutex);
-	ret = smu_cmn_update_table(&adev->smu, SMU_TABLE_I2C_COMMANDS, 0, &req, true);
+	r = smu_cmn_update_table(&adev->smu, SMU_TABLE_I2C_COMMANDS, 0, req, true);
 	mutex_unlock(&adev->smu.mutex);
+	if (r)
+		goto fail;
 
-	if (!ret) {
-		dev_dbg(adev->dev, "sienna_cichlid_i2c_write(), address = %x, bytes = %d , data: ",
-					 (uint16_t)address, numbytes);
-
-		print_hex_dump(KERN_DEBUG, "data: ", DUMP_PREFIX_NONE,
-			       8, 1, data, numbytes, false);
-		/*
-		 * According to EEPROM spec there is a MAX of 10 ms required for
-		 * EEPROM to flush internal RX buffer after STOP was issued at the
-		 * end of write transaction. During this time the EEPROM will not be
-		 * responsive to any more commands - so wait a bit more.
-		 */
-		msleep(10);
-
-	} else
-		dev_err(adev->dev, "sienna_cichlid_i2c_write- error occurred :%x", ret);
-
-	return ret;
-}
-
-static int sienna_cichlid_i2c_xfer(struct i2c_adapter *i2c_adap,
-			      struct i2c_msg *msgs, int num)
-{
-	uint32_t  i, j, ret, data_size, data_chunk_size, next_eeprom_addr = 0;
-	uint8_t *data_ptr, data_chunk[MAX_SW_I2C_COMMANDS] = { 0 };
-
+	remaining_bytes = bytes_to_transfer;
+	c = 0;
 	for (i = 0; i < num; i++) {
-		/*
-		 * SMU interface allows at most MAX_SW_I2C_COMMANDS bytes of data at
-		 * once and hence the data needs to be spliced into chunks and sent each
-		 * chunk separately
-		 */
-		data_size = msgs[i].len - 2;
-		data_chunk_size = MAX_SW_I2C_COMMANDS - 2;
-		next_eeprom_addr = (msgs[i].buf[0] << 8 & 0xff00) | (msgs[i].buf[1] & 0xff);
-		data_ptr = msgs[i].buf + 2;
-
-		for (j = 0; j < data_size / data_chunk_size; j++) {
-			/* Insert the EEPROM dest addess, bits 0-15 */
-			data_chunk[0] = ((next_eeprom_addr >> 8) & 0xff);
-			data_chunk[1] = (next_eeprom_addr & 0xff);
-
-			if (msgs[i].flags & I2C_M_RD) {
-				ret = sienna_cichlid_i2c_read_data(i2c_adap,
-							     (uint8_t)msgs[i].addr,
-							     data_chunk, MAX_SW_I2C_COMMANDS);
-
-				memcpy(data_ptr, data_chunk + 2, data_chunk_size);
-			} else {
-
-				memcpy(data_chunk + 2, data_ptr, data_chunk_size);
-
-				ret = sienna_cichlid_i2c_write_data(i2c_adap,
-							      (uint8_t)msgs[i].addr,
-							      data_chunk, MAX_SW_I2C_COMMANDS);
-			}
-
-			if (ret) {
-				num = -EIO;
-				goto fail;
-			}
-
-			next_eeprom_addr += data_chunk_size;
-			data_ptr += data_chunk_size;
-		}
-
-		if (data_size % data_chunk_size) {
-			data_chunk[0] = ((next_eeprom_addr >> 8) & 0xff);
-			data_chunk[1] = (next_eeprom_addr & 0xff);
-
-			if (msgs[i].flags & I2C_M_RD) {
-				ret = sienna_cichlid_i2c_read_data(i2c_adap,
-							     (uint8_t)msgs[i].addr,
-							     data_chunk, (data_size % data_chunk_size) + 2);
-
-				memcpy(data_ptr, data_chunk + 2, data_size % data_chunk_size);
-			} else {
-				memcpy(data_chunk + 2, data_ptr, data_size % data_chunk_size);
+		struct i2c_msg *msg = &msgs[i];
 
-				ret = sienna_cichlid_i2c_write_data(i2c_adap,
-							      (uint8_t)msgs[i].addr,
-							      data_chunk, (data_size % data_chunk_size) + 2);
-			}
+		msg_bytes = min(msg->len, remaining_bytes);
+		for (j = 0; j < msg_bytes; j++) {
+			SwI2cCmd_t *cmd = &res->SwI2cCmds[c++];
 
-			if (ret) {
-				num = -EIO;
-				goto fail;
-			}
+			remaining_bytes--;
+			if (msg->flags & I2C_M_RD)
+				msg->buf[j] = cmd->ReadWriteData;
 		}
 	}
+	r = bytes_to_transfer;
 
 fail:
-	return num;
+	kfree(req);
+
+	return r;
 }
 
 static u32 sienna_cichlid_i2c_func(struct i2c_adapter *adap)
-- 
2.32.0
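
The packing step of the reworked xfer function, where each byte of each `i2c_msg` becomes one SMU command, can be sketched in userspace as follows. This is a simplified illustration under stated assumptions, not the driver's implementation: the `sketch_*` types, `SKETCH_MAX_CMDS` and the `SKETCH_CFG_*` bit values are hypothetical stand-ins for `i2c_msg`, `SwI2cCmd_t`, `MAX_SW_I2C_COMMANDS` and the `CMDCONFIG_*_MASK` definitions. It also places RESTART only on the first byte of each message after the first, and ignores `I2C_M_STOP` and `I2C_M_NOSTART`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_MAX_CMDS    8         /* hypothetical command limit */
#define SKETCH_RD          0x01      /* stands in for I2C_M_RD */
#define SKETCH_CFG_STOP    (1u << 0) /* hypothetical flag values */
#define SKETCH_CFG_RESTART (1u << 1)
#define SKETCH_CFG_WRITE   (1u << 2)

struct sketch_msg {
	uint16_t flags;
	uint16_t len;
	uint8_t *buf;
};

struct sketch_cmd {
	uint8_t data;
	uint8_t cfg;
};

/*
 * Pack up to SKETCH_MAX_CMDS bytes from msgs[] into cmds[], one command
 * per byte. Returns the number of commands filled in.
 */
static size_t sketch_pack(const struct sketch_msg *msgs, size_t num,
			  struct sketch_cmd *cmds)
{
	size_t c = 0;

	memset(cmds, 0, SKETCH_MAX_CMDS * sizeof(*cmds));
	for (size_t i = 0; i < num && c < SKETCH_MAX_CMDS; i++) {
		const struct sketch_msg *msg = &msgs[i];

		for (size_t j = 0; j < msg->len && c < SKETCH_MAX_CMDS; j++, c++) {
			struct sketch_cmd *cmd = &cmds[c];

			if (!(msg->flags & SKETCH_RD)) {
				/* Writes carry their payload in the command. */
				cmd->cfg |= SKETCH_CFG_WRITE;
				cmd->data = msg->buf[j];
			}
			/* A new message after the first begins with a RESTART. */
			if (i > 0 && j == 0)
				cmd->cfg |= SKETCH_CFG_RESTART;
		}
	}
	/* The last command of the whole transaction carries the STOP. */
	if (c > 0)
		cmds[c - 1].cfg |= SKETCH_CFG_STOP;
	return c;
}
```

A two-byte write followed by a three-byte read, the usual EEPROM random-read pattern, therefore yields five commands: two writes, a read flagged RESTART, a plain read, and a read flagged STOP.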


* [PATCH 03/40] drm/amdgpu/pm: rework i2c xfers on arcturus (v3)
  2021-06-08 21:39 [PATCH 00/40] I2C fixes Luben Tuikov
  2021-06-08 21:39 ` [PATCH 01/40] drm/amdgpu: add a mutex for the smu11 i2c bus (v2) Luben Tuikov
  2021-06-08 21:39 ` [PATCH 02/40] drm/amdgpu/pm: rework i2c xfers on sienna cichlid (v3) Luben Tuikov
@ 2021-06-08 21:39 ` Luben Tuikov
  2021-06-08 21:39 ` [PATCH 04/40] drm/amdgpu/pm: add smu i2c implementation for navi1x (v3) Luben Tuikov
                   ` (36 subsequent siblings)
  39 siblings, 0 replies; 74+ messages in thread
From: Luben Tuikov @ 2021-06-08 21:39 UTC
  To: amd-gfx; +Cc: Alex Deucher, Luben Tuikov

From: Alex Deucher <alexander.deucher@amd.com>

Make it generic so we can support more than just EEPROMs.

v2: fix restart handling between transactions.
v3: handle 7 to 8 bit addr conversion

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
---
 .../gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c | 229 +++++-------------
 1 file changed, 58 insertions(+), 171 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c
index 5959019f51ad66..e1f7607302ba6c 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c
@@ -1906,197 +1906,84 @@ static int arcturus_dpm_set_vcn_enable(struct smu_context *smu, bool enable)
 	return ret;
 }
 
-static void arcturus_fill_i2c_req(SwI2cRequest_t  *req, bool write,
-				  uint8_t address, uint32_t numbytes,
-				  uint8_t *data)
-{
-	int i;
-
-	req->I2CcontrollerPort = 0;
-	req->I2CSpeed = 2;
-	req->SlaveAddress = address;
-	req->NumCmds = numbytes;
-
-	for (i = 0; i < numbytes; i++) {
-		SwI2cCmd_t *cmd =  &req->SwI2cCmds[i];
-
-		/* First 2 bytes are always write for lower 2b EEPROM address */
-		if (i < 2)
-			cmd->Cmd = 1;
-		else
-			cmd->Cmd = write;
-
-
-		/* Add RESTART for read  after address filled */
-		cmd->CmdConfig |= (i == 2 && !write) ? CMDCONFIG_RESTART_MASK : 0;
-
-		/* Add STOP in the end */
-		cmd->CmdConfig |= (i == (numbytes - 1)) ? CMDCONFIG_STOP_MASK : 0;
-
-		/* Fill with data regardless if read or write to simplify code */
-		cmd->RegisterAddr = data[i];
-	}
-}
-
-static int arcturus_i2c_read_data(struct i2c_adapter *control,
-					       uint8_t address,
-					       uint8_t *data,
-					       uint32_t numbytes)
+static int arcturus_i2c_xfer(struct i2c_adapter *i2c_adap,
+			     struct i2c_msg *msgs, int num)
 {
-	uint32_t  i, ret = 0;
-	SwI2cRequest_t req;
-	struct amdgpu_device *adev = to_amdgpu_device(control);
+	struct amdgpu_device *adev = to_amdgpu_device(i2c_adap);
 	struct smu_table_context *smu_table = &adev->smu.smu_table;
 	struct smu_table *table = &smu_table->driver_table;
-
-	if (numbytes > MAX_SW_I2C_COMMANDS) {
-		dev_err(adev->dev, "numbytes requested %d is over max allowed %d\n",
-			numbytes, MAX_SW_I2C_COMMANDS);
-		return -EINVAL;
+	SwI2cRequest_t *req, *res = (SwI2cRequest_t *)table->cpu_addr;
+	u16 bytes_to_transfer = 0, remaining_bytes, msg_bytes;
+	u16 available_bytes = MAX_SW_I2C_COMMANDS;
+	int i, j, r, c;
+	u8 slave;
+
+	/* only support a single slave addr per transaction */
+	slave = msgs[0].addr;
+	for (i = 0; i < num; i++) {
+		if (slave != msgs[i].addr)
+			return -EINVAL;
+		bytes_to_transfer += min(msgs[i].len, available_bytes);
+		available_bytes -= min(msgs[i].len, available_bytes);
 	}
 
-	memset(&req, 0, sizeof(req));
-	arcturus_fill_i2c_req(&req, false, address, numbytes, data);
-
-	mutex_lock(&adev->smu.mutex);
-	/* Now read data starting with that address */
-	ret = smu_cmn_update_table(&adev->smu, SMU_TABLE_I2C_COMMANDS, 0, &req,
-					true);
-	mutex_unlock(&adev->smu.mutex);
-
-	if (!ret) {
-		SwI2cRequest_t *res = (SwI2cRequest_t *)table->cpu_addr;
-
-		/* Assume SMU  fills res.SwI2cCmds[i].Data with read bytes */
-		for (i = 0; i < numbytes; i++)
-			data[i] = res->SwI2cCmds[i].Data;
-
-		dev_dbg(adev->dev, "arcturus_i2c_read_data, address = %x, bytes = %d, data :",
-				  (uint16_t)address, numbytes);
+	req = kzalloc(sizeof(*req), GFP_KERNEL);
+	if (!req)
+		return -ENOMEM;
 
-		print_hex_dump(KERN_DEBUG, "data: ", DUMP_PREFIX_NONE,
-			       8, 1, data, numbytes, false);
-	} else
-		dev_err(adev->dev, "arcturus_i2c_read_data - error occurred :%x", ret);
+	req->I2CcontrollerPort = 1;
+	req->I2CSpeed = I2C_SPEED_FAST_400K;
+	req->SlaveAddress = slave << 1; /* 8 bit addresses */
+	req->NumCmds = bytes_to_transfer;
 
-	return ret;
-}
+	remaining_bytes = bytes_to_transfer;
+	c = 0;
+	for (i = 0; i < num; i++) {
+		struct i2c_msg *msg = &msgs[i];
 
-static int arcturus_i2c_write_data(struct i2c_adapter *control,
-						uint8_t address,
-						uint8_t *data,
-						uint32_t numbytes)
-{
-	uint32_t ret;
-	SwI2cRequest_t req;
-	struct amdgpu_device *adev = to_amdgpu_device(control);
+		msg_bytes = min(msg->len, remaining_bytes);
+		for (j = 0; j < msg_bytes; j++) {
+			SwI2cCmd_t *cmd = &req->SwI2cCmds[c++];
 
-	if (numbytes > MAX_SW_I2C_COMMANDS) {
-		dev_err(adev->dev, "numbytes requested %d is over max allowed %d\n",
-			numbytes, MAX_SW_I2C_COMMANDS);
-		return -EINVAL;
+			remaining_bytes--;
+			if (!(msg->flags & I2C_M_RD)) {
+				/* write */
+				cmd->CmdConfig |= I2C_CMD_WRITE;
+				cmd->RegisterAddr = msg->buf[j];
+			}
+			if ((msg->flags & I2C_M_STOP) ||
+			    (!remaining_bytes))
+				cmd->CmdConfig |= CMDCONFIG_STOP_MASK;
+			if ((i > 0) && !(msg->flags & I2C_M_NOSTART))
+				cmd->CmdConfig |= CMDCONFIG_RESTART_MASK;
+		}
 	}
-
-	memset(&req, 0, sizeof(req));
-	arcturus_fill_i2c_req(&req, true, address, numbytes, data);
-
 	mutex_lock(&adev->smu.mutex);
-	ret = smu_cmn_update_table(&adev->smu, SMU_TABLE_I2C_COMMANDS, 0, &req, true);
+	r = smu_cmn_update_table(&adev->smu, SMU_TABLE_I2C_COMMANDS, 0, req, true);
 	mutex_unlock(&adev->smu.mutex);
+	if (r)
+		goto fail;
 
-	if (!ret) {
-		dev_dbg(adev->dev, "arcturus_i2c_write(), address = %x, bytes = %d , data: ",
-					 (uint16_t)address, numbytes);
-
-		print_hex_dump(KERN_DEBUG, "data: ", DUMP_PREFIX_NONE,
-			       8, 1, data, numbytes, false);
-		/*
-		 * According to EEPROM spec there is a MAX of 10 ms required for
-		 * EEPROM to flush internal RX buffer after STOP was issued at the
-		 * end of write transaction. During this time the EEPROM will not be
-		 * responsive to any more commands - so wait a bit more.
-		 */
-		msleep(10);
-
-	} else
-		dev_err(adev->dev, "arcturus_i2c_write- error occurred :%x", ret);
-
-	return ret;
-}
-
-static int arcturus_i2c_xfer(struct i2c_adapter *i2c_adap,
-			      struct i2c_msg *msgs, int num)
-{
-	uint32_t  i, j, ret, data_size, data_chunk_size, next_eeprom_addr = 0;
-	uint8_t *data_ptr, data_chunk[MAX_SW_I2C_COMMANDS] = { 0 };
-
+	remaining_bytes = bytes_to_transfer;
+	c = 0;
 	for (i = 0; i < num; i++) {
-		/*
-		 * SMU interface allows at most MAX_SW_I2C_COMMANDS bytes of data at
-		 * once and hence the data needs to be spliced into chunks and sent each
-		 * chunk separately
-		 */
-		data_size = msgs[i].len - 2;
-		data_chunk_size = MAX_SW_I2C_COMMANDS - 2;
-		next_eeprom_addr = (msgs[i].buf[0] << 8 & 0xff00) | (msgs[i].buf[1] & 0xff);
-		data_ptr = msgs[i].buf + 2;
-
-		for (j = 0; j < data_size / data_chunk_size; j++) {
-			/* Insert the EEPROM dest addess, bits 0-15 */
-			data_chunk[0] = ((next_eeprom_addr >> 8) & 0xff);
-			data_chunk[1] = (next_eeprom_addr & 0xff);
+		struct i2c_msg *msg = &msgs[i];
 
-			if (msgs[i].flags & I2C_M_RD) {
-				ret = arcturus_i2c_read_data(i2c_adap,
-							     (uint8_t)msgs[i].addr,
-							     data_chunk, MAX_SW_I2C_COMMANDS);
+		msg_bytes = min(msg->len, remaining_bytes);
+		for (j = 0; j < msg_bytes; j++) {
+			SwI2cCmd_t *cmd = &res->SwI2cCmds[c++];
 
-				memcpy(data_ptr, data_chunk + 2, data_chunk_size);
-			} else {
-
-				memcpy(data_chunk + 2, data_ptr, data_chunk_size);
-
-				ret = arcturus_i2c_write_data(i2c_adap,
-							      (uint8_t)msgs[i].addr,
-							      data_chunk, MAX_SW_I2C_COMMANDS);
-			}
-
-			if (ret) {
-				num = -EIO;
-				goto fail;
-			}
-
-			next_eeprom_addr += data_chunk_size;
-			data_ptr += data_chunk_size;
-		}
-
-		if (data_size % data_chunk_size) {
-			data_chunk[0] = ((next_eeprom_addr >> 8) & 0xff);
-			data_chunk[1] = (next_eeprom_addr & 0xff);
-
-			if (msgs[i].flags & I2C_M_RD) {
-				ret = arcturus_i2c_read_data(i2c_adap,
-							     (uint8_t)msgs[i].addr,
-							     data_chunk, (data_size % data_chunk_size) + 2);
-
-				memcpy(data_ptr, data_chunk + 2, data_size % data_chunk_size);
-			} else {
-				memcpy(data_chunk + 2, data_ptr, data_size % data_chunk_size);
-
-				ret = arcturus_i2c_write_data(i2c_adap,
-							      (uint8_t)msgs[i].addr,
-							      data_chunk, (data_size % data_chunk_size) + 2);
-			}
-
-			if (ret) {
-				num = -EIO;
-				goto fail;
-			}
+			remaining_bytes--;
+			if (msg->flags & I2C_M_RD)
+				msg->buf[j] = cmd->Data;
 		}
 	}
+	r = bytes_to_transfer;
 
 fail:
-	return num;
+	kfree(req);
+
+	return r;
 }
 
 static u32 arcturus_i2c_func(struct i2c_adapter *adap)
-- 
2.32.0
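
The "7 to 8 bit addr conversion" mentioned in the v3 note comes down to a left shift: the Linux I2C core hands drivers a 7-bit address in `i2c_msg.addr`, while the on-wire address byte carries that address in bits 7:1 and the read/write flag in bit 0. A minimal sketch follows. The helper names are hypothetical. The patch itself only stores `slave << 1` in `SlaveAddress`; that something else supplies the direction bit is an assumption here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Shift a 7-bit I2C address into bits 7:1, leaving bit 0 clear. */
static uint8_t sketch_addr8(uint8_t addr7)
{
	return (uint8_t)((addr7 & 0x7f) << 1);
}

/* Full on-wire address byte: bit 0 is 1 for a read, 0 for a write. */
static uint8_t sketch_addr8_rw(uint8_t addr7, bool read)
{
	return (uint8_t)(sketch_addr8(addr7) | (read ? 1 : 0));
}
```

For example, a device at 7-bit address 0x50 (used here only as an illustration) becomes 0xA0 in 8-bit form, with 0xA0 as the write address byte and 0xA1 as the read address byte.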


* [PATCH 04/40] drm/amdgpu/pm: add smu i2c implementation for navi1x (v3)
  2021-06-08 21:39 [PATCH 00/40] I2C fixes Luben Tuikov
                   ` (2 preceding siblings ...)
  2021-06-08 21:39 ` [PATCH 03/40] drm/amdgpu/pm: rework i2c xfers on arcturus (v3) Luben Tuikov
@ 2021-06-08 21:39 ` Luben Tuikov
  2021-06-08 21:39 ` [PATCH 05/40] drm/amdgpu: add new helper for handling EEPROM i2c transfers Luben Tuikov
                   ` (35 subsequent siblings)
  39 siblings, 0 replies; 74+ messages in thread
From: Luben Tuikov @ 2021-06-08 21:39 UTC
  To: amd-gfx; +Cc: Alex Deucher, Luben Tuikov

From: Alex Deucher <alexander.deucher@amd.com>

And handle more than just EEPROMs.

v2: fix restart handling between transactions.
v3: handle 7 to 8 bit addr conversion

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
---
 .../gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c   | 116 ++++++++++++++++++
 1 file changed, 116 insertions(+)

diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
index 74a8c676e22cfb..6bb8b4d631a254 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
@@ -2701,6 +2701,120 @@ static ssize_t navi10_get_legacy_gpu_metrics(struct smu_context *smu,
 	return sizeof(struct gpu_metrics_v1_3);
 }
 
+static int navi10_i2c_xfer(struct i2c_adapter *i2c_adap,
+			   struct i2c_msg *msgs, int num)
+{
+	struct amdgpu_device *adev = to_amdgpu_device(i2c_adap);
+	struct smu_table_context *smu_table = &adev->smu.smu_table;
+	struct smu_table *table = &smu_table->driver_table;
+	SwI2cRequest_t *req, *res = (SwI2cRequest_t *)table->cpu_addr;
+	u16 bytes_to_transfer = 0, remaining_bytes, msg_bytes;
+	u16 available_bytes = MAX_SW_I2C_COMMANDS;
+	int i, j, r, c;
+	u8 slave;
+
+	/* only support a single slave addr per transaction */
+	slave = msgs[0].addr;
+	for (i = 0; i < num; i++) {
+		if (slave != msgs[i].addr)
+			return -EINVAL;
+		bytes_to_transfer += min(msgs[i].len, available_bytes);
+		available_bytes -= min(msgs[i].len, available_bytes);
+	}
+
+	req = kzalloc(sizeof(*req), GFP_KERNEL);
+	if (!req)
+		return -ENOMEM;
+
+	req->I2CcontrollerPort = 1;
+	req->I2CSpeed = I2C_SPEED_FAST_400K;
+	req->SlaveAddress = slave << 1; /* 8 bit addresses */
+	req->NumCmds = bytes_to_transfer;
+
+	remaining_bytes = bytes_to_transfer;
+	c = 0;
+	for (i = 0; i < num; i++) {
+		struct i2c_msg *msg = &msgs[i];
+
+		msg_bytes = min(msg->len, remaining_bytes);
+		for (j = 0; j < msg_bytes; j++) {
+			SwI2cCmd_t *cmd = &req->SwI2cCmds[c++];
+
+			remaining_bytes--;
+			if (!(msg->flags & I2C_M_RD)) {
+				/* write */
+				cmd->CmdConfig |= I2C_CMD_WRITE;
+				cmd->RegisterAddr = msg->buf[j];
+			}
+			if ((msg->flags & I2C_M_STOP) ||
+			    (!remaining_bytes))
+				cmd->CmdConfig |= CMDCONFIG_STOP_MASK;
+			if ((i > 0) && !(msg->flags & I2C_M_NOSTART))
+				cmd->CmdConfig |= CMDCONFIG_RESTART_MASK;
+		}
+	}
+	mutex_lock(&adev->smu.mutex);
+	r = smu_cmn_update_table(&adev->smu, SMU_TABLE_I2C_COMMANDS, 0, req, true);
+	mutex_unlock(&adev->smu.mutex);
+	if (r)
+		goto fail;
+
+	remaining_bytes = bytes_to_transfer;
+	c = 0;
+	for (i = 0; i < num; i++) {
+		struct i2c_msg *msg = &msgs[i];
+
+		msg_bytes = min(msg->len, remaining_bytes);
+		for (j = 0; j < msg_bytes; j++) {
+			SwI2cCmd_t *cmd = &res->SwI2cCmds[c++];
+
+			remaining_bytes--;
+			if (msg->flags & I2C_M_RD)
+				msg->buf[j] = cmd->Data;
+		}
+	}
+	r = bytes_to_transfer;
+
+fail:
+	kfree(req);
+
+	return r;
+}
+
+static u32 navi10_i2c_func(struct i2c_adapter *adap)
+{
+	return I2C_FUNC_I2C | I2C_FUNC_SMBUS_EMUL;
+}
+
+
+static const struct i2c_algorithm navi10_i2c_algo = {
+	.master_xfer = navi10_i2c_xfer,
+	.functionality = navi10_i2c_func,
+};
+
+static int navi10_i2c_control_init(struct smu_context *smu, struct i2c_adapter *control)
+{
+	struct amdgpu_device *adev = to_amdgpu_device(control);
+	int res;
+
+	control->owner = THIS_MODULE;
+	control->class = I2C_CLASS_SPD;
+	control->dev.parent = &adev->pdev->dev;
+	control->algo = &navi10_i2c_algo;
+	snprintf(control->name, sizeof(control->name), "AMDGPU SMU");
+
+	res = i2c_add_adapter(control);
+	if (res)
+		DRM_ERROR("Failed to register hw i2c, err: %d\n", res);
+
+	return res;
+}
+
+static void navi10_i2c_control_fini(struct smu_context *smu, struct i2c_adapter *control)
+{
+	i2c_del_adapter(control);
+}
+
 static ssize_t navi10_get_gpu_metrics(struct smu_context *smu,
 				      void **table)
 {
@@ -3035,6 +3149,8 @@ static const struct pptable_funcs navi10_ppt_funcs = {
 	.set_default_dpm_table = navi10_set_default_dpm_table,
 	.dpm_set_vcn_enable = navi10_dpm_set_vcn_enable,
 	.dpm_set_jpeg_enable = navi10_dpm_set_jpeg_enable,
+	.i2c_init = navi10_i2c_control_init,
+	.i2c_fini = navi10_i2c_control_fini,
 	.print_clk_levels = navi10_print_clk_levels,
 	.force_clk_levels = navi10_force_clk_levels,
 	.populate_umd_state_clk = navi10_populate_umd_state_clk,
-- 
2.32.0


* [PATCH 05/40] drm/amdgpu: add new helper for handling EEPROM i2c transfers
  2021-06-08 21:39 [PATCH 00/40] I2C fixes Luben Tuikov
                   ` (3 preceding siblings ...)
  2021-06-08 21:39 ` [PATCH 04/40] drm/amdgpu/pm: add smu i2c implementation for navi1x (v3) Luben Tuikov
@ 2021-06-08 21:39 ` Luben Tuikov
  2021-06-08 21:39 ` [PATCH 06/40] drm/amdgpu/ras: switch ras eeprom handling to use generic helper Luben Tuikov
                   ` (34 subsequent siblings)
  39 siblings, 0 replies; 74+ messages in thread
From: Luben Tuikov @ 2021-06-08 21:39 UTC (permalink / raw)
  To: amd-gfx; +Cc: Alex Deucher, Luben Tuikov

From: Alex Deucher <alexander.deucher@amd.com>

Encapsulate the i2c protocol handling so that other parts of the
driver only need to pass the EEPROM offset and the size of the data
to read or write.
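
As a rough illustration of what the helper hides from callers, the
sketch below builds the two-byte offset that is sent ahead of the data
message, high byte first, as the diff that follows does. It is
standalone userspace C rather than kernel code, and
eeprom_encode_offset is a hypothetical name used only here.

```c
#include <stdint.h>

#define EEPROM_OFFSET_LENGTH 2

/*
 * Hypothetical helper: store a 16-bit EEPROM offset into buf,
 * most significant byte first.
 */
static void eeprom_encode_offset(uint16_t eeprom_addr,
				 uint8_t buf[EEPROM_OFFSET_LENGTH])
{
	buf[0] = (eeprom_addr >> 8) & 0xff;
	buf[1] = eeprom_addr & 0xff;
}
```

With an offset of 0x1234 the buffer holds 0x12 followed by 0x34.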

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/Makefile        |  3 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.c | 67 ++++++++++++++++++++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.h | 34 +++++++++++
 3 files changed, 103 insertions(+), 1 deletion(-)
 create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.c
 create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.h

diff --git a/drivers/gpu/drm/amd/amdgpu/Makefile b/drivers/gpu/drm/amd/amdgpu/Makefile
index c56320e78c0e1f..7d292485ca7cf2 100644
--- a/drivers/gpu/drm/amd/amdgpu/Makefile
+++ b/drivers/gpu/drm/amd/amdgpu/Makefile
@@ -57,7 +57,8 @@ amdgpu-y += amdgpu_device.o amdgpu_kms.o \
 	amdgpu_xgmi.o amdgpu_csa.o amdgpu_ras.o amdgpu_vm_cpu.o \
 	amdgpu_vm_sdma.o amdgpu_discovery.o amdgpu_ras_eeprom.o amdgpu_nbio.o \
 	amdgpu_umc.o smu_v11_0_i2c.o amdgpu_fru_eeprom.o amdgpu_rap.o \
-	amdgpu_fw_attestation.o amdgpu_securedisplay.o amdgpu_hdp.o
+	amdgpu_fw_attestation.o amdgpu_securedisplay.o amdgpu_hdp.o \
+	amdgpu_eeprom.o
 
 amdgpu-$(CONFIG_PROC_FS) += amdgpu_fdinfo.o
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.c
new file mode 100644
index 00000000000000..10551660343278
--- /dev/null
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.c
@@ -0,0 +1,67 @@
+/*
+ * Copyright 2021 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ *
+ */
+
+#include "amdgpu_eeprom.h"
+#include "amdgpu.h"
+
+#define EEPROM_OFFSET_LENGTH 2
+
+int amdgpu_eeprom_xfer(struct i2c_adapter *i2c_adap,
+		       u16 slave_addr, u16 eeprom_addr,
+		       u8 *eeprom_buf, u16 bytes, bool read)
+{
+	u8 eeprom_offset_buf[2];
+	u16 bytes_transferred;
+	struct i2c_msg msgs[] = {
+		{
+			.addr = slave_addr,
+			.flags = 0,
+			.len = EEPROM_OFFSET_LENGTH,
+			.buf = eeprom_offset_buf,
+		},
+		{
+			.addr = slave_addr,
+			.flags = read ? I2C_M_RD: 0,
+			.len = bytes,
+			.buf = eeprom_buf,
+		}
+	};
+	int r;
+
+	msgs[0].buf[0] = ((eeprom_addr >> 8) & 0xff);
+	msgs[0].buf[1] = (eeprom_addr & 0xff);
+
+	while (msgs[1].len) {
+		r = i2c_transfer(i2c_adap, msgs, ARRAY_SIZE(msgs));
+		if (r <= 0)
+			return r;
+		bytes_transferred = r - EEPROM_OFFSET_LENGTH;
+		eeprom_addr += bytes_transferred;
+		msgs[0].buf[0] = ((eeprom_addr >> 8) & 0xff);
+		msgs[0].buf[1] = (eeprom_addr & 0xff);
+		msgs[1].buf += bytes_transferred;
+		msgs[1].len -= bytes_transferred;
+	}
+
+	return 0;
+}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.h
new file mode 100644
index 00000000000000..9301e5678910ad
--- /dev/null
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.h
@@ -0,0 +1,34 @@
+/*
+ * Copyright 2021 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ *
+ */
+
+#ifndef _AMDGPU_EEPROM_H
+#define _AMDGPU_EEPROM_H
+
+#include <linux/i2c.h>
+
+int amdgpu_eeprom_xfer(struct i2c_adapter *i2c_adap,
+		       u16 slave_addr, u16 eeprom_addr,
+		       u8 *eeprom_buf, u16 bytes, bool read);
+
+
+#endif
-- 
2.32.0


* [PATCH 06/40] drm/amdgpu/ras: switch ras eeprom handling to use generic helper
  2021-06-08 21:39 [PATCH 00/40] I2C fixes Luben Tuikov
                   ` (4 preceding siblings ...)
  2021-06-08 21:39 ` [PATCH 05/40] drm/amdgpu: add new helper for handling EEPROM i2c transfers Luben Tuikov
@ 2021-06-08 21:39 ` Luben Tuikov
  2021-06-08 21:39 ` [PATCH 07/40] drm/amdgpu/ras: switch fru eeprom handling to use generic helper (v2) Luben Tuikov
                   ` (33 subsequent siblings)
  39 siblings, 0 replies; 74+ messages in thread
From: Luben Tuikov @ 2021-06-08 21:39 UTC (permalink / raw)
  To: amd-gfx; +Cc: Alex Deucher, Luben Tuikov

From: Alex Deucher <alexander.deucher@amd.com>

Use the new helper rather than doing i2c transfers directly.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
---
 .../gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c    | 86 ++++++-------------
 1 file changed, 28 insertions(+), 58 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
index f40c871da0c623..e22a0b45f70108 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
@@ -26,6 +26,7 @@
 #include "amdgpu_ras.h"
 #include <linux/bits.h>
 #include "atom.h"
+#include "amdgpu_eeprom.h"
 
 #define EEPROM_I2C_TARGET_ADDR_VEGA20		0xA0
 #define EEPROM_I2C_TARGET_ADDR_ARCTURUS		0xA8
@@ -148,22 +149,13 @@ static int __update_table_header(struct amdgpu_ras_eeprom_control *control,
 {
 	int ret = 0;
 	struct amdgpu_device *adev = to_amdgpu_device(control);
-	struct i2c_msg msg = {
-			.addr	= 0,
-			.flags	= 0,
-			.len	= EEPROM_ADDRESS_SIZE + EEPROM_TABLE_HEADER_SIZE,
-			.buf	= buff,
-	};
 
-
-	*(uint16_t *)buff = EEPROM_HDR_START;
-	__encode_table_header_to_buff(&control->tbl_hdr, buff + EEPROM_ADDRESS_SIZE);
-
-	msg.addr = control->i2c_address;
+	__encode_table_header_to_buff(&control->tbl_hdr, buff);
 
 	/* i2c may be unstable in gpu reset */
 	down_read(&adev->reset_sem);
-	ret = i2c_transfer(&adev->pm.smu_i2c, &msg, 1);
+	ret = amdgpu_eeprom_xfer(&adev->pm.smu_i2c, control->i2c_address,
+				 EEPROM_HDR_START, buff, EEPROM_TABLE_HEADER_SIZE, false);
 	up_read(&adev->reset_sem);
 
 	if (ret < 1)
@@ -289,15 +281,9 @@ int amdgpu_ras_eeprom_init(struct amdgpu_ras_eeprom_control *control,
 {
 	int ret = 0;
 	struct amdgpu_device *adev = to_amdgpu_device(control);
-	unsigned char buff[EEPROM_ADDRESS_SIZE + EEPROM_TABLE_HEADER_SIZE] = { 0 };
+	unsigned char buff[EEPROM_TABLE_HEADER_SIZE] = { 0 };
 	struct amdgpu_ras_eeprom_table_header *hdr = &control->tbl_hdr;
 	struct amdgpu_ras *ras = amdgpu_ras_get_context(adev);
-	struct i2c_msg msg = {
-			.addr	= 0,
-			.flags	= I2C_M_RD,
-			.len	= EEPROM_ADDRESS_SIZE + EEPROM_TABLE_HEADER_SIZE,
-			.buf	= buff,
-	};
 
 	*exceed_err_limit = false;
 
@@ -313,9 +299,9 @@ int amdgpu_ras_eeprom_init(struct amdgpu_ras_eeprom_control *control,
 
 	mutex_init(&control->tbl_mutex);
 
-	msg.addr = control->i2c_address;
 	/* Read/Create table header from EEPROM address 0 */
-	ret = i2c_transfer(&adev->pm.smu_i2c, &msg, 1);
+	ret = amdgpu_eeprom_xfer(&adev->pm.smu_i2c, control->i2c_address,
+				 EEPROM_HDR_START, buff, EEPROM_TABLE_HEADER_SIZE, true);
 	if (ret < 1) {
 		DRM_ERROR("Failed to read EEPROM table header, ret:%d", ret);
 		return ret;
@@ -442,6 +428,7 @@ static uint32_t __correct_eeprom_dest_address(uint32_t curr_address)
 
 bool amdgpu_ras_eeprom_check_err_threshold(struct amdgpu_device *adev)
 {
+
 	struct amdgpu_ras *con = amdgpu_ras_get_context(adev);
 
 	if (!__is_ras_eeprom_supported(adev))
@@ -470,11 +457,11 @@ int amdgpu_ras_eeprom_process_recods(struct amdgpu_ras_eeprom_control *control,
 					    int num)
 {
 	int i, ret = 0;
-	struct i2c_msg *msgs, *msg;
 	unsigned char *buffs, *buff;
 	struct eeprom_table_record *record;
 	struct amdgpu_device *adev = to_amdgpu_device(control);
 	struct amdgpu_ras *ras = amdgpu_ras_get_context(adev);
+	u16 slave_addr;
 
 	if (!__is_ras_eeprom_supported(adev))
 		return 0;
@@ -486,12 +473,6 @@ int amdgpu_ras_eeprom_process_recods(struct amdgpu_ras_eeprom_control *control,
 
 	mutex_lock(&control->tbl_mutex);
 
-	msgs = kcalloc(num, sizeof(*msgs), GFP_KERNEL);
-	if (!msgs) {
-		ret = -ENOMEM;
-		goto free_buff;
-	}
-
 	/*
 	 * If saved bad pages number exceeds the bad page threshold for
 	 * the whole VRAM, update table header to mark the BAD GPU tag
@@ -521,9 +502,8 @@ int amdgpu_ras_eeprom_process_recods(struct amdgpu_ras_eeprom_control *control,
 	 * 256b
 	 */
 	for (i = 0; i < num; i++) {
-		buff = &buffs[i * (EEPROM_ADDRESS_SIZE + EEPROM_TABLE_RECORD_SIZE)];
+		buff = &buffs[i * EEPROM_TABLE_RECORD_SIZE];
 		record = &records[i];
-		msg = &msgs[i];
 
 		control->next_addr = __correct_eeprom_dest_address(control->next_addr);
 
@@ -531,20 +511,26 @@ int amdgpu_ras_eeprom_process_recods(struct amdgpu_ras_eeprom_control *control,
 		 * Update bits 16,17 of EEPROM address in I2C address by setting them
 		 * to bits 1,2 of Device address byte
 		 */
-		msg->addr = control->i2c_address |
-			        ((control->next_addr & EEPROM_ADDR_MSB_MASK) >> 15);
-		msg->flags	= write ? 0 : I2C_M_RD;
-		msg->len	= EEPROM_ADDRESS_SIZE + EEPROM_TABLE_RECORD_SIZE;
-		msg->buf	= buff;
-
-		/* Insert the EEPROM dest addess, bits 0-15 */
-		buff[0] = ((control->next_addr >> 8) & 0xff);
-		buff[1] = (control->next_addr & 0xff);
+		slave_addr = control->i2c_address |
+			((control->next_addr & EEPROM_ADDR_MSB_MASK) >> 15);
 
 		/* EEPROM table content is stored in LE format */
 		if (write)
-			__encode_table_record_to_buff(control, record, buff + EEPROM_ADDRESS_SIZE);
+			__encode_table_record_to_buff(control, record, buff);
+
+		/* i2c may be unstable in gpu reset */
+		down_read(&adev->reset_sem);
+		ret = amdgpu_eeprom_xfer(&adev->pm.smu_i2c, slave_addr,
+					 control->next_addr, buff,
+					 EEPROM_TABLE_RECORD_SIZE, !write);
+		up_read(&adev->reset_sem);
 
+		if (ret < 1) {
+			DRM_ERROR("Failed to process EEPROM table records, ret:%d", ret);
+
+			/* TODO Restore prev next EEPROM address ? */
+			goto free_buff;
+		}
 		/*
 		 * The destination EEPROM address might need to be corrected to account
 		 * for page or entire memory wrapping
@@ -552,25 +538,12 @@ int amdgpu_ras_eeprom_process_recods(struct amdgpu_ras_eeprom_control *control,
 		control->next_addr += EEPROM_TABLE_RECORD_SIZE;
 	}
 
-	/* i2c may be unstable in gpu reset */
-	down_read(&adev->reset_sem);
-	ret = i2c_transfer(&adev->pm.smu_i2c, msgs, num);
-	up_read(&adev->reset_sem);
-
-	if (ret < 1) {
-		DRM_ERROR("Failed to process EEPROM table records, ret:%d", ret);
-
-		/* TODO Restore prev next EEPROM address ? */
-		goto free_msgs;
-	}
-
-
 	if (!write) {
 		for (i = 0; i < num; i++) {
-			buff = &buffs[i*(EEPROM_ADDRESS_SIZE + EEPROM_TABLE_RECORD_SIZE)];
+			buff = &buffs[i*EEPROM_TABLE_RECORD_SIZE];
 			record = &records[i];
 
-			__decode_table_record_from_buff(control, record, buff + EEPROM_ADDRESS_SIZE);
+			__decode_table_record_from_buff(control, record, buff);
 		}
 	}
 
@@ -600,9 +573,6 @@ int amdgpu_ras_eeprom_process_recods(struct amdgpu_ras_eeprom_control *control,
 		/* ret = -EIO; */
 	}
 
-free_msgs:
-	kfree(msgs);
-
 free_buff:
 	kfree(buffs);
 
-- 
2.32.0


* [PATCH 07/40] drm/amdgpu/ras: switch fru eeprom handling to use generic helper (v2)
  2021-06-08 21:39 [PATCH 00/40] I2C fixes Luben Tuikov
                   ` (5 preceding siblings ...)
  2021-06-08 21:39 ` [PATCH 06/40] drm/amdgpu/ras: switch ras eeprom handling to use generic helper Luben Tuikov
@ 2021-06-08 21:39 ` Luben Tuikov
  2021-06-08 21:39 ` [PATCH 08/40] drm/amdgpu: i2c subsystem uses 7 bit addresses Luben Tuikov
                   ` (32 subsequent siblings)
  39 siblings, 0 replies; 74+ messages in thread
From: Luben Tuikov @ 2021-06-08 21:39 UTC (permalink / raw)
  To: amd-gfx; +Cc: Alex Deucher, Luben Tuikov

From: Alex Deucher <alexander.deucher@amd.com>

Use the new helper rather than doing i2c transfers directly.

v2: fix typo
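
The read path converted here is a two-step one: read a one-byte size
field at the given offset, subtract 0xC0 from it, then read that many
bytes starting at the next offset. A minimal sketch of the size
decoding follows, in userspace C. fru_field_size is a hypothetical
name, and the range check is an added assumption, not something the
driver does in this patch.

```c
#include <stdint.h>

#define I2C_PRODUCT_INFO_OFFSET 0xC0

/*
 * Hypothetical helper: decode the length of a FRU field from the raw
 * size byte, which reports 0xC0 + actual size.
 * Returns -1 when the raw byte is below 0xC0 (assumed to be invalid).
 */
static int fru_field_size(uint8_t raw)
{
	if (raw < I2C_PRODUCT_INFO_OFFSET)
		return -1;
	return raw - I2C_PRODUCT_INFO_OFFSET;
}
```

A raw size byte of 0xC8 decodes to an 8-byte field.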

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
---
 .../gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c    | 22 +++++--------------
 1 file changed, 6 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c
index 39b6c6bfab4533..224da573ba1b59 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c
@@ -27,9 +27,9 @@
 #include "smu_v11_0_i2c.h"
 #include "atom.h"
 #include "amdgpu_fru_eeprom.h"
+#include "amdgpu_eeprom.h"
 
 #define I2C_PRODUCT_INFO_ADDR		0xAC
-#define I2C_PRODUCT_INFO_ADDR_SIZE	0x2
 #define I2C_PRODUCT_INFO_OFFSET		0xC0
 
 static bool is_fru_eeprom_supported(struct amdgpu_device *adev)
@@ -65,16 +65,9 @@ static int amdgpu_fru_read_eeprom(struct amdgpu_device *adev, uint32_t addrptr,
 			   unsigned char *buff)
 {
 	int ret, size;
-	struct i2c_msg msg = {
-			.addr   = I2C_PRODUCT_INFO_ADDR,
-			.flags  = I2C_M_RD,
-			.buf    = buff,
-	};
-	buff[0] = 0;
-	buff[1] = addrptr;
-	msg.len = I2C_PRODUCT_INFO_ADDR_SIZE + 1;
-	ret = i2c_transfer(&adev->pm.smu_i2c, &msg, 1);
 
+	ret = amdgpu_eeprom_xfer(&adev->pm.smu_i2c, I2C_PRODUCT_INFO_ADDR,
+				 addrptr, buff, 1, true);
 	if (ret < 1) {
 		DRM_WARN("FRU: Failed to get size field");
 		return ret;
@@ -83,13 +76,10 @@ static int amdgpu_fru_read_eeprom(struct amdgpu_device *adev, uint32_t addrptr,
 	/* The size returned by the i2c requires subtraction of 0xC0 since the
 	 * size apparently always reports as 0xC0+actual size.
 	 */
-	size = buff[2] - I2C_PRODUCT_INFO_OFFSET;
-	/* Add 1 since address field was 1 byte */
-	buff[1] = addrptr + 1;
-
-	msg.len = I2C_PRODUCT_INFO_ADDR_SIZE + size;
-	ret = i2c_transfer(&adev->pm.smu_i2c, &msg, 1);
+	size = buff[0] - I2C_PRODUCT_INFO_OFFSET;
 
+	ret = amdgpu_eeprom_xfer(&adev->pm.smu_i2c, I2C_PRODUCT_INFO_ADDR,
+				 addrptr + 1, buff, size, true);
 	if (ret < 1) {
 		DRM_WARN("FRU: Failed to get data field");
 		return ret;
-- 
2.32.0


* [PATCH 08/40] drm/amdgpu: i2c subsystem uses 7 bit addresses
  2021-06-08 21:39 [PATCH 00/40] I2C fixes Luben Tuikov
                   ` (6 preceding siblings ...)
  2021-06-08 21:39 ` [PATCH 07/40] drm/amdgpu/ras: switch fru eeprom handling to use generic helper (v2) Luben Tuikov
@ 2021-06-08 21:39 ` Luben Tuikov
  2021-06-08 21:39 ` [PATCH 09/40] drm/amdgpu: add I2C_CLASS_HWMON to SMU i2c buses Luben Tuikov
                   ` (31 subsequent siblings)
  39 siblings, 0 replies; 74+ messages in thread
From: Luben Tuikov @ 2021-06-08 21:39 UTC (permalink / raw)
  To: amd-gfx; +Cc: Alex Deucher, Luben Tuikov

From: Alex Deucher <alexander.deucher@amd.com>

Convert the EEPROM addresses from 8-bit to 7-bit form, which is what
the i2c subsystem expects.
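
The conversion amounts to dropping the R/W bit (bit 0) of the on-wire
address byte. The sketch below is userspace C for illustration only;
i2c_addr_8bit_to_7bit is a hypothetical name and does not exist in the
driver, which simply updates the constants.

```c
#include <stdint.h>

/*
 * Hypothetical helper: shift out the R/W bit of an 8-bit bus address
 * to obtain the 7-bit device address.
 */
static uint8_t i2c_addr_8bit_to_7bit(uint8_t addr8)
{
	return addr8 >> 1;
}
```

This matches the constants changed below: 0xA0 becomes 0x50, 0xA8
becomes 0x54, and 0xAC becomes 0x56.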

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c |  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c | 10 +++++-----
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c
index 224da573ba1b59..2b854bc6ae34bb 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c
@@ -29,7 +29,7 @@
 #include "amdgpu_fru_eeprom.h"
 #include "amdgpu_eeprom.h"
 
-#define I2C_PRODUCT_INFO_ADDR		0xAC
+#define I2C_PRODUCT_INFO_ADDR		0x56
 #define I2C_PRODUCT_INFO_OFFSET		0xC0
 
 static bool is_fru_eeprom_supported(struct amdgpu_device *adev)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
index e22a0b45f70108..2b981e96ce5b9e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
@@ -28,11 +28,11 @@
 #include "atom.h"
 #include "amdgpu_eeprom.h"
 
-#define EEPROM_I2C_TARGET_ADDR_VEGA20		0xA0
-#define EEPROM_I2C_TARGET_ADDR_ARCTURUS		0xA8
-#define EEPROM_I2C_TARGET_ADDR_ARCTURUS_D342	0xA0
-#define EEPROM_I2C_TARGET_ADDR_SIENNA_CICHLID   0xA0
-#define EEPROM_I2C_TARGET_ADDR_ALDEBARAN        0xA0
+#define EEPROM_I2C_TARGET_ADDR_VEGA20		0x50
+#define EEPROM_I2C_TARGET_ADDR_ARCTURUS		0x54
+#define EEPROM_I2C_TARGET_ADDR_ARCTURUS_D342	0x50
+#define EEPROM_I2C_TARGET_ADDR_SIENNA_CICHLID   0x50
+#define EEPROM_I2C_TARGET_ADDR_ALDEBARAN        0x50
 
 /*
  * The 2 macros bellow represent the actual size in bytes that
-- 
2.32.0


* [PATCH 09/40] drm/amdgpu: add I2C_CLASS_HWMON to SMU i2c buses
  2021-06-08 21:39 [PATCH 00/40] I2C fixes Luben Tuikov
                   ` (7 preceding siblings ...)
  2021-06-08 21:39 ` [PATCH 08/40] drm/amdgpu: i2c subsystem uses 7 bit addresses Luben Tuikov
@ 2021-06-08 21:39 ` Luben Tuikov
  2021-06-08 21:39 ` [PATCH 10/40] drm/amdgpu: rework smu11 i2c for generic operation Luben Tuikov
                   ` (30 subsequent siblings)
  39 siblings, 0 replies; 74+ messages in thread
From: Luben Tuikov @ 2021-06-08 21:39 UTC (permalink / raw)
  To: amd-gfx; +Cc: Alex Deucher, Luben Tuikov

From: Alex Deucher <alexander.deucher@amd.com>

Not sure that this really matters that much, but these could
have various other hwmon chips on them.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/smu_v11_0_i2c.c              | 2 +-
 drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c       | 2 +-
 drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c         | 2 +-
 drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c | 2 +-
 4 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/smu_v11_0_i2c.c b/drivers/gpu/drm/amd/amdgpu/smu_v11_0_i2c.c
index 1d8f6d5180e099..3a164d93c90293 100644
--- a/drivers/gpu/drm/amd/amdgpu/smu_v11_0_i2c.c
+++ b/drivers/gpu/drm/amd/amdgpu/smu_v11_0_i2c.c
@@ -665,7 +665,7 @@ int smu_v11_0_i2c_control_init(struct i2c_adapter *control)
 
 	mutex_init(&adev->pm.smu_i2c_mutex);
 	control->owner = THIS_MODULE;
-	control->class = I2C_CLASS_SPD;
+	control->class = I2C_CLASS_SPD | I2C_CLASS_HWMON;
 	control->dev.parent = &adev->pdev->dev;
 	control->algo = &smu_v11_0_i2c_algo;
 	snprintf(control->name, sizeof(control->name), "AMDGPU SMU");
diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c
index e1f7607302ba6c..a8249ee354572c 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c
@@ -2003,7 +2003,7 @@ static int arcturus_i2c_control_init(struct smu_context *smu, struct i2c_adapter
 	int res;
 
 	control->owner = THIS_MODULE;
-	control->class = I2C_CLASS_SPD;
+	control->class = I2C_CLASS_SPD | I2C_CLASS_HWMON;
 	control->dev.parent = &adev->pdev->dev;
 	control->algo = &arcturus_i2c_algo;
 	snprintf(control->name, sizeof(control->name), "AMDGPU SMU");
diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
index 6bb8b4d631a254..644b4821220ede 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
@@ -2798,7 +2798,7 @@ static int navi10_i2c_control_init(struct smu_context *smu, struct i2c_adapter *
 	int res;
 
 	control->owner = THIS_MODULE;
-	control->class = I2C_CLASS_SPD;
+	control->class = I2C_CLASS_SPD | I2C_CLASS_HWMON;
 	control->dev.parent = &adev->pdev->dev;
 	control->algo = &navi10_i2c_algo;
 	snprintf(control->name, sizeof(control->name), "AMDGPU SMU");
diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
index 499e1309d0f796..10eb7d6f48fcac 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
@@ -3486,7 +3486,7 @@ static int sienna_cichlid_i2c_control_init(struct smu_context *smu, struct i2c_a
 	int res;
 
 	control->owner = THIS_MODULE;
-	control->class = I2C_CLASS_SPD;
+	control->class = I2C_CLASS_SPD | I2C_CLASS_HWMON;
 	control->dev.parent = &adev->pdev->dev;
 	control->algo = &sienna_cichlid_i2c_algo;
 	snprintf(control->name, sizeof(control->name), "AMDGPU SMU");
-- 
2.32.0


* [PATCH 10/40] drm/amdgpu: rework smu11 i2c for generic operation
  2021-06-08 21:39 [PATCH 00/40] I2C fixes Luben Tuikov
                   ` (8 preceding siblings ...)
  2021-06-08 21:39 ` [PATCH 09/40] drm/amdgpu: add I2C_CLASS_HWMON to SMU i2c buses Luben Tuikov
@ 2021-06-08 21:39 ` Luben Tuikov
  2021-06-08 21:39 ` [PATCH 11/40] drm/amdgpu: only set restart on first cmd of the smu i2c transaction Luben Tuikov
                   ` (29 subsequent siblings)
  39 siblings, 0 replies; 74+ messages in thread
From: Luben Tuikov @ 2021-06-08 21:39 UTC (permalink / raw)
  To: amd-gfx; +Cc: Alex Deucher, Luben Tuikov, Aaron Rice

From: Aaron Rice <wolf@lovehindpa.ws>

Handle devices other than EEPROMs.

Signed-off-by: Aaron Rice <wolf@lovehindpa.ws>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/smu_v11_0_i2c.c | 47 +++++-----------------
 1 file changed, 9 insertions(+), 38 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/smu_v11_0_i2c.c b/drivers/gpu/drm/amd/amdgpu/smu_v11_0_i2c.c
index 3a164d93c90293..3193d566f4f87e 100644
--- a/drivers/gpu/drm/amd/amdgpu/smu_v11_0_i2c.c
+++ b/drivers/gpu/drm/amd/amdgpu/smu_v11_0_i2c.c
@@ -117,8 +117,7 @@ static void smu_v11_0_i2c_set_address(struct i2c_adapter *control, uint8_t addre
 {
 	struct amdgpu_device *adev = to_amdgpu_device(control);
 
-	/* Convert fromr 8-bit to 7-bit address */
-	address >>= 1;
+	/* We take 7-bit addresses raw */
 	WREG32_SOC15(SMUIO, 0, mmCKSVII2C_IC_TAR, (address & 0xFF));
 }
 
@@ -531,22 +530,14 @@ static bool smu_v11_0_i2c_bus_unlock(struct i2c_adapter *control)
 /***************************** I2C GLUE ****************************/
 
 static uint32_t smu_v11_0_i2c_read_data(struct i2c_adapter *control,
-					uint8_t address,
-					uint8_t *data,
-					uint32_t numbytes)
+					struct i2c_msg *msg)
 {
 	uint32_t  ret = 0;
 
-	/* First 2 bytes are dummy write to set EEPROM address */
-	ret = smu_v11_0_i2c_transmit(control, address, data, 2, I2C_NO_STOP);
-	if (ret != I2C_OK)
-		goto Fail;
-
 	/* Now read data starting with that address */
-	ret = smu_v11_0_i2c_receive(control, address, data + 2, numbytes - 2,
+	ret = smu_v11_0_i2c_receive(control, msg->addr, msg->buf, msg->len,
 				    I2C_RESTART);
 
-Fail:
 	if (ret != I2C_OK)
 		DRM_ERROR("ReadData() - I2C error occurred :%x", ret);
 
@@ -554,28 +545,16 @@ static uint32_t smu_v11_0_i2c_read_data(struct i2c_adapter *control,
 }
 
 static uint32_t smu_v11_0_i2c_write_data(struct i2c_adapter *control,
-					 uint8_t address,
-					 uint8_t *data,
-					 uint32_t numbytes)
+					struct i2c_msg *msg)
 {
 	uint32_t  ret;
 
-	ret = smu_v11_0_i2c_transmit(control, address, data, numbytes, 0);
+	/* Send I2C_NO_STOP unless requested to stop. */
+	ret = smu_v11_0_i2c_transmit(control, msg->addr, msg->buf, msg->len, ((msg->flags & I2C_M_STOP) ? 0 : I2C_NO_STOP));
 
 	if (ret != I2C_OK)
 		DRM_ERROR("WriteI2CData() - I2C error occurred :%x", ret);
-	else
-		/*
-		 * According to EEPROM spec there is a MAX of 10 ms required for
-		 * EEPROM to flush internal RX buffer after STOP was issued at the
-		 * end of write transaction. During this time the EEPROM will not be
-		 * responsive to any more commands - so wait a bit more.
-		 *
-		 * TODO Improve to wait for first ACK for slave address after
-		 * internal write cycle done.
-		 */
-		msleep(10);
-
+
 	return ret;
 
 }
@@ -618,24 +597,16 @@ static int smu_v11_0_i2c_xfer(struct i2c_adapter *i2c_adap,
 			      struct i2c_msg *msgs, int num)
 {
 	int i, ret;
-	struct amdgpu_device *adev = to_amdgpu_device(i2c_adap);
-
-	if (!adev->pm.bus_locked) {
-		DRM_ERROR("I2C bus unlocked, stopping transaction!");
-		return -EIO;
-	}
 
 	smu_v11_0_i2c_init(i2c_adap);
 
 	for (i = 0; i < num; i++) {
 		if (msgs[i].flags & I2C_M_RD)
 			ret = smu_v11_0_i2c_read_data(i2c_adap,
-						      (uint8_t)msgs[i].addr,
-						      msgs[i].buf, msgs[i].len);
+						      msgs + i);
 		else
 			ret = smu_v11_0_i2c_write_data(i2c_adap,
-						       (uint8_t)msgs[i].addr,
-						       msgs[i].buf, msgs[i].len);
+						       msgs + i);
 
 		if (ret != I2C_OK) {
 			num = -EIO;
-- 
2.32.0


* [PATCH 11/40] drm/amdgpu: only set restart on first cmd of the smu i2c transaction
  2021-06-08 21:39 [PATCH 00/40] I2C fixes Luben Tuikov
                   ` (9 preceding siblings ...)
  2021-06-08 21:39 ` [PATCH 10/40] drm/amdgpu: rework smu11 i2c for generic operation Luben Tuikov
@ 2021-06-08 21:39 ` Luben Tuikov
  2021-06-08 21:39 ` [PATCH 12/40] drm/amdgpu: Remember to wait 10ms for write buffer flush v2 Luben Tuikov
                   ` (28 subsequent siblings)
  39 siblings, 0 replies; 74+ messages in thread
From: Luben Tuikov @ 2021-06-08 21:39 UTC (permalink / raw)
  To: amd-gfx; +Cc: Alex Deucher, Luben Tuikov

From: Alex Deucher <alexander.deucher@amd.com>

Not sure how the firmware interprets these.
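
For readers following the one-line change in each file, here is a
minimal sketch of the per-byte decision after this patch. It is plain
userspace C. The SKETCH_* flag and mask values are placeholders chosen
for illustration, not the real kernel or firmware constants, and
smu_cmd_config is a hypothetical name. It also assumes that
remaining_bytes in the diff reaches zero on the last byte queued.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder values, for illustration only. */
#define SKETCH_I2C_M_STOP		0x1u
#define SKETCH_I2C_M_NOSTART		0x2u
#define SKETCH_CMDCONFIG_STOP		0x1u
#define SKETCH_CMDCONFIG_RESTART	0x2u

/*
 * i:         index of the message within the transfer
 * j:         index of the byte within message i
 * msg_flags: flags of message i
 * last_byte: true when no bytes remain after this one
 */
static uint8_t smu_cmd_config(int i, int j, uint16_t msg_flags,
			      bool last_byte)
{
	uint8_t cfg = 0;

	if ((msg_flags & SKETCH_I2C_M_STOP) || last_byte)
		cfg |= SKETCH_CMDCONFIG_STOP;
	/* RESTART only on the first byte of the second and later messages. */
	if (i > 0 && j == 0 && !(msg_flags & SKETCH_I2C_M_NOSTART))
		cfg |= SKETCH_CMDCONFIG_RESTART;

	return cfg;
}
```

Before this change the j == 0 test was absent, so every byte of the
second and later messages carried the RESTART bit.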

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
---
 drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c       | 2 +-
 drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c         | 2 +-
 drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c
index a8249ee354572c..73e261260b76e6 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c
@@ -1954,7 +1954,7 @@ static int arcturus_i2c_xfer(struct i2c_adapter *i2c_adap,
 			if ((msg[i].flags & I2C_M_STOP) ||
 			    (!remaining_bytes))
 				cmd->CmdConfig |= CMDCONFIG_STOP_MASK;
-			if ((i > 0) && !(msg[i].flags & I2C_M_NOSTART))
+			if ((i > 0) && (j == 0) && !(msg[i].flags & I2C_M_NOSTART))
 				cmd->CmdConfig |= CMDCONFIG_RESTART_BIT;
 		}
 	}
diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
index 644b4821220ede..5dc48e557c2bad 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
@@ -2749,7 +2749,7 @@ static int navi10_i2c_xfer(struct i2c_adapter *i2c_adap,
 			if ((msg[i].flags & I2C_M_STOP) ||
 			    (!remaining_bytes))
 				cmd->CmdConfig |= CMDCONFIG_STOP_MASK;
-			if ((i > 0) && !(msg[i].flags & I2C_M_NOSTART))
+			if ((i > 0) && (j == 0) && !(msg[i].flags & I2C_M_NOSTART))
 				cmd->CmdConfig |= CMDCONFIG_RESTART_BIT;
 		}
 	}
diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
index 10eb7d6f48fcac..fdbc54622dbfbf 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
@@ -3437,7 +3437,7 @@ static int sienna_cichlid_i2c_xfer(struct i2c_adapter *i2c_adap,
 			if ((msg[i].flags & I2C_M_STOP) ||
 			    (!remaining_bytes))
 				cmd->CmdConfig |= CMDCONFIG_STOP_MASK;
-			if ((i > 0) && !(msg[i].flags & I2C_M_NOSTART))
+			if ((i > 0) && (j == 0) && !(msg[i].flags & I2C_M_NOSTART))
 				cmd->CmdConfig |= CMDCONFIG_RESTART_BIT;
 		}
 	}
-- 
2.32.0


* [PATCH 12/40] drm/amdgpu: Remember to wait 10ms for write buffer flush v2
  2021-06-08 21:39 [PATCH 00/40] I2C fixes Luben Tuikov
                   ` (10 preceding siblings ...)
  2021-06-08 21:39 ` [PATCH 11/40] drm/amdgpu: only set restart on first cmd of the smu i2c transaction Luben Tuikov
@ 2021-06-08 21:39 ` Luben Tuikov
  2021-06-08 21:39 ` [PATCH 13/40] drm/amdgpu: Add RESTART handling also to smu_v11_0_i2c (VG20) Luben Tuikov
                   ` (27 subsequent siblings)
  39 siblings, 0 replies; 74+ messages in thread
From: Luben Tuikov @ 2021-06-08 21:39 UTC (permalink / raw)
  To: amd-gfx; +Cc: Alex Deucher, Andrey Grodzovsky, Luben Tuikov

From: Andrey Grodzovsky <andrey.grodzovsky@amd.com>

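The EEPROM spec allows up to 10 ms for the device to flush its internal
buffer after the STOP that ends a write; it does not respond to
commands during that time.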

v2: Only to be done for write data transactions.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.c
index 10551660343278..fe0e9b0c4d5a38 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.c
@@ -55,6 +55,21 @@ int amdgpu_eeprom_xfer(struct i2c_adapter *i2c_adap,
 		r = i2c_transfer(i2c_adap, msgs, ARRAY_SIZE(msgs));
 		if (r <= 0)
 			return r;
+
+		/* Only for write data */
+		if (!msgs[1].flags)
+			/*
+			 * According to EEPROM spec there is a MAX of 10 ms required for
+			 * EEPROM to flush internal RX buffer after STOP was issued at the
+			 * end of write transaction. During this time the EEPROM will not be
+			 * responsive to any more commands - so wait a bit more.
+			 *
+			 * TODO Improve to wait for first ACK for slave address after
+			 * internal write cycle done.
+			 */
+			msleep(10);
+
+
 		bytes_transferred = r - EEPROM_OFFSET_LENGTH;
 		eeprom_addr += bytes_transferred;
 		msgs[0].buf[0] = ((eeprom_addr >> 8) & 0xff);
-- 
2.32.0
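
The delay rule this patch adds can be shown in isolation. What follows is
only a userspace sketch, not the driver code: the names prefixed with
sketch_/SKETCH_ are hypothetical, and the 10 ms figure is taken from the
comment in the patch. The patch itself tests for flags == 0 on the data
message, whereas the sketch tests the read bit directly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Same value as the kernel's I2C_M_RD in include/uapi/linux/i2c.h. */
#define SKETCH_I2C_M_RD 0x0001

/* Hypothetical constant: worst-case write cycle quoted in the patch comment. */
#define SKETCH_EEPROM_WRITE_CYCLE_MS 10

/*
 * Return how long to wait after a data message with the given flags.
 * Only writes start the EEPROM's internal write cycle; reads need no delay.
 */
static unsigned int sketch_post_xfer_delay_ms(uint16_t data_msg_flags)
{
	bool is_read = (data_msg_flags & SKETCH_I2C_M_RD) != 0;

	return is_read ? 0 : SKETCH_EEPROM_WRITE_CYCLE_MS;
}
```

A fixed sleep is the simplest option. The TODO in the patch points at the
alternative of polling for the first ACK of the slave address, which would
avoid waiting the full worst case on every write.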


* [PATCH 13/40] dmr/amdgpu: Add RESTART handling also to smu_v11_0_i2c (VG20)
  2021-06-08 21:39 [PATCH 00/40] I2C fixes Luben Tuikov
                   ` (11 preceding siblings ...)
  2021-06-08 21:39 ` [PATCH 12/40] drm/amdgpu: Remember to wait 10ms for write buffer flush v2 Luben Tuikov
@ 2021-06-08 21:39 ` Luben Tuikov
  2021-06-10 20:18   ` Alex Deucher
  2021-06-08 21:39 ` [PATCH 14/40] drm/amdgpu: Drop i > 0 restriction for issuing RESTART Luben Tuikov
                   ` (26 subsequent siblings)
  39 siblings, 1 reply; 74+ messages in thread
From: Luben Tuikov @ 2021-06-08 21:39 UTC (permalink / raw)
  To: amd-gfx
  Cc: Andrey Grodzovsky, Lijo Lazar, Luben Tuikov, Stanley Yang,
	Alexander Deucher, Jean Delvare, Hawking Zhang

From: Andrey Grodzovsky <andrey.grodzovsky@amd.com>

Also generalize the code to accept any relevant I2C flags and
translate them to HW bits, for both read and write.

Cc: Jean Delvare <jdelvare@suse.de>
Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Cc: Lijo Lazar <Lijo.Lazar@amd.com>
Cc: Stanley Yang <Stanley.Yang@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/smu_v11_0_i2c.c | 22 ++++++++++++----------
 1 file changed, 12 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/smu_v11_0_i2c.c b/drivers/gpu/drm/amd/amdgpu/smu_v11_0_i2c.c
index 3193d566f4f87e..5a90d9351b22eb 100644
--- a/drivers/gpu/drm/amd/amdgpu/smu_v11_0_i2c.c
+++ b/drivers/gpu/drm/amd/amdgpu/smu_v11_0_i2c.c
@@ -530,13 +530,11 @@ static bool smu_v11_0_i2c_bus_unlock(struct i2c_adapter *control)
 /***************************** I2C GLUE ****************************/
 
 static uint32_t smu_v11_0_i2c_read_data(struct i2c_adapter *control,
-					struct i2c_msg *msg)
+					struct i2c_msg *msg, uint32_t i2c_flag)
 {
-	uint32_t  ret = 0;
+	uint32_t  ret;
 
-	/* Now read data starting with that address */
-	ret = smu_v11_0_i2c_receive(control, msg->addr, msg->buf, msg->len,
-				    I2C_RESTART);
+	ret = smu_v11_0_i2c_receive(control, msg->addr, msg->buf, msg->len, i2c_flag);
 
 	if (ret != I2C_OK)
 		DRM_ERROR("ReadData() - I2C error occurred :%x", ret);
@@ -545,12 +543,11 @@ static uint32_t smu_v11_0_i2c_read_data(struct i2c_adapter *control,
 }
 
 static uint32_t smu_v11_0_i2c_write_data(struct i2c_adapter *control,
-					struct i2c_msg *msg)
+					struct i2c_msg *msg, uint32_t i2c_flag)
 {
 	uint32_t  ret;
 
-	/* Send I2C_NO_STOP unless requested to stop. */
-	ret = smu_v11_0_i2c_transmit(control, msg->addr, msg->buf, msg->len, ((msg->flags & I2C_M_STOP) ? 0 : I2C_NO_STOP));
+	ret = smu_v11_0_i2c_transmit(control, msg->addr, msg->buf, msg->len, i2c_flag);
 
 	if (ret != I2C_OK)
 		DRM_ERROR("WriteI2CData() - I2C error occurred :%x", ret);
@@ -601,12 +598,17 @@ static int smu_v11_0_i2c_xfer(struct i2c_adapter *i2c_adap,
 	smu_v11_0_i2c_init(i2c_adap);
 
 	for (i = 0; i < num; i++) {
+		uint32_t i2c_flag = ((msgs[i].flags & I2C_M_NOSTART) ? 0 : I2C_RESTART) |
+				    (((msgs[i].flags & I2C_M_STOP) ? 0 : I2C_NO_STOP));
+
 		if (msgs[i].flags & I2C_M_RD)
 			ret = smu_v11_0_i2c_read_data(i2c_adap,
-						      msgs + i);
+						      msgs + i,
+						      i2c_flag);
 		else
 			ret = smu_v11_0_i2c_write_data(i2c_adap,
-						       msgs + i);
+						       msgs + i,
+						       i2c_flag);
 
 		if (ret != I2C_OK) {
 			num = -EIO;
-- 
2.32.0
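
The flag translation in this patch can be tried out on its own. The sketch
below assumes the two terms are meant to be combined as bit flags, so it
joins them with bitwise `|`; with logical `||` the result could only be 0
or 1, and the I2C_RESTART bit (value 2) would never reach the callee. The
SKETCH_ names are hypothetical stand-ins. The I2C_M_* values match the
kernel UAPI header, and the two driver flags use the values defined in
smu_v11_0_i2c.c at this point in the series.

```c
#include <assert.h>
#include <stdint.h>

/* Values match include/uapi/linux/i2c.h. */
#define SKETCH_I2C_M_NOSTART 0x4000
#define SKETCH_I2C_M_STOP    0x8000

/* Driver-private transaction flags (I2C_NO_STOP = 1, I2C_RESTART = 2). */
#define SKETCH_I2C_NO_STOP 1
#define SKETCH_I2C_RESTART 2

/* Translate kernel message flags into the driver's transaction flags. */
static uint32_t sketch_translate_flags(uint16_t msg_flags)
{
	uint32_t restart = (msg_flags & SKETCH_I2C_M_NOSTART) ? 0 : SKETCH_I2C_RESTART;
	uint32_t no_stop = (msg_flags & SKETCH_I2C_M_STOP) ? 0 : SKETCH_I2C_NO_STOP;

	/* Bitwise OR keeps both bits independent of each other. */
	return restart | no_stop;
}
```

A message with no flags set therefore asks for a RESTART and no STOP,
while one carrying both I2C_M_NOSTART and I2C_M_STOP asks for neither
RESTART nor NO_STOP.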


* [PATCH 14/40] drm/amdgpu: Drop i > 0 restriction for issuing RESTART
  2021-06-08 21:39 [PATCH 00/40] I2C fixes Luben Tuikov
                   ` (12 preceding siblings ...)
  2021-06-08 21:39 ` [PATCH 13/40] dmr/amdgpu: Add RESTART handling also to smu_v11_0_i2c (VG20) Luben Tuikov
@ 2021-06-08 21:39 ` Luben Tuikov
  2021-06-10 20:21   ` Alex Deucher
  2021-06-08 21:39 ` [PATCH 15/40] drm/amdgpu: Send STOP for the last byte of msg only Luben Tuikov
                   ` (25 subsequent siblings)
  39 siblings, 1 reply; 74+ messages in thread
From: Luben Tuikov @ 2021-06-08 21:39 UTC (permalink / raw)
  To: amd-gfx
  Cc: Andrey Grodzovsky, Lijo Lazar, Luben Tuikov, Stanley Yang,
	Alexander Deucher, Jean Delvare, Hawking Zhang

From: Andrey Grodzovsky <andrey.grodzovsky@amd.com>

Cc: Jean Delvare <jdelvare@suse.de>
Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Cc: Lijo Lazar <Lijo.Lazar@amd.com>
Cc: Stanley Yang <Stanley.Yang@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
---
 drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c       | 2 +-
 drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c         | 2 +-
 drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c
index 73e261260b76e6..72b02025b07e06 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c
@@ -1954,7 +1954,7 @@ static int arcturus_i2c_xfer(struct i2c_adapter *i2c_adap,
 			if ((msg[i].flags & I2C_M_STOP) ||
 			    (!remaining_bytes))
 				cmd->CmdConfig |= CMDCONFIG_STOP_MASK;
-			if ((i > 0) && (j == 0) && !(msg[i].flags & I2C_M_NOSTART))
+			if ((j == 0) && !(msg[i].flags & I2C_M_NOSTART))
 				cmd->CmdConfig |= CMDCONFIG_RESTART_BIT;
 		}
 	}
diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
index 5dc48e557c2bad..289d09a5d711b9 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
@@ -2749,7 +2749,7 @@ static int navi10_i2c_xfer(struct i2c_adapter *i2c_adap,
 			if ((msg[i].flags & I2C_M_STOP) ||
 			    (!remaining_bytes))
 				cmd->CmdConfig |= CMDCONFIG_STOP_MASK;
-			if ((i > 0) && (j == 0) && !(msg[i].flags & I2C_M_NOSTART))
+			if ((j == 0) && !(msg[i].flags & I2C_M_NOSTART))
 				cmd->CmdConfig |= CMDCONFIG_RESTART_BIT;
 		}
 	}
diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
index fdbc54622dbfbf..e8e57462ce9d64 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
@@ -3437,7 +3437,7 @@ static int sienna_cichlid_i2c_xfer(struct i2c_adapter *i2c_adap,
 			if ((msg[i].flags & I2C_M_STOP) ||
 			    (!remaining_bytes))
 				cmd->CmdConfig |= CMDCONFIG_STOP_MASK;
-			if ((i > 0) && (j == 0) && !(msg[i].flags & I2C_M_NOSTART))
+			if ((j == 0) && !(msg[i].flags & I2C_M_NOSTART))
 				cmd->CmdConfig |= CMDCONFIG_RESTART_BIT;
 		}
 	}
-- 
2.32.0
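
With the i > 0 test gone, the RESTART decision depends only on the byte
position inside a message and on I2C_M_NOSTART. A minimal sketch of that
predicate follows; sketch_wants_restart is a hypothetical helper, not a
function in the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Same value as the kernel's I2C_M_NOSTART in include/uapi/linux/i2c.h. */
#define SKETCH_I2C_M_NOSTART 0x4000

/*
 * j is the byte index inside the current message.
 * The RESTART bit is requested on the first byte of every message,
 * message 0 included, unless the client set I2C_M_NOSTART.
 */
static bool sketch_wants_restart(int j, uint16_t msg_flags)
{
	assert(j >= 0);
	return j == 0 && !(msg_flags & SKETCH_I2C_M_NOSTART);
}
```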


* [PATCH 15/40] drm/amdgpu: Send STOP for the last byte of msg only
  2021-06-08 21:39 [PATCH 00/40] I2C fixes Luben Tuikov
                   ` (13 preceding siblings ...)
  2021-06-08 21:39 ` [PATCH 14/40] drm/amdgpu: Drop i > 0 restriction for issuing RESTART Luben Tuikov
@ 2021-06-08 21:39 ` Luben Tuikov
  2021-06-10 20:22   ` Alex Deucher
  2021-06-08 21:39 ` [PATCH 16/40] drm/amd/pm: SMU I2C: Return number of messages processed Luben Tuikov
                   ` (24 subsequent siblings)
  39 siblings, 1 reply; 74+ messages in thread
From: Luben Tuikov @ 2021-06-08 21:39 UTC (permalink / raw)
  To: amd-gfx
  Cc: Andrey Grodzovsky, Lijo Lazar, Luben Tuikov, Stanley Yang,
	Alexander Deucher, Jean Delvare, Hawking Zhang

From: Andrey Grodzovsky <andrey.grodzovsky@amd.com>

Ignore the I2C_M_STOP hint from the upper layer in
the SMU I2C code, as there is no clean mapping
between the kernel I2C layer's single per-message
STOP flag and the SMU's per-byte STOP flag. By
default, set STOP only at the end of the SMU I2C
message.

Cc: Jean Delvare <jdelvare@suse.de>
Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Cc: Lijo Lazar <Lijo.Lazar@amd.com>
Cc: Stanley Yang <Stanley.Yang@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Suggested-by: Lazar Lijo <Lijo.Lazar@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
---
 drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c       | 4 ++--
 drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c         | 4 ++--
 drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c | 4 ++--
 3 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c
index 72b02025b07e06..235e83e9f0feb7 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c
@@ -1951,9 +1951,9 @@ static int arcturus_i2c_xfer(struct i2c_adapter *i2c_adap,
 				cmd->CmdConfig |= I2C_CMD_WRITE;
 				cmd->RegisterAddr = msg->buf[j];
 			}
-			if ((msg[i].flags & I2C_M_STOP) ||
-			    (!remaining_bytes))
+			if (!remaining_bytes)
 				cmd->CmdConfig |= CMDCONFIG_STOP_MASK;
+
 			if ((j == 0) && !(msg[i].flags & I2C_M_NOSTART))
 				cmd->CmdConfig |= CMDCONFIG_RESTART_BIT;
 		}
diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
index 289d09a5d711b9..b94c5a1d3eb756 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
@@ -2746,9 +2746,9 @@ static int navi10_i2c_xfer(struct i2c_adapter *i2c_adap,
 				cmd->CmdConfig |= I2C_CMD_WRITE;
 				cmd->RegisterAddr = msg->buf[j];
 			}
-			if ((msg[i].flags & I2C_M_STOP) ||
-			    (!remaining_bytes))
+			if (!remaining_bytes)
 				cmd->CmdConfig |= CMDCONFIG_STOP_MASK;
+
 			if ((j == 0) && !(msg[i].flags & I2C_M_NOSTART))
 				cmd->CmdConfig |= CMDCONFIG_RESTART_BIT;
 		}
diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
index e8e57462ce9d64..2fa667a86c1a54 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
@@ -3434,9 +3434,9 @@ static int sienna_cichlid_i2c_xfer(struct i2c_adapter *i2c_adap,
 				cmd->CmdConfig |= CMDCONFIG_READWRITE_MASK;
 				cmd->ReadWriteData = msg->buf[j];
 			}
-			if ((msg[i].flags & I2C_M_STOP) ||
-			    (!remaining_bytes))
+			if (!remaining_bytes)
 				cmd->CmdConfig |= CMDCONFIG_STOP_MASK;
+
 			if ((j == 0) && !(msg[i].flags & I2C_M_NOSTART))
 				cmd->CmdConfig |= CMDCONFIG_RESTART_BIT;
 		}
-- 
2.32.0
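
The effect of this change on the per-byte command stream can be sketched
as follows. This is a userspace model under stated assumptions, not the
driver code: sketch_mark_stop and its parameters are hypothetical, and
each byte of the transaction is represented by one entry in stop[].

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Fill stop[] with one entry per byte of the transaction: 1 where a STOP
 * is attached, 0 elsewhere. msg_len[] holds the length of each message;
 * stop[] must have room for the sum of all lengths (stop_cap entries).
 * Returns the total number of bytes.
 */
static size_t sketch_mark_stop(const uint16_t *msg_len, size_t num_msgs,
			       uint8_t *stop, size_t stop_cap)
{
	size_t total = 0, remaining, c = 0;

	for (size_t i = 0; i < num_msgs; i++)
		total += msg_len[i];
	assert(total <= stop_cap);

	remaining = total;
	for (size_t i = 0; i < num_msgs; i++) {
		for (size_t j = 0; j < msg_len[i]; j++) {
			remaining--;
			/* Per-message I2C_M_STOP hints are not consulted here. */
			stop[c++] = (remaining == 0);
		}
	}
	return total;
}
```

For two messages of 2 and 3 bytes, only the fifth byte carries a STOP.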


* [PATCH 16/40] drm/amd/pm: SMU I2C: Return number of messages processed
  2021-06-08 21:39 [PATCH 00/40] I2C fixes Luben Tuikov
                   ` (14 preceding siblings ...)
  2021-06-08 21:39 ` [PATCH 15/40] drm/amdgpu: Send STOP for the last byte of msg only Luben Tuikov
@ 2021-06-08 21:39 ` Luben Tuikov
  2021-06-10 20:25   ` Alex Deucher
  2021-06-08 21:39 ` [PATCH 17/40] drm/amdgpu/pm: ADD I2C quirk adapter table Luben Tuikov
                   ` (23 subsequent siblings)
  39 siblings, 1 reply; 74+ messages in thread
From: Luben Tuikov @ 2021-06-08 21:39 UTC (permalink / raw)
  To: amd-gfx
  Cc: Andrey Grodzovsky, Lijo Lazar, Luben Tuikov, Stanley Yang,
	Alexander Deucher, Jean Delvare, Hawking Zhang

From: Andrey Grodzovsky <andrey.grodzovsky@amd.com>

Fix the return value: return the number of processed
I2C messages rather than the number of processed bytes.

Cc: Jean Delvare <jdelvare@suse.de>
Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Cc: Lijo Lazar <Lijo.Lazar@amd.com>
Cc: Stanley Yang <Stanley.Yang@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
---
 .../gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c | 43 +++++++++++--------
 .../gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c   | 43 +++++++++++--------
 .../amd/pm/swsmu/smu11/sienna_cichlid_ppt.c   | 43 +++++++++++--------
 3 files changed, 75 insertions(+), 54 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c
index 235e83e9f0feb7..409299a608e1b3 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c
@@ -1913,9 +1913,8 @@ static int arcturus_i2c_xfer(struct i2c_adapter *i2c_adap,
 	struct smu_table_context *smu_table = &adev->smu.smu_table;
 	struct smu_table *table = &smu_table->driver_table;
 	SwI2cRequest_t *req, *res = (SwI2cRequest_t *)table->cpu_addr;
-	u16 bytes_to_transfer, remaining_bytes, msg_bytes;
-	u16 available_bytes = MAX_SW_I2C_COMMANDS;
-	int i, j, r, c;
+	short available_bytes = MAX_SW_I2C_COMMANDS;
+	int i, j, r, c, num_done = 0;
 	u8 slave;
 
 	/* only support a single slave addr per transaction */
@@ -1923,8 +1922,15 @@ static int arcturus_i2c_xfer(struct i2c_adapter *i2c_adap,
 	for (i = 0; i < num; i++) {
 		if (slave != msgs[i].addr)
 			return -EINVAL;
-		bytes_to_transfer += min(msgs[i].len, available_bytes);
-		available_bytes -= bytes_to_transfer;
+
+		available_bytes -= msgs[i].len;
+		if (available_bytes >= 0) {
+			num_done++;
+		} else {
+			/* This message and all the following won't be processed */
+			available_bytes += msgs[i].len;
+			break;
+		}
 	}
 
 	req = kzalloc(sizeof(*req), GFP_KERNEL);
@@ -1934,24 +1940,28 @@ static int arcturus_i2c_xfer(struct i2c_adapter *i2c_adap,
 	req->I2CcontrollerPort = 1;
 	req->I2CSpeed = I2C_SPEED_FAST_400K;
 	req->SlaveAddress = slave << 1; /* 8 bit addresses */
-	req->NumCmds = bytes_to_transfer;
+	req->NumCmds = MAX_SW_I2C_COMMANDS - available_bytes;
 
-	remaining_bytes = bytes_to_transfer;
 	c = 0;
-	for (i = 0; i < num; i++) {
+	for (i = 0; i < num_done; i++) {
 		struct i2c_msg *msg = &msgs[i];
 
-		msg_bytes = min(msg->len, remaining_bytes);
-		for (j = 0; j < msg_bytes; j++) {
+		for (j = 0; j < msg->len; j++) {
 			SwI2cCmd_t *cmd = &req->SwI2cCmds[c++];
 
-			remaining_bytes--;
 			if (!(msg[i].flags & I2C_M_RD)) {
 				/* write */
 				cmd->CmdConfig |= I2C_CMD_WRITE;
 				cmd->RegisterAddr = msg->buf[j];
 			}
-			if (!remaining_bytes)
+
+			/*
+			 * Insert STOP if we are at the last byte of either last
+			 * message for the transaction or the client explicitly
+			 * requires a STOP at this particular message.
+			 */
+			if ((j == msg->len - 1) &&
+			    ((i == num_done - 1) || (msg[i].flags & I2C_M_STOP)))
 				cmd->CmdConfig |= CMDCONFIG_STOP_MASK;
 
 			if ((j == 0) && !(msg[i].flags & I2C_M_NOSTART))
@@ -1964,21 +1974,18 @@ static int arcturus_i2c_xfer(struct i2c_adapter *i2c_adap,
 	if (r)
 		goto fail;
 
-	remaining_bytes = bytes_to_transfer;
 	c = 0;
-	for (i = 0; i < num; i++) {
+	for (i = 0; i < num_done; i++) {
 		struct i2c_msg *msg = &msgs[i];
 
-		msg_bytes = min(msg->len, remaining_bytes);
-		for (j = 0; j < msg_bytes; j++) {
+		for (j = 0; j < msg->len; j++) {
 			SwI2cCmd_t *cmd = &res->SwI2cCmds[c++];
 
-			remaining_bytes--;
 			if (msg[i].flags & I2C_M_RD)
 				msg->buf[j] = cmd->Data;
 		}
 	}
-	r = bytes_to_transfer;
+	r = num_done;
 
 fail:
 	kfree(req);
diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
index b94c5a1d3eb756..4010b891f25678 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
@@ -2708,9 +2708,8 @@ static int navi10_i2c_xfer(struct i2c_adapter *i2c_adap,
 	struct smu_table_context *smu_table = &adev->smu.smu_table;
 	struct smu_table *table = &smu_table->driver_table;
 	SwI2cRequest_t *req, *res = (SwI2cRequest_t *)table->cpu_addr;
-	u16 bytes_to_transfer, remaining_bytes, msg_bytes;
-	u16 available_bytes = MAX_SW_I2C_COMMANDS;
-	int i, j, r, c;
+	short available_bytes = MAX_SW_I2C_COMMANDS;
+	int i, j, r, c, num_done = 0;
 	u8 slave;
 
 	/* only support a single slave addr per transaction */
@@ -2718,8 +2717,15 @@ static int navi10_i2c_xfer(struct i2c_adapter *i2c_adap,
 	for (i = 0; i < num; i++) {
 		if (slave != msgs[i].addr)
 			return -EINVAL;
-		bytes_to_transfer += min(msgs[i].len, available_bytes);
-		available_bytes -= bytes_to_transfer;
+
+		available_bytes -= msgs[i].len;
+		if (available_bytes >= 0) {
+			num_done++;
+		} else {
+			/* This message and all the following won't be processed */
+			available_bytes += msgs[i].len;
+			break;
+		}
 	}
 
 	req = kzalloc(sizeof(*req), GFP_KERNEL);
@@ -2729,24 +2735,28 @@ static int navi10_i2c_xfer(struct i2c_adapter *i2c_adap,
 	req->I2CcontrollerPort = 1;
 	req->I2CSpeed = I2C_SPEED_FAST_400K;
 	req->SlaveAddress = slave << 1; /* 8 bit addresses */
-	req->NumCmds = bytes_to_transfer;
+	req->NumCmds = MAX_SW_I2C_COMMANDS - available_bytes;
 
-	remaining_bytes = bytes_to_transfer;
 	c = 0;
-	for (i = 0; i < num; i++) {
+	for (i = 0; i < num_done; i++) {
 		struct i2c_msg *msg = &msgs[i];
 
-		msg_bytes = min(msg->len, remaining_bytes);
-		for (j = 0; j < msg_bytes; j++) {
+		for (j = 0; j < msg->len; j++) {
 			SwI2cCmd_t *cmd = &req->SwI2cCmds[c++];
 
-			remaining_bytes--;
 			if (!(msg[i].flags & I2C_M_RD)) {
 				/* write */
 				cmd->CmdConfig |= I2C_CMD_WRITE;
 				cmd->RegisterAddr = msg->buf[j];
 			}
-			if (!remaining_bytes)
+
+			/*
+			 * Insert STOP if we are at the last byte of either last
+			 * message for the transaction or the client explicitly
+			 * requires a STOP at this particular message.
+			 */
+			if ((j == msg->len - 1) &&
+			    ((i == num_done - 1) || (msg[i].flags & I2C_M_STOP)))
 				cmd->CmdConfig |= CMDCONFIG_STOP_MASK;
 
 			if ((j == 0) && !(msg[i].flags & I2C_M_NOSTART))
@@ -2759,21 +2769,18 @@ static int navi10_i2c_xfer(struct i2c_adapter *i2c_adap,
 	if (r)
 		goto fail;
 
-	remaining_bytes = bytes_to_transfer;
 	c = 0;
-	for (i = 0; i < num; i++) {
+	for (i = 0; i < num_done; i++) {
 		struct i2c_msg *msg = &msgs[i];
 
-		msg_bytes = min(msg->len, remaining_bytes);
-		for (j = 0; j < msg_bytes; j++) {
+		for (j = 0; j < msg->len; j++) {
 			SwI2cCmd_t *cmd = &res->SwI2cCmds[c++];
 
-			remaining_bytes--;
 			if (msg[i].flags & I2C_M_RD)
 				msg->buf[j] = cmd->Data;
 		}
 	}
-	r = bytes_to_transfer;
+	r = num_done;
 
 fail:
 	kfree(req);
diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
index 2fa667a86c1a54..d5b750d84112fa 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
@@ -3396,9 +3396,8 @@ static int sienna_cichlid_i2c_xfer(struct i2c_adapter *i2c_adap,
 	struct smu_table_context *smu_table = &adev->smu.smu_table;
 	struct smu_table *table = &smu_table->driver_table;
 	SwI2cRequest_t *req, *res = (SwI2cRequest_t *)table->cpu_addr;
-	u16 bytes_to_transfer, remaining_bytes, msg_bytes;
-	u16 available_bytes = MAX_SW_I2C_COMMANDS;
-	int i, j, r, c;
+	short available_bytes = MAX_SW_I2C_COMMANDS;
+	int i, j, r, c, num_done = 0;
 	u8 slave;
 
 	/* only support a single slave addr per transaction */
@@ -3406,8 +3405,15 @@ static int sienna_cichlid_i2c_xfer(struct i2c_adapter *i2c_adap,
 	for (i = 0; i < num; i++) {
 		if (slave != msgs[i].addr)
 			return -EINVAL;
-		bytes_to_transfer += min(msgs[i].len, available_bytes);
-		available_bytes -= bytes_to_transfer;
+
+		available_bytes -= msgs[i].len;
+		if (available_bytes >= 0) {
+			num_done++;
+		} else {
+			/* This message and all the following won't be processed */
+			available_bytes += msgs[i].len;
+			break;
+		}
 	}
 
 	req = kzalloc(sizeof(*req), GFP_KERNEL);
@@ -3417,24 +3423,28 @@ static int sienna_cichlid_i2c_xfer(struct i2c_adapter *i2c_adap,
 	req->I2CcontrollerPort = 1;
 	req->I2CSpeed = I2C_SPEED_FAST_400K;
 	req->SlaveAddress = slave << 1; /* 8 bit addresses */
-	req->NumCmds = bytes_to_transfer;
+	req->NumCmds = MAX_SW_I2C_COMMANDS - available_bytes;
 
-	remaining_bytes = bytes_to_transfer;
 	c = 0;
-	for (i = 0; i < num; i++) {
+	for (i = 0; i < num_done; i++) {
 		struct i2c_msg *msg = &msgs[i];
 
-		msg_bytes = min(msg->len, remaining_bytes);
-		for (j = 0; j < msg_bytes; j++) {
+		for (j = 0; j < msg->len; j++) {
 			SwI2cCmd_t *cmd = &req->SwI2cCmds[c++];
 
-			remaining_bytes--;
 			if (!(msg[i].flags & I2C_M_RD)) {
 				/* write */
 				cmd->CmdConfig |= CMDCONFIG_READWRITE_MASK;
 				cmd->ReadWriteData = msg->buf[j];
 			}
-			if (!remaining_bytes)
+
+			/*
+			 * Insert STOP if we are at the last byte of either last
+			 * message for the transaction or the client explicitly
+			 * requires a STOP at this particular message.
+			 */
+			if ((j == msg->len - 1) &&
+			    ((i == num_done - 1) || (msg[i].flags & I2C_M_STOP)))
 				cmd->CmdConfig |= CMDCONFIG_STOP_MASK;
 
 			if ((j == 0) && !(msg[i].flags & I2C_M_NOSTART))
@@ -3447,21 +3457,18 @@ static int sienna_cichlid_i2c_xfer(struct i2c_adapter *i2c_adap,
 	if (r)
 		goto fail;
 
-	remaining_bytes = bytes_to_transfer;
 	c = 0;
-	for (i = 0; i < num; i++) {
+	for (i = 0; i < num_done; i++) {
 		struct i2c_msg *msg = &msgs[i];
 
-		msg_bytes = min(msg->len, remaining_bytes);
-		for (j = 0; j < msg_bytes; j++) {
+		for (j = 0; j < msg->len; j++) {
 			SwI2cCmd_t *cmd = &res->SwI2cCmds[c++];
 
-			remaining_bytes--;
 			if (msg[i].flags & I2C_M_RD)
 				msg->buf[j] = cmd->ReadWriteData;
 		}
 	}
-	r = bytes_to_transfer;
+	r = num_done;
 
 fail:
 	kfree(req);
-- 
2.32.0
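
The accounting introduced here, counting whole messages that fit in the
command budget and returning that count, can be sketched in userspace as
below. The kernel convention for an I2C transfer callback is to return
the number of messages processed, which is what num_done models. The
names are hypothetical, and SKETCH_MAX_CMDS is an assumed budget; the
real MAX_SW_I2C_COMMANDS comes from the SMU headers. Unlike the patch,
the sketch compares before subtracting and uses int throughout, so the
remaining budget never goes negative.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed budget for illustration only. */
#define SKETCH_MAX_CMDS 24

struct sketch_plan {
	int num_done;	/* whole messages that fit in the budget */
	int num_cmds;	/* bytes (one command each) those messages occupy */
};

/* Decide how many leading messages can be sent in one request. */
static struct sketch_plan sketch_plan_xfer(const uint16_t *msg_len, int num)
{
	struct sketch_plan p = { 0, 0 };
	int available = SKETCH_MAX_CMDS;

	for (int i = 0; i < num; i++) {
		if (msg_len[i] > available)
			break;	/* this message and all following are skipped */
		available -= msg_len[i];
		p.num_done++;
	}
	p.num_cmds = SKETCH_MAX_CMDS - available;
	return p;
}
```

With message lengths 2, 20, 4 and 1, the first two messages fit (22
commands) and the third does not, so the caller would see 2 returned and
could resubmit the rest.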


* [PATCH 17/40] drm/amdgpu/pm: ADD I2C quirk adapter table
  2021-06-08 21:39 [PATCH 00/40] I2C fixes Luben Tuikov
                   ` (15 preceding siblings ...)
  2021-06-08 21:39 ` [PATCH 16/40] drm/amd/pm: SMU I2C: Return number of messages processed Luben Tuikov
@ 2021-06-08 21:39 ` Luben Tuikov
  2021-06-10 20:26   ` Alex Deucher
  2021-06-08 21:39 ` [PATCH 18/40] drm/amdgpu: Fix Vega20 I2C to be agnostic (v2) Luben Tuikov
                   ` (22 subsequent siblings)
  39 siblings, 1 reply; 74+ messages in thread
From: Luben Tuikov @ 2021-06-08 21:39 UTC (permalink / raw)
  To: amd-gfx
  Cc: Andrey Grodzovsky, Lijo Lazar, Luben Tuikov, Stanley Yang,
	Alexander Deucher, Jean Delvare, Hawking Zhang

From: Andrey Grodzovsky <andrey.grodzovsky@amd.com>

To be used by kernel clients of the adapter.

Cc: Jean Delvare <jdelvare@suse.de>
Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Cc: Lijo Lazar <Lijo.Lazar@amd.com>
Cc: Stanley Yang <Stanley.Yang@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Suggested-by: Lazar Lijo <Lijo.Lazar@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
---
 drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c       | 7 +++++++
 drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c         | 6 ++++++
 drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c | 6 ++++++
 3 files changed, 19 insertions(+)

diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c
index 409299a608e1b3..c2d6d7c8129593 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c
@@ -2004,6 +2004,12 @@ static const struct i2c_algorithm arcturus_i2c_algo = {
 	.functionality = arcturus_i2c_func,
 };
 
+
+static const struct i2c_adapter_quirks arcturus_i2c_control_quirks = {
+	.max_read_len = MAX_SW_I2C_COMMANDS,
+	.max_write_len = MAX_SW_I2C_COMMANDS,
+};
+
 static int arcturus_i2c_control_init(struct smu_context *smu, struct i2c_adapter *control)
 {
 	struct amdgpu_device *adev = to_amdgpu_device(control);
@@ -2013,6 +2019,7 @@ static int arcturus_i2c_control_init(struct smu_context *smu, struct i2c_adapter
 	control->class = I2C_CLASS_SPD | I2C_CLASS_HWMON;
 	control->dev.parent = &adev->pdev->dev;
 	control->algo = &arcturus_i2c_algo;
+	control->quirks = &arcturus_i2c_control_quirks;
 	snprintf(control->name, sizeof(control->name), "AMDGPU SMU");
 
 	res = i2c_add_adapter(control);
diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
index 4010b891f25678..56000463f64e45 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
@@ -2799,6 +2799,11 @@ static const struct i2c_algorithm navi10_i2c_algo = {
 	.functionality = navi10_i2c_func,
 };
 
+static const struct i2c_adapter_quirks navi10_i2c_control_quirks = {
+	.max_read_len = MAX_SW_I2C_COMMANDS,
+	.max_write_len = MAX_SW_I2C_COMMANDS,
+};
+
 static int navi10_i2c_control_init(struct smu_context *smu, struct i2c_adapter *control)
 {
 	struct amdgpu_device *adev = to_amdgpu_device(control);
@@ -2809,6 +2814,7 @@ static int navi10_i2c_control_init(struct smu_context *smu, struct i2c_adapter *
 	control->dev.parent = &adev->pdev->dev;
 	control->algo = &navi10_i2c_algo;
 	snprintf(control->name, sizeof(control->name), "AMDGPU SMU");
+	control->quirks = &navi10_i2c_control_quirks;
 
 	res = i2c_add_adapter(control);
 	if (res)
diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
index d5b750d84112fa..86804f3b0a951b 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
@@ -3487,6 +3487,11 @@ static const struct i2c_algorithm sienna_cichlid_i2c_algo = {
 	.functionality = sienna_cichlid_i2c_func,
 };
 
+static const struct i2c_adapter_quirks sienna_cichlid_i2c_control_quirks = {
+	.max_read_len = MAX_SW_I2C_COMMANDS,
+	.max_write_len = MAX_SW_I2C_COMMANDS,
+};
+
 static int sienna_cichlid_i2c_control_init(struct smu_context *smu, struct i2c_adapter *control)
 {
 	struct amdgpu_device *adev = to_amdgpu_device(control);
@@ -3497,6 +3502,7 @@ static int sienna_cichlid_i2c_control_init(struct smu_context *smu, struct i2c_a
 	control->dev.parent = &adev->pdev->dev;
 	control->algo = &sienna_cichlid_i2c_algo;
 	snprintf(control->name, sizeof(control->name), "AMDGPU SMU");
+	control->quirks = &sienna_cichlid_i2c_control_quirks;
 
 	res = i2c_add_adapter(control);
 	if (res)
-- 
2.32.0
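
For readers unfamiliar with adapter quirks, the check that the two length
limits enable can be modeled as follows. This is a sketch under stated
assumptions: struct sketch_quirks is a cut-down stand-in for the two
fields the patch fills in, not the kernel's struct i2c_adapter_quirks,
and treating a limit of 0 as "no limit" is an assumption of the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Same value as the kernel's I2C_M_RD in include/uapi/linux/i2c.h. */
#define SKETCH_I2C_M_RD 0x0001

/* Stand-in for the two limits set in the quirk tables above. */
struct sketch_quirks {
	uint16_t max_read_len;
	uint16_t max_write_len;
};

/* Return true when a message of this direction and length is within limits. */
static bool sketch_msg_allowed(const struct sketch_quirks *q,
			       uint16_t flags, uint16_t len)
{
	uint16_t limit;

	assert(q);
	limit = (flags & SKETCH_I2C_M_RD) ? q->max_read_len : q->max_write_len;

	/* A limit of 0 is treated as "no limit" in this sketch. */
	return limit == 0 || len <= limit;
}
```

A client can consult the limits up front and split a long transfer into
chunks, rather than having an oversized message rejected.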


* [PATCH 18/40] drm/amdgpu: Fix Vega20 I2C to be agnostic (v2)
  2021-06-08 21:39 [PATCH 00/40] I2C fixes Luben Tuikov
                   ` (16 preceding siblings ...)
  2021-06-08 21:39 ` [PATCH 17/40] drm/amdgpu/pm: ADD I2C quirk adapter table Luben Tuikov
@ 2021-06-08 21:39 ` Luben Tuikov
  2021-06-10 20:43   ` Alex Deucher
  2021-06-08 21:39 ` [PATCH 19/40] drm/amdgpu: Fixes to the AMDGPU EEPROM driver Luben Tuikov
                   ` (21 subsequent siblings)
  39 siblings, 1 reply; 74+ messages in thread
From: Luben Tuikov @ 2021-06-08 21:39 UTC (permalink / raw)
  To: amd-gfx
  Cc: Andrey Grodzovsky, Lijo Lazar, Luben Tuikov, Stanley Yang,
	Alexander Deucher, Jean Delvare, Hawking Zhang

Teach Vega20 I2C to be agnostic. Allow addressing
different devices while the master holds the bus.
Set STOP as per the controller's specification.

v2: Qualify generating ReSTART before the 1st byte
    of the message, when set by the caller, as
    those functions are separated, as caught by
    Andrey G.

Cc: Jean Delvare <jdelvare@suse.de>
Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Cc: Lijo Lazar <Lijo.Lazar@amd.com>
Cc: Stanley Yang <Stanley.Yang@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.c |   4 +-
 drivers/gpu/drm/amd/amdgpu/smu_v11_0_i2c.c | 105 +++++++++++++--------
 2 files changed, 69 insertions(+), 40 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.c
index fe0e9b0c4d5a38..d02ea083a6c69b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.c
@@ -41,10 +41,10 @@ int amdgpu_eeprom_xfer(struct i2c_adapter *i2c_adap,
 		},
 		{
 			.addr = slave_addr,
-			.flags = read ? I2C_M_RD: 0,
+			.flags = read ? I2C_M_RD : 0,
 			.len = bytes,
 			.buf = eeprom_buf,
-		}
+		},
 	};
 	int r;
 
diff --git a/drivers/gpu/drm/amd/amdgpu/smu_v11_0_i2c.c b/drivers/gpu/drm/amd/amdgpu/smu_v11_0_i2c.c
index 5a90d9351b22eb..b8d6d308fb06a0 100644
--- a/drivers/gpu/drm/amd/amdgpu/smu_v11_0_i2c.c
+++ b/drivers/gpu/drm/amd/amdgpu/smu_v11_0_i2c.c
@@ -41,9 +41,7 @@
 #define I2C_SW_TIMEOUT        8
 #define I2C_ABORT             0x10
 
-/* I2C transaction flags */
-#define I2C_NO_STOP	1
-#define I2C_RESTART	2
+#define I2C_X_RESTART         BIT(31)
 
 #define to_amdgpu_device(x) (container_of(x, struct amdgpu_device, pm.smu_i2c))
 
@@ -205,9 +203,6 @@ static uint32_t smu_v11_0_i2c_poll_rx_status(struct i2c_adapter *control)
 	return ret;
 }
 
-
-
-
 /**
  * smu_v11_0_i2c_transmit - Send a block of data over the I2C bus to a slave device.
  *
@@ -252,21 +247,22 @@ static uint32_t smu_v11_0_i2c_transmit(struct i2c_adapter *control,
 		reg = RREG32_SOC15(SMUIO, 0, mmCKSVII2C_IC_STATUS);
 		if (REG_GET_FIELD(reg, CKSVII2C_IC_STATUS, TFNF)) {
 			do {
-				reg = 0;
-				/*
-				 * Prepare transaction, no need to set RESTART. I2C engine will send
-				 * START as soon as it sees data in TXFIFO
-				 */
-				if (bytes_sent == 0)
-					reg = REG_SET_FIELD(reg, CKSVII2C_IC_DATA_CMD, RESTART,
-							    (i2c_flag & I2C_RESTART) ? 1 : 0);
 				reg = REG_SET_FIELD(reg, CKSVII2C_IC_DATA_CMD, DAT, data[bytes_sent]);
 
-				/* determine if we need to send STOP bit or not */
-				if (numbytes == 1)
-					/* Final transaction, so send stop unless I2C_NO_STOP */
-					reg = REG_SET_FIELD(reg, CKSVII2C_IC_DATA_CMD, STOP,
-							    (i2c_flag & I2C_NO_STOP) ? 0 : 1);
+				/* Final message, final byte, must
+				 * generate a STOP, to release the
+				 * bus, i.e. don't hold SCL low.
+				 */
+				if (numbytes == 1 && i2c_flag & I2C_M_STOP)
+					reg = REG_SET_FIELD(reg,
+							    CKSVII2C_IC_DATA_CMD,
+							    STOP, 1);
+
+				if (bytes_sent == 0 && i2c_flag & I2C_X_RESTART)
+					reg = REG_SET_FIELD(reg,
+							    CKSVII2C_IC_DATA_CMD,
+							    RESTART, 1);
+
 				/* Write */
 				reg = REG_SET_FIELD(reg, CKSVII2C_IC_DATA_CMD, CMD, 0);
 				WREG32_SOC15(SMUIO, 0, mmCKSVII2C_IC_DATA_CMD, reg);
@@ -341,23 +337,21 @@ static uint32_t smu_v11_0_i2c_receive(struct i2c_adapter *control,
 
 		smu_v11_0_i2c_clear_status(control);
 
-
 		/* Prepare transaction */
-
-		/* Each time we disable I2C, so this is not a restart */
-		if (bytes_received == 0)
-			reg = REG_SET_FIELD(reg, CKSVII2C_IC_DATA_CMD, RESTART,
-					    (i2c_flag & I2C_RESTART) ? 1 : 0);
-
 		reg = REG_SET_FIELD(reg, CKSVII2C_IC_DATA_CMD, DAT, 0);
 		/* Read */
 		reg = REG_SET_FIELD(reg, CKSVII2C_IC_DATA_CMD, CMD, 1);
 
-		/* Transmitting last byte */
-		if (numbytes == 1)
-			/* Final transaction, so send stop if requested */
-			reg = REG_SET_FIELD(reg, CKSVII2C_IC_DATA_CMD, STOP,
-					    (i2c_flag & I2C_NO_STOP) ? 0 : 1);
+		/* Final message, final byte, must generate a STOP
+		 * to release the bus, i.e. don't hold SCL low.
+		 */
+		if (numbytes == 1 && i2c_flag & I2C_M_STOP)
+			reg = REG_SET_FIELD(reg, CKSVII2C_IC_DATA_CMD,
+					    STOP, 1);
+
+		if (bytes_received == 0 && i2c_flag & I2C_X_RESTART)
+			reg = REG_SET_FIELD(reg, CKSVII2C_IC_DATA_CMD,
+					    RESTART, 1);
 
 		WREG32_SOC15(SMUIO, 0, mmCKSVII2C_IC_DATA_CMD, reg);
 
@@ -591,23 +585,59 @@ static const struct i2c_lock_operations smu_v11_0_i2c_i2c_lock_ops = {
 };
 
 static int smu_v11_0_i2c_xfer(struct i2c_adapter *i2c_adap,
-			      struct i2c_msg *msgs, int num)
+			      struct i2c_msg *msg, int num)
 {
 	int i, ret;
+	u16 addr, dir;
 
 	smu_v11_0_i2c_init(i2c_adap);
 
+	/* From the client's point of view, this sequence of
+	 * messages-- the array i2c_msg *msg, is a single transaction
+	 * on the bus, starting with START and ending with STOP.
+	 *
+	 * The client is welcome to send any sequence of messages in
+	 * this array, as processing under this function here is
+	 * striving to be agnostic.
+	 *
+	 * Record the first address and direction we see. If either
+	 * changes for a subsequent message, generate ReSTART. The
+	 * DW_apb_i2c databook, v1.21a, specifies that ReSTART is
+	 * generated when the direction changes, with the default IP
+	 * block parameter settings, but it doesn't specify if ReSTART
+	 * is generated when the address changes (possibly...). We
+	 * don't rely on the default IP block parameter settings as
+	 * the block is shared and they may change.
+	 */
+	if (num > 0) {
+		addr = msg[0].addr;
+		dir  = msg[0].flags & I2C_M_RD;
+	}
+
 	for (i = 0; i < num; i++) {
-		uint32_t i2c_flag = ((msgs[i].flags & I2C_M_NOSTART) ? 0 : I2C_RESTART) ||
-				    (((msgs[i].flags & I2C_M_STOP) ? 0 : I2C_NO_STOP));
+		u32 i2c_flag = 0;
 
-		if (msgs[i].flags & I2C_M_RD)
+		if (msg[i].addr != addr || (msg[i].flags ^ dir) & I2C_M_RD) {
+			addr = msg[i].addr;
+			dir  = msg[i].flags & I2C_M_RD;
+			i2c_flag |= I2C_X_RESTART;
+		}
+
+		if (i == num - 1) {
+			/* Set the STOP bit on the last message, so
+			 * that the IP block generates a STOP after
+			 * the last byte of the message.
+			 */
+			i2c_flag |= I2C_M_STOP;
+		}
+
+		if (msg[i].flags & I2C_M_RD)
 			ret = smu_v11_0_i2c_read_data(i2c_adap,
-						      msgs + i,
+						      msg + i,
 						      i2c_flag);
 		else
 			ret = smu_v11_0_i2c_write_data(i2c_adap,
-						       msgs + i,
+						       msg + i,
 						       i2c_flag);
 
 		if (ret != I2C_OK) {
@@ -625,7 +655,6 @@ static u32 smu_v11_0_i2c_func(struct i2c_adapter *adap)
 	return I2C_FUNC_I2C | I2C_FUNC_SMBUS_EMUL;
 }
 
-
 static const struct i2c_algorithm smu_v11_0_i2c_algo = {
 	.master_xfer = smu_v11_0_i2c_xfer,
 	.functionality = smu_v11_0_i2c_func,
-- 
2.32.0

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


* [PATCH 19/40] drm/amdgpu: Fixes to the AMDGPU EEPROM driver
  2021-06-08 21:39 [PATCH 00/40] I2C fixes Luben Tuikov
                   ` (17 preceding siblings ...)
  2021-06-08 21:39 ` [PATCH 18/40] drm/amdgpu: Fix Vega20 I2C to be agnostic (v2) Luben Tuikov
@ 2021-06-08 21:39 ` Luben Tuikov
  2021-06-10 20:53   ` Alex Deucher
  2021-06-08 21:39 ` [PATCH 20/40] drm/amdgpu: EEPROM respects I2C quirks Luben Tuikov
                   ` (20 subsequent siblings)
  39 siblings, 1 reply; 74+ messages in thread
From: Luben Tuikov @ 2021-06-08 21:39 UTC (permalink / raw)
  To: amd-gfx
  Cc: Andrey Grodzovsky, Lijo Lazar, Luben Tuikov, Stanley Yang,
	Alexander Deucher, Jean Delvare, Hawking Zhang

* When reading from the EEPROM device, there is no
  device limitation on the number of bytes
  read--they're simply sequenced out. Thus, read
  all of the requested data in one go.

* When writing to the EEPROM device, there is a
  256-byte page limit to write to before having to
  generate a STOP on the bus, and the address
  written to mustn't cross the page boundary (the
  page pointer actually rolls over). Maximize the
  data written per bus acquisition.

* Return the number of bytes read/written, or -errno.

* Add kernel doc.
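
The page-boundary rule for writes can be sketched in isolation as
follows. This is only an illustration of the arithmetic, not the
driver code: the helper names eeprom_write_chunk() and
eeprom_write_xfers() are hypothetical, and the 256-byte page size is
the one stated above.

```c
#include <stdint.h>

#define EEPROM_PAGE_BITS 8
#define EEPROM_PAGE_SIZE (1U << EEPROM_PAGE_BITS)
#define EEPROM_PAGE_MASK (EEPROM_PAGE_SIZE - 1)

/* Hypothetical helper: the number of bytes that can be written
 * starting at eeprom_addr without crossing the 256-byte page
 * boundary, capped at the number of bytes still to be written.
 */
static uint32_t eeprom_write_chunk(uint32_t eeprom_addr, uint32_t remaining)
{
	uint32_t room = EEPROM_PAGE_SIZE - (eeprom_addr & EEPROM_PAGE_MASK);

	return remaining < room ? remaining : room;
}

/* Hypothetical helper: the number of START-STOP write transactions
 * needed to write "size" bytes starting at eeprom_addr.
 */
static unsigned int eeprom_write_xfers(uint32_t eeprom_addr, uint32_t size)
{
	unsigned int n = 0;

	while (size > 0) {
		uint32_t len = eeprom_write_chunk(eeprom_addr, size);

		eeprom_addr += len;
		size -= len;
		n++;
	}

	return n;
}
```

For instance, a 300-byte write starting at 0x00F0 would be split into
chunks of 16, 256 and 28 bytes, each ending with a STOP.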

Cc: Jean Delvare <jdelvare@suse.de>
Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Cc: Lijo Lazar <Lijo.Lazar@amd.com>
Cc: Stanley Yang <Stanley.Yang@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.c | 96 +++++++++++++++-------
 1 file changed, 68 insertions(+), 28 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.c
index d02ea083a6c69b..7fdb5bd2fc8bc8 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.c
@@ -24,59 +24,99 @@
 #include "amdgpu_eeprom.h"
 #include "amdgpu.h"
 
-#define EEPROM_OFFSET_LENGTH 2
+/* AT24CM02 has a 256-byte write page size.
+ */
+#define EEPROM_PAGE_BITS   8
+#define EEPROM_PAGE_SIZE   (1U << EEPROM_PAGE_BITS)
+#define EEPROM_PAGE_MASK   (EEPROM_PAGE_SIZE - 1)
+
+#define EEPROM_OFFSET_SIZE 2
 
+/**
+ * amdgpu_eeprom_xfer -- Read/write from/to an I2C EEPROM device
+ * @i2c_adap: pointer to the I2C adapter to use
+ * @slave_addr: I2C address of the slave device
+ * @eeprom_addr: EEPROM address from which to read/write
+ * @eeprom_buf: pointer to data buffer to read into/write from
+ * @buf_size: the size of @eeprom_buf
+ * @read: True if reading from the EEPROM, false if writing
+ *
+ * Returns the number of bytes read/written; -errno on error.
+ */
 int amdgpu_eeprom_xfer(struct i2c_adapter *i2c_adap,
 		       u16 slave_addr, u16 eeprom_addr,
-		       u8 *eeprom_buf, u16 bytes, bool read)
+		       u8 *eeprom_buf, u16 buf_size, bool read)
 {
-	u8 eeprom_offset_buf[2];
-	u16 bytes_transferred;
+	u8 eeprom_offset_buf[EEPROM_OFFSET_SIZE];
 	struct i2c_msg msgs[] = {
 		{
 			.addr = slave_addr,
 			.flags = 0,
-			.len = EEPROM_OFFSET_LENGTH,
+			.len = EEPROM_OFFSET_SIZE,
 			.buf = eeprom_offset_buf,
 		},
 		{
 			.addr = slave_addr,
 			.flags = read ? I2C_M_RD : 0,
-			.len = bytes,
-			.buf = eeprom_buf,
 		},
 	};
+	const u8 *p = eeprom_buf;
 	int r;
+	u16 len;
+
+	r = 0;
+	for (len = 0; buf_size > 0;
+	     buf_size -= len, eeprom_addr += len, eeprom_buf += len) {
+		/* Set the EEPROM address we want to write to/read from.
+		 */
+		msgs[0].buf[0] = (eeprom_addr >> 8) & 0xff;
+		msgs[0].buf[1] = eeprom_addr & 0xff;
 
-	msgs[0].buf[0] = ((eeprom_addr >> 8) & 0xff);
-	msgs[0].buf[1] = (eeprom_addr & 0xff);
+		if (!read) {
+			/* Write the maximum amount of data, without
+			 * crossing the device's page boundary, as per
+			 * its spec. Partial page writes are allowed,
+			 * starting at any location within the page,
+			 * so long as the page boundary isn't crossed
+			 * over (actually the page pointer rolls
+			 * over).
+			 *
+			 * As per the AT24CM02 EEPROM spec, after
+			 * writing into a page, the I2C driver MUST
+			 * terminate the transfer, i.e. in
+			 * "i2c_transfer()" below, with a STOP
+			 * condition, so that the self-timed write
+			 * cycle begins. This is implied for the
+			 * "i2c_transfer()" abstraction.
+			 */
+			len = min(EEPROM_PAGE_SIZE - (eeprom_addr &
+						      EEPROM_PAGE_MASK),
+				  (u32)buf_size);
+		} else {
+			/* Reading from the EEPROM has no limitation
+			 * on the number of bytes read from the EEPROM
+			 * device--they are simply sequenced out.
+			 */
+			len = buf_size;
+		}
+		msgs[1].len = len;
+		msgs[1].buf = eeprom_buf;
 
-	while (msgs[1].len) {
 		r = i2c_transfer(i2c_adap, msgs, ARRAY_SIZE(msgs));
-		if (r <= 0)
-			return r;
+		if (r < ARRAY_SIZE(msgs))
+			break;
 
-		/* Only for write data */
-		if (!msgs[1].flags)
-			/*
-			 * According to EEPROM spec there is a MAX of 10 ms required for
-			 * EEPROM to flush internal RX buffer after STOP was issued at the
-			 * end of write transaction. During this time the EEPROM will not be
-			 * responsive to any more commands - so wait a bit more.
+		if (!read) {
+			/* According to the AT24CM02 EEPROM spec the
+			 * length of the self-writing cycle, tWR, is
+			 * 10 ms.
 			 *
 			 * TODO Improve to wait for first ACK for slave address after
 			 * internal write cycle done.
 			 */
 			msleep(10);
-
-
-		bytes_transferred = r - EEPROM_OFFSET_LENGTH;
-		eeprom_addr += bytes_transferred;
-		msgs[0].buf[0] = ((eeprom_addr >> 8) & 0xff);
-		msgs[0].buf[1] = (eeprom_addr & 0xff);
-		msgs[1].buf += bytes_transferred;
-		msgs[1].len -= bytes_transferred;
+		}
 	}
 
-	return 0;
+	return r < 0 ? r : eeprom_buf - p;
 }
-- 
2.32.0


* [PATCH 20/40] drm/amdgpu: EEPROM respects I2C quirks
  2021-06-08 21:39 [PATCH 00/40] I2C fixes Luben Tuikov
                   ` (18 preceding siblings ...)
  2021-06-08 21:39 ` [PATCH 19/40] drm/amdgpu: Fixes to the AMDGPU EEPROM driver Luben Tuikov
@ 2021-06-08 21:39 ` Luben Tuikov
  2021-06-11 17:01   ` Alex Deucher
  2021-06-08 21:39 ` [PATCH 21/40] drm/amdgpu: I2C EEPROM full memory addressing Luben Tuikov
                   ` (19 subsequent siblings)
  39 siblings, 1 reply; 74+ messages in thread
From: Luben Tuikov @ 2021-06-08 21:39 UTC (permalink / raw)
  To: amd-gfx
  Cc: Andrey Grodzovsky, Lijo Lazar, Luben Tuikov, Stanley Yang,
	Alexander Deucher, Jean Delvare, Hawking Zhang

Consult the i2c_adapter.quirks table for
the maximum read/write data length per bus
transaction. Do not exceed this transaction
limit.
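
As a rough sketch of the splitting described above, assuming (as the
patch does) that the quirk limit also counts the two EEPROM offset
bytes. The helper names eeprom_payload_limit() and
eeprom_split_count() are hypothetical and only illustrate the
arithmetic.

```c
#include <stdint.h>

#define EEPROM_OFFSET_SIZE 2

/* Hypothetical helper: usable data bytes per transaction for a given
 * quirk limit. Returns 0 when there is no limit, and -1 when the
 * limit leaves no room for data after the offset bytes.
 */
static int eeprom_payload_limit(uint16_t quirk_limit)
{
	if (quirk_limit == 0)
		return 0;
	if (quirk_limit <= EEPROM_OFFSET_SIZE)
		return -1;
	return quirk_limit - EEPROM_OFFSET_SIZE;
}

/* Hypothetical helper: the number of partial transfers needed to
 * move buf_size bytes under the given quirk limit, or -1 if the
 * limit is unusable.
 */
static int eeprom_split_count(uint16_t quirk_limit, uint16_t buf_size)
{
	int payload = eeprom_payload_limit(quirk_limit);
	int n = 0;

	if (payload < 0)
		return -1;
	if (payload == 0)
		return buf_size ? 1 : 0;

	while (buf_size > 0) {
		uint16_t ps = buf_size < payload ? buf_size : (uint16_t)payload;

		buf_size -= ps;
		n++;
	}

	return n;
}
```

With a limit of 32 bytes, for example, each transaction would carry
at most 30 data bytes, so a 100-byte request would take four
transfers.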

Cc: Jean Delvare <jdelvare@suse.de>
Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Cc: Lijo Lazar <Lijo.Lazar@amd.com>
Cc: Stanley Yang <Stanley.Yang@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.c | 80 +++++++++++++++++-----
 1 file changed, 64 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.c
index 7fdb5bd2fc8bc8..94aeda1c7f8ca0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.c
@@ -32,20 +32,9 @@
 
 #define EEPROM_OFFSET_SIZE 2
 
-/**
- * amdgpu_eeprom_xfer -- Read/write from/to an I2C EEPROM device
- * @i2c_adap: pointer to the I2C adapter to use
- * @slave_addr: I2C address of the slave device
- * @eeprom_addr: EEPROM address from which to read/write
- * @eeprom_buf: pointer to data buffer to read into/write from
- * @buf_size: the size of @eeprom_buf
- * @read: True if reading from the EEPROM, false if writing
- *
- * Returns the number of bytes read/written; -errno on error.
- */
-int amdgpu_eeprom_xfer(struct i2c_adapter *i2c_adap,
-		       u16 slave_addr, u16 eeprom_addr,
-		       u8 *eeprom_buf, u16 buf_size, bool read)
+static int __amdgpu_eeprom_xfer(struct i2c_adapter *i2c_adap,
+				u16 slave_addr, u16 eeprom_addr,
+				u8 *eeprom_buf, u16 buf_size, bool read)
 {
 	u8 eeprom_offset_buf[EEPROM_OFFSET_SIZE];
 	struct i2c_msg msgs[] = {
@@ -65,8 +54,8 @@ int amdgpu_eeprom_xfer(struct i2c_adapter *i2c_adap,
 	u16 len;
 
 	r = 0;
-	for (len = 0; buf_size > 0;
-	     buf_size -= len, eeprom_addr += len, eeprom_buf += len) {
+	for ( ; buf_size > 0;
+	      buf_size -= len, eeprom_addr += len, eeprom_buf += len) {
 		/* Set the EEPROM address we want to write to/read from.
 		 */
 		msgs[0].buf[0] = (eeprom_addr >> 8) & 0xff;
@@ -120,3 +109,62 @@ int amdgpu_eeprom_xfer(struct i2c_adapter *i2c_adap,
 
 	return r < 0 ? r : eeprom_buf - p;
 }
+
+/**
+ * amdgpu_eeprom_xfer -- Read/write from/to an I2C EEPROM device
+ * @i2c_adap: pointer to the I2C adapter to use
+ * @slave_addr: I2C address of the slave device
+ * @eeprom_addr: EEPROM address from which to read/write
+ * @eeprom_buf: pointer to data buffer to read into/write from
+ * @buf_size: the size of @eeprom_buf
+ * @read: True if reading from the EEPROM, false if writing
+ *
+ * Returns the number of bytes read/written; -errno on error.
+ */
+int amdgpu_eeprom_xfer(struct i2c_adapter *i2c_adap,
+		       u16 slave_addr, u16 eeprom_addr,
+		       u8 *eeprom_buf, u16 buf_size, bool read)
+{
+	const struct i2c_adapter_quirks *quirks = i2c_adap->quirks;
+	u16 limit;
+
+	if (!quirks)
+		limit = 0;
+	else if (read)
+		limit = quirks->max_read_len;
+	else
+		limit = quirks->max_write_len;
+
+	if (limit == 0) {
+		return __amdgpu_eeprom_xfer(i2c_adap, slave_addr, eeprom_addr,
+					    eeprom_buf, buf_size, read);
+	} else if (limit <= EEPROM_OFFSET_SIZE) {
+		dev_err_ratelimited(&i2c_adap->dev,
+				    "maddr:0x%04X size:0x%02X:quirk max_%s_len must be > %d",
+				    eeprom_addr, buf_size,
+				    read ? "read" : "write", EEPROM_OFFSET_SIZE);
+		return -EINVAL;
+	} else {
+		u16 ps; /* Partial size */
+		int res = 0, r;
+
+		/* The "limit" includes all data bytes sent/received,
+		 * which would include the EEPROM_OFFSET_SIZE bytes.
+		 * Account for them here.
+		 */
+		limit -= EEPROM_OFFSET_SIZE;
+		for ( ; buf_size > 0;
+		      buf_size -= ps, eeprom_addr += ps, eeprom_buf += ps) {
+			ps = min(limit, buf_size);
+
+			r = __amdgpu_eeprom_xfer(i2c_adap,
+						 slave_addr, eeprom_addr,
+						 eeprom_buf, ps, read);
+			if (r < 0)
+				return r;
+			res += r;
+		}
+
+		return res;
+	}
+}
-- 
2.32.0


* [PATCH 21/40] drm/amdgpu: I2C EEPROM full memory addressing
  2021-06-08 21:39 [PATCH 00/40] I2C fixes Luben Tuikov
                   ` (19 preceding siblings ...)
  2021-06-08 21:39 ` [PATCH 20/40] drm/amdgpu: EEPROM respects I2C quirks Luben Tuikov
@ 2021-06-08 21:39 ` Luben Tuikov
  2021-06-10 20:57   ` Alex Deucher
  2021-06-08 21:39 ` [PATCH 22/40] drm/amdgpu: RAS and FRU now use 19-bit I2C address Luben Tuikov
                   ` (18 subsequent siblings)
  39 siblings, 1 reply; 74+ messages in thread
From: Luben Tuikov @ 2021-06-08 21:39 UTC (permalink / raw)
  To: amd-gfx
  Cc: Andrey Grodzovsky, Lijo Lazar, Luben Tuikov, Stanley Yang,
	Alexander Deucher, Jean Delvare, Hawking Zhang

* "eeprom_addr" is now 32-bit wide.
* Remove "slave_addr" from the I2C EEPROM driver
  interface. The I2C EEPROM Device Type Identifier
  is fixed at 1010b, and the rest of the bits
  of the Device Address Byte/Device Select Code
  are memory address bits, where the first three
  of those bits are the hardware selection bits.
  All this is now a 19-bit address and passed
  as "eeprom_addr". This abstracts the I2C bus
  for EEPROM devices for this I2C EEPROM driver.
  Clients now pass only the 19-bit EEPROM memory
  address that they want to read from or write
  to, as the 32-bit "eeprom_addr".
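
A small standalone sketch of how a 19-bit "eeprom_addr" breaks down
into the 7-bit I2C device address and the two offset bytes. The
MAKE_I2C_ADDR() macro is the one added by this patch; the struct
eeprom_wire_addr and the helper eeprom_split_addr() are hypothetical
names used only for illustration.

```c
#include <stdint.h>

#define MAKE_I2C_ADDR(_aa) ((0xA << 3) | (((_aa) >> 16) & 7))

/* Hypothetical container for the pieces that go on the wire. */
struct eeprom_wire_addr {
	uint8_t dev_addr;  /* 7-bit I2C device address: 1010XYZ */
	uint8_t offset[2]; /* A15:A8, then A7:A0 */
};

/* Hypothetical helper: split a 19-bit EEPROM memory address. */
static struct eeprom_wire_addr eeprom_split_addr(uint32_t eeprom_addr)
{
	struct eeprom_wire_addr w;

	w.dev_addr = MAKE_I2C_ADDR(eeprom_addr);
	w.offset[0] = (eeprom_addr >> 8) & 0xff;
	w.offset[1] = eeprom_addr & 0xff;

	return w;
}
```

For eeprom_addr = 0x6DA01, XYZ is 110b, so the device address would
be 0x56 and the offset bytes 0xDA and 0x01.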

Cc: Jean Delvare <jdelvare@suse.de>
Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Cc: Lijo Lazar <Lijo.Lazar@amd.com>
Cc: Stanley Yang <Stanley.Yang@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.c | 88 +++++++++++++++++-----
 drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.h |  4 +-
 2 files changed, 72 insertions(+), 20 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.c
index 94aeda1c7f8ca0..a5a87affedabf1 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.c
@@ -24,7 +24,7 @@
 #include "amdgpu_eeprom.h"
 #include "amdgpu.h"
 
-/* AT24CM02 has a 256-byte write page size.
+/* AT24CM02 and M24M02-R have a 256-byte write page size.
  */
 #define EEPROM_PAGE_BITS   8
 #define EEPROM_PAGE_SIZE   (1U << EEPROM_PAGE_BITS)
@@ -32,20 +32,72 @@
 
 #define EEPROM_OFFSET_SIZE 2
 
-static int __amdgpu_eeprom_xfer(struct i2c_adapter *i2c_adap,
-				u16 slave_addr, u16 eeprom_addr,
+/* EEPROM memory addresses are 19 bits long, which can
+ * be partitioned into 3, 8, 8 bits, for a total of 19.
+ * The upper 3 bits are sent as part of the 7-bit I2C device
+ * address. The upper 4 bits of that address are the "Device Type
+ * Identifier", which for EEPROM devices is hard-coded as 1010b,
+ * indicating that it is an EEPROM device. On the wire, the
+ * identifier is followed by the upper 3 bits of the 19-bit address,
+ * followed by the direction, followed by two bytes holding the
+ * remaining 16 bits of the EEPROM memory address. The format on
+ * the wire for EEPROM devices is: 1010XYZD, A15:A8, A7:A0,
+ * where D is the direction and is sequenced out by the hardware.
+ * Bits XYZ are memory address bits 18, 17 and 16.
+ * These bits are compared to how pins 1-3 of the part are connected,
+ * depending on the size of the part, more on that later.
+ *
+ * Note that of this wire format, a client is in control
+ * of, and needs to specify, only the XYZ, A15:A8 and A7:A0 bits,
+ * which is exactly the EEPROM memory address, or offset,
+ * in order to address up to 8 EEPROM devices on the I2C bus.
+ *
+ * For instance, a 2-Mbit I2C EEPROM part, addresses all its bytes,
+ * using an 18-bit address, bit 17 to 0 and thus would use all but one bit of
+ * the 19 bits previously mentioned. The designer would then not connect
+ * pins 1 and 2, and pin 3 usually named "A_2" or "E2", would be connected to
+ * either Vcc or GND. This would allow for up to two 2-Mbit parts on
+ * the same bus, where one would be addressable with bit 18 as 1, and
+ * the other with bit 18 of the address as 0.
+ *
+ * For a 2-Mbit part, bit 18 is usually known as the "Chip Enable" or
+ * "Hardware Address Bit". This bit is compared to the load on pin 3
+ * of the device, described above, and if there is a match, then this
+ * device responds to the command. This way, you can connect two
+ * 2-Mbit EEPROM devices on the same bus, but see one contiguous
+ * memory from 0 to 7FFFFh, where address 0 to 3FFFFh is in the device
+ * whose pin 3 is connected to GND, and address 40000h to 7FFFFh is in
+ * the 2nd device, whose pin 3 is connected to Vcc.
+ *
+ * This addressing you encode in the 32-bit "eeprom_addr" below,
+ * namely the 19-bits "XYZ,A15:A0", as a single 19-bit address. For
+ * instance, eeprom_addr = 0x6DA01, is 110_1101_1010_0000_0001, where
+ * XYZ=110b, and A15:A0=DA01h. The XYZ bits become part of the device
+ * address, and the rest of the address bits are sent as the memory
+ * address bytes.
+ *
+ * That is, for an I2C EEPROM driver everything is controlled by
+ * the "eeprom_addr".
+ *
+ * P.S. If you need to write, lock and read the Identification Page,
+ * (M24M02-DR device only, which we do not use), change the "7" to
+ * "0xF" in the macro below, and let the client set bit 20 to 1 in
+ * "eeprom_addr", and set A10 to 0 to write into it, and A10 and A1 to
+ * 1 to lock it permanently.
+ */
+#define MAKE_I2C_ADDR(_aa) ((0xA << 3) | (((_aa) >> 16) & 7))
+
+static int __amdgpu_eeprom_xfer(struct i2c_adapter *i2c_adap, u32 eeprom_addr,
 				u8 *eeprom_buf, u16 buf_size, bool read)
 {
 	u8 eeprom_offset_buf[EEPROM_OFFSET_SIZE];
 	struct i2c_msg msgs[] = {
 		{
-			.addr = slave_addr,
 			.flags = 0,
 			.len = EEPROM_OFFSET_SIZE,
 			.buf = eeprom_offset_buf,
 		},
 		{
-			.addr = slave_addr,
 			.flags = read ? I2C_M_RD : 0,
 		},
 	};
@@ -58,6 +110,8 @@ static int __amdgpu_eeprom_xfer(struct i2c_adapter *i2c_adap,
 	      buf_size -= len, eeprom_addr += len, eeprom_buf += len) {
 		/* Set the EEPROM address we want to write to/read from.
 		 */
+		msgs[0].addr = MAKE_I2C_ADDR(eeprom_addr);
+		msgs[1].addr = msgs[0].addr;
 		msgs[0].buf[0] = (eeprom_addr >> 8) & 0xff;
 		msgs[0].buf[1] = eeprom_addr & 0xff;
 
@@ -71,7 +125,7 @@ static int __amdgpu_eeprom_xfer(struct i2c_adapter *i2c_adap,
 			 * over).
 			 *
 			 * As per the AT24CM02 EEPROM spec, after
-			 * writing into a page, the I2C driver MUST
+			 * writing into a page, the I2C driver should
 			 * terminate the transfer, i.e. in
 			 * "i2c_transfer()" below, with a STOP
 			 * condition, so that the self-timed write
@@ -91,17 +145,20 @@ static int __amdgpu_eeprom_xfer(struct i2c_adapter *i2c_adap,
 		msgs[1].len = len;
 		msgs[1].buf = eeprom_buf;
 
+		/* This constitutes a START-STOP transaction.
+		 */
 		r = i2c_transfer(i2c_adap, msgs, ARRAY_SIZE(msgs));
 		if (r < ARRAY_SIZE(msgs))
 			break;
 
 		if (!read) {
-			/* According to the AT24CM02 EEPROM spec the
-			 * length of the self-writing cycle, tWR, is
-			 * 10 ms.
+			/* According to EEPROM specs the length of the
+			 * self-writing cycle, tWR (tW), is 10 ms.
 			 *
-			 * TODO Improve to wait for first ACK for slave address after
-			 * internal write cycle done.
+			 * TODO: Use polling on ACK, aka Acknowledge
+			 * Polling, to minimize waiting for the
+			 * internal write cycle to complete, as it is
+			 * usually smaller than tWR (tW).
 			 */
 			msleep(10);
 		}
@@ -113,7 +170,6 @@ static int __amdgpu_eeprom_xfer(struct i2c_adapter *i2c_adap,
 /**
  * amdgpu_eeprom_xfer -- Read/write from/to an I2C EEPROM device
  * @i2c_adap: pointer to the I2C adapter to use
- * @slave_addr: I2C address of the slave device
  * @eeprom_addr: EEPROM address from which to read/write
  * @eeprom_buf: pointer to data buffer to read into/write from
  * @buf_size: the size of @eeprom_buf
@@ -121,8 +177,7 @@ static int __amdgpu_eeprom_xfer(struct i2c_adapter *i2c_adap,
  *
  * Returns the number of bytes read/written; -errno on error.
  */
-int amdgpu_eeprom_xfer(struct i2c_adapter *i2c_adap,
-		       u16 slave_addr, u16 eeprom_addr,
+int amdgpu_eeprom_xfer(struct i2c_adapter *i2c_adap, u32 eeprom_addr,
 		       u8 *eeprom_buf, u16 buf_size, bool read)
 {
 	const struct i2c_adapter_quirks *quirks = i2c_adap->quirks;
@@ -136,7 +191,7 @@ int amdgpu_eeprom_xfer(struct i2c_adapter *i2c_adap,
 		limit = quirks->max_write_len;
 
 	if (limit == 0) {
-		return __amdgpu_eeprom_xfer(i2c_adap, slave_addr, eeprom_addr,
+		return __amdgpu_eeprom_xfer(i2c_adap, eeprom_addr,
 					    eeprom_buf, buf_size, read);
 	} else if (limit <= EEPROM_OFFSET_SIZE) {
 		dev_err_ratelimited(&i2c_adap->dev,
@@ -157,8 +212,7 @@ int amdgpu_eeprom_xfer(struct i2c_adapter *i2c_adap,
 		      buf_size -= ps, eeprom_addr += ps, eeprom_buf += ps) {
 			ps = min(limit, buf_size);
 
-			r = __amdgpu_eeprom_xfer(i2c_adap,
-						 slave_addr, eeprom_addr,
+			r = __amdgpu_eeprom_xfer(i2c_adap, eeprom_addr,
 						 eeprom_buf, ps, read);
 			if (r < 0)
 				return r;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.h
index 9301e5678910ad..417472be2712e6 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.h
@@ -26,9 +26,7 @@
 
 #include <linux/i2c.h>
 
-int amdgpu_eeprom_xfer(struct i2c_adapter *i2c_adap,
-		       u16 slave_addr, u16 eeprom_addr,
+int amdgpu_eeprom_xfer(struct i2c_adapter *i2c_adap, u32 eeprom_addr,
 		       u8 *eeprom_buf, u16 bytes, bool read);
 
-
 #endif
-- 
2.32.0


* [PATCH 22/40] drm/amdgpu: RAS and FRU now use 19-bit I2C address
  2021-06-08 21:39 [PATCH 00/40] I2C fixes Luben Tuikov
                   ` (20 preceding siblings ...)
  2021-06-08 21:39 ` [PATCH 21/40] drm/amdgpu: I2C EEPROM full memory addressing Luben Tuikov
@ 2021-06-08 21:39 ` Luben Tuikov
  2021-06-10 20:59   ` Alex Deucher
  2021-06-08 21:39 ` [PATCH 23/40] drm/amdgpu: Fix wrap-around bugs in RAS Luben Tuikov
                   ` (17 subsequent siblings)
  39 siblings, 1 reply; 74+ messages in thread
From: Luben Tuikov @ 2021-06-08 21:39 UTC (permalink / raw)
  To: amd-gfx
  Cc: Andrey Grodzovsky, Lijo Lazar, Luben Tuikov, Stanley Yang,
	Alexander Deucher, John Clements, Jean Delvare, Hawking Zhang

Convert RAS and FRU code to use the 19-bit I2C
memory address and remove all "slave_addr", as
this is now absorbed into the 19-bit address.
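
The mapping from the old (slave address, 16-bit offset) pair to the
new 19-bit memory address can be sketched as below. The helper name
eeprom_maddr_from_legacy() is hypothetical and is not part of this
patch; it only illustrates how the low three bits of the old 7-bit
slave address become bits 18:16 of the memory address.

```c
#include <stdint.h>

/* Hypothetical helper: fold the low 3 bits of the old 7-bit slave
 * address (1010XYZ) into bits 18:16 of a 19-bit EEPROM memory
 * address, with the old 16-bit offset in bits 15:0.
 */
static uint32_t eeprom_maddr_from_legacy(uint16_t slave_addr, uint16_t offset)
{
	return ((uint32_t)(slave_addr & 0x7) << 16) | offset;
}
```

Under this mapping, the old slave address 0x50 corresponds to a base
of 0x0, 0x54 to 0x40000, and 0x56 to 0x60000, which matches the
constants introduced below.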

Cc: Jean Delvare <jdelvare@suse.de>
Cc: John Clements <john.clements@amd.com>
Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Cc: Lijo Lazar <Lijo.Lazar@amd.com>
Cc: Stanley Yang <Stanley.Yang@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
---
 .../gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c    | 19 ++---
 .../gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c    | 82 +++++++------------
 .../gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h    |  2 +-
 3 files changed, 39 insertions(+), 64 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c
index 2b854bc6ae34bb..69b9559f840ac3 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c
@@ -29,8 +29,8 @@
 #include "amdgpu_fru_eeprom.h"
 #include "amdgpu_eeprom.h"
 
-#define I2C_PRODUCT_INFO_ADDR		0x56
-#define I2C_PRODUCT_INFO_OFFSET		0xC0
+#define FRU_EEPROM_MADDR        0x60000
+#define I2C_PRODUCT_INFO_OFFSET 0xC0
 
 static bool is_fru_eeprom_supported(struct amdgpu_device *adev)
 {
@@ -62,12 +62,11 @@ static bool is_fru_eeprom_supported(struct amdgpu_device *adev)
 }
 
 static int amdgpu_fru_read_eeprom(struct amdgpu_device *adev, uint32_t addrptr,
-			   unsigned char *buff)
+				  unsigned char *buff)
 {
 	int ret, size;
 
-	ret = amdgpu_eeprom_xfer(&adev->pm.smu_i2c, I2C_PRODUCT_INFO_ADDR,
-				 addrptr, buff, 1, true);
+	ret = amdgpu_eeprom_xfer(&adev->pm.smu_i2c, addrptr, buff, 1, true);
 	if (ret < 1) {
 		DRM_WARN("FRU: Failed to get size field");
 		return ret;
@@ -78,8 +77,8 @@ static int amdgpu_fru_read_eeprom(struct amdgpu_device *adev, uint32_t addrptr,
 	 */
 	size = buff[0] - I2C_PRODUCT_INFO_OFFSET;
 
-	ret = amdgpu_eeprom_xfer(&adev->pm.smu_i2c, I2C_PRODUCT_INFO_ADDR,
-				 addrptr + 1, buff, size, true);
+	ret = amdgpu_eeprom_xfer(&adev->pm.smu_i2c, addrptr + 1, buff, size,
+				 true);
 	if (ret < 1) {
 		DRM_WARN("FRU: Failed to get data field");
 		return ret;
@@ -91,8 +90,8 @@ static int amdgpu_fru_read_eeprom(struct amdgpu_device *adev, uint32_t addrptr,
 int amdgpu_fru_get_product_info(struct amdgpu_device *adev)
 {
 	unsigned char buff[34];
-	int addrptr, size;
-	int len;
+	u32 addrptr;
+	int size, len;
 
 	if (!is_fru_eeprom_supported(adev))
 		return 0;
@@ -115,7 +114,7 @@ int amdgpu_fru_get_product_info(struct amdgpu_device *adev)
 	 * Bytes 8-a are all 1-byte and refer to the size of the entire struct,
 	 * and the language field, so just start from 0xb, manufacturer size
 	 */
-	addrptr = 0xb;
+	addrptr = FRU_EEPROM_MADDR + 0xb;
 	size = amdgpu_fru_read_eeprom(adev, addrptr, buff);
 	if (size < 1) {
 		DRM_ERROR("Failed to read FRU Manufacturer, ret:%d", size);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
index 2b981e96ce5b9e..f316fb11b16d9e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
@@ -28,11 +28,11 @@
 #include "atom.h"
 #include "amdgpu_eeprom.h"
 
-#define EEPROM_I2C_TARGET_ADDR_VEGA20		0x50
-#define EEPROM_I2C_TARGET_ADDR_ARCTURUS		0x54
-#define EEPROM_I2C_TARGET_ADDR_ARCTURUS_D342	0x50
-#define EEPROM_I2C_TARGET_ADDR_SIENNA_CICHLID   0x50
-#define EEPROM_I2C_TARGET_ADDR_ALDEBARAN        0x50	       
+#define EEPROM_I2C_MADDR_VEGA20         0x0
+#define EEPROM_I2C_MADDR_ARCTURUS       0x40000
+#define EEPROM_I2C_MADDR_ARCTURUS_D342  0x0
+#define EEPROM_I2C_MADDR_SIENNA_CICHLID 0x0
+#define EEPROM_I2C_MADDR_ALDEBARAN      0x0
 
 /*
  * The 2 macros bellow represent the actual size in bytes that
@@ -58,7 +58,6 @@
 #define EEPROM_HDR_START 0
 #define EEPROM_RECORD_START (EEPROM_HDR_START + EEPROM_TABLE_HEADER_SIZE)
 #define EEPROM_MAX_RECORD_NUM ((EEPROM_SIZE_BYTES - EEPROM_TABLE_HEADER_SIZE) / EEPROM_TABLE_RECORD_SIZE)
-#define EEPROM_ADDR_MSB_MASK GENMASK(17, 8)
 
 #define to_amdgpu_device(x) (container_of(x, struct amdgpu_ras, eeprom_control))->adev
 
@@ -74,43 +73,43 @@ static bool __is_ras_eeprom_supported(struct amdgpu_device *adev)
 }
 
 static bool __get_eeprom_i2c_addr_arct(struct amdgpu_device *adev,
-				       uint16_t *i2c_addr)
+				       struct amdgpu_ras_eeprom_control *control)
 {
 	struct atom_context *atom_ctx = adev->mode_info.atom_context;
 
-	if (!i2c_addr || !atom_ctx)
+	if (!control || !atom_ctx)
 		return false;
 
 	if (strnstr(atom_ctx->vbios_version,
 	            "D342",
 		    sizeof(atom_ctx->vbios_version)))
-		*i2c_addr = EEPROM_I2C_TARGET_ADDR_ARCTURUS_D342;
+		control->i2c_address = EEPROM_I2C_MADDR_ARCTURUS_D342;
 	else
-		*i2c_addr = EEPROM_I2C_TARGET_ADDR_ARCTURUS;
+		control->i2c_address = EEPROM_I2C_MADDR_ARCTURUS;
 
 	return true;
 }
 
 static bool __get_eeprom_i2c_addr(struct amdgpu_device *adev,
-				  uint16_t *i2c_addr)
+				  struct amdgpu_ras_eeprom_control *control)
 {
-	if (!i2c_addr)
+	if (!control)
 		return false;
 
 	switch (adev->asic_type) {
 	case CHIP_VEGA20:
-		*i2c_addr = EEPROM_I2C_TARGET_ADDR_VEGA20;
+		control->i2c_address = EEPROM_I2C_MADDR_VEGA20;
 		break;
 
 	case CHIP_ARCTURUS:
-		return __get_eeprom_i2c_addr_arct(adev, i2c_addr);
+		return __get_eeprom_i2c_addr_arct(adev, control);
 
 	case CHIP_SIENNA_CICHLID:
-		*i2c_addr = EEPROM_I2C_TARGET_ADDR_SIENNA_CICHLID;
+		control->i2c_address = EEPROM_I2C_MADDR_SIENNA_CICHLID;
 		break;
 
 	case CHIP_ALDEBARAN:
-		*i2c_addr = EEPROM_I2C_TARGET_ADDR_ALDEBARAN;
+		control->i2c_address = EEPROM_I2C_MADDR_ALDEBARAN;
 		break;
 
 	default:
@@ -154,8 +153,9 @@ static int __update_table_header(struct amdgpu_ras_eeprom_control *control,
 
 	/* i2c may be unstable in gpu reset */
 	down_read(&adev->reset_sem);
-	ret = amdgpu_eeprom_xfer(&adev->pm.smu_i2c, control->i2c_address,
-				 EEPROM_HDR_START, buff, EEPROM_TABLE_HEADER_SIZE, false);
+	ret = amdgpu_eeprom_xfer(&adev->pm.smu_i2c,
+				 control->i2c_address + EEPROM_HDR_START,
+				 buff, EEPROM_TABLE_HEADER_SIZE, false);
 	up_read(&adev->reset_sem);
 
 	if (ret < 1)
@@ -277,7 +277,7 @@ int amdgpu_ras_eeprom_reset_table(struct amdgpu_ras_eeprom_control *control)
 }
 
 int amdgpu_ras_eeprom_init(struct amdgpu_ras_eeprom_control *control,
-			bool *exceed_err_limit)
+			   bool *exceed_err_limit)
 {
 	int ret = 0;
 	struct amdgpu_device *adev = to_amdgpu_device(control);
@@ -294,14 +294,15 @@ int amdgpu_ras_eeprom_init(struct amdgpu_ras_eeprom_control *control,
 	if (!adev->pm.smu_i2c.algo)
 		return -ENOENT;
 
-	if (!__get_eeprom_i2c_addr(adev, &control->i2c_address))
+	if (!__get_eeprom_i2c_addr(adev, control))
 		return -EINVAL;
 
 	mutex_init(&control->tbl_mutex);
 
 	/* Read/Create table header from EEPROM address 0 */
-	ret = amdgpu_eeprom_xfer(&adev->pm.smu_i2c, control->i2c_address,
-				 EEPROM_HDR_START, buff, EEPROM_TABLE_HEADER_SIZE, true);
+	ret = amdgpu_eeprom_xfer(&adev->pm.smu_i2c,
+				 control->i2c_address + EEPROM_HDR_START,
+				 buff, EEPROM_TABLE_HEADER_SIZE, true);
 	if (ret < 1) {
 		DRM_ERROR("Failed to read EEPROM table header, ret:%d", ret);
 		return ret;
@@ -395,8 +396,6 @@ static void __decode_table_record_from_buff(struct amdgpu_ras_eeprom_control *co
 
 /*
  * When reaching end of EEPROM memory jump back to 0 record address
- * When next record access will go beyond EEPROM page boundary modify bits A17/A8
- * in I2C selector to go to next page
  */
 static uint32_t __correct_eeprom_dest_address(uint32_t curr_address)
 {
@@ -409,20 +408,6 @@ static uint32_t __correct_eeprom_dest_address(uint32_t curr_address)
 		return EEPROM_RECORD_START;
 	}
 
-	/*
-	 * To check if we overflow page boundary  compare next address with
-	 * current and see if bits 17/8 of the EEPROM address will change
-	 * If they do start from the next 256b page
-	 *
-	 * https://www.st.com/resource/en/datasheet/m24m02-dr.pdf sec. 5.1.2
-	 */
-	if ((curr_address & EEPROM_ADDR_MSB_MASK) != (next_address & EEPROM_ADDR_MSB_MASK)) {
-		DRM_DEBUG_DRIVER("Reached end of EEPROM memory page, jumping to next: %lx",
-				(next_address & EEPROM_ADDR_MSB_MASK));
-
-		return  (next_address & EEPROM_ADDR_MSB_MASK);
-	}
-
 	return curr_address;
 }
 
@@ -452,22 +437,20 @@ bool amdgpu_ras_eeprom_check_err_threshold(struct amdgpu_device *adev)
 }
 
 int amdgpu_ras_eeprom_process_recods(struct amdgpu_ras_eeprom_control *control,
-					    struct eeprom_table_record *records,
-					    bool write,
-					    int num)
+				     struct eeprom_table_record *records,
+				     bool write, int num)
 {
 	int i, ret = 0;
 	unsigned char *buffs, *buff;
 	struct eeprom_table_record *record;
 	struct amdgpu_device *adev = to_amdgpu_device(control);
 	struct amdgpu_ras *ras = amdgpu_ras_get_context(adev);
-	u16 slave_addr;
 
 	if (!__is_ras_eeprom_supported(adev))
 		return 0;
 
 	buffs = kcalloc(num, EEPROM_ADDRESS_SIZE + EEPROM_TABLE_RECORD_SIZE,
-			 GFP_KERNEL);
+			GFP_KERNEL);
 	if (!buffs)
 		return -ENOMEM;
 
@@ -507,22 +490,15 @@ int amdgpu_ras_eeprom_process_recods(struct amdgpu_ras_eeprom_control *control,
 
 		control->next_addr = __correct_eeprom_dest_address(control->next_addr);
 
-		/*
-		 * Update bits 16,17 of EEPROM address in I2C address by setting them
-		 * to bits 1,2 of Device address byte
-		 */
-		slave_addr = control->i2c_address |
-			((control->next_addr & EEPROM_ADDR_MSB_MASK) >> 15);
-
 		/* EEPROM table content is stored in LE format */
 		if (write)
 			__encode_table_record_to_buff(control, record, buff);
 
 		/* i2c may be unstable in gpu reset */
 		down_read(&adev->reset_sem);
-		ret = amdgpu_eeprom_xfer(&adev->pm.smu_i2c, slave_addr,
-					 control->next_addr, buff,
-					 EEPROM_TABLE_RECORD_SIZE, write ? false : true);
+		ret = amdgpu_eeprom_xfer(&adev->pm.smu_i2c,
+					 control->i2c_address + control->next_addr,
+					 buff, EEPROM_TABLE_RECORD_SIZE, !write);
 		up_read(&adev->reset_sem);
 
 		if (ret < 1) {
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h
index 17872117097455..4c4c3d840a35c5 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h
@@ -44,11 +44,11 @@ struct amdgpu_ras_eeprom_table_header {
 
 struct amdgpu_ras_eeprom_control {
 	struct amdgpu_ras_eeprom_table_header tbl_hdr;
+	u32 i2c_address; /* Base I2C 19-bit memory address */
 	uint32_t next_addr;
 	unsigned int num_recs;
 	struct mutex tbl_mutex;
 	uint32_t tbl_byte_sum;
-	uint16_t i2c_address; // 8-bit represented address
 };
 
 /*
-- 
2.32.0

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


* [PATCH 23/40] drm/amdgpu: Fix wrap-around bugs in RAS
  2021-06-08 21:39 [PATCH 00/40] I2C fixes Luben Tuikov
                   ` (21 preceding siblings ...)
  2021-06-08 21:39 ` [PATCH 22/40] drm/amdgpu: RAS and FRU now use 19-bit I2C address Luben Tuikov
@ 2021-06-08 21:39 ` Luben Tuikov
  2021-06-10 21:00   ` Alex Deucher
  2021-06-08 21:39 ` [PATCH 24/40] drm/amdgpu: I2C class is HWMON Luben Tuikov
                   ` (16 subsequent siblings)
  39 siblings, 1 reply; 74+ messages in thread
From: Luben Tuikov @ 2021-06-08 21:39 UTC (permalink / raw)
  To: amd-gfx
  Cc: Andrey Grodzovsky, Lijo Lazar, Luben Tuikov, Stanley Yang,
	Alexander Deucher, Jean Delvare, Hawking Zhang

Fix the size of the EEPROM from 256000 bytes
to 262144 bytes (256 KiB).

Fix a couple of wrap-around bugs. If a valid
value/address is 0 <= addr < size, the inverse of
this inequality (barring negative values which
make no sense here) is addr >= size. Fix this in
the RAS code.
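
As an illustration only (not part of the patch), the bounds reasoning
above can be sketched in standalone C. The SKETCH_* names and helper
functions are hypothetical; the numeric values are the ones used in
the diff below, and a 2-Mbit part holds 2 * 1024 * 1024 / 8 = 262144
bytes.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values taken from the patch; the names are illustrative only. */
#define SKETCH_EEPROM_SIZE_BYTES (256 * 1024) /* 262144, not 256000 */
#define SKETCH_RECORD_SIZE       24
#define SKETCH_RECORD_START      20 /* table header occupies bytes 0..19 */

/* An address is valid iff 0 <= addr < size; unsigned covers the 0 <= part. */
static bool sketch_addr_is_valid(uint32_t addr)
{
	return addr < SKETCH_EEPROM_SIZE_BYTES;
}

/*
 * Mirrors the check as it reads after this patch: the comparison is
 * made on curr + record size, and ">=" is the inverse of "< size".
 */
static uint32_t sketch_correct_dest_address(uint32_t curr)
{
	uint32_t next = curr + SKETCH_RECORD_SIZE;

	if (next >= SKETCH_EEPROM_SIZE_BYTES)
		return SKETCH_RECORD_START;

	return curr;
}
```

Note that the comparison is applied to the address just past the
record, so in this sketch a record that would end exactly at the last
byte of the memory also wraps back to the record start.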

Cc: Jean Delvare <jdelvare@suse.de>
Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Cc: Lijo Lazar <Lijo.Lazar@amd.com>
Cc: Stanley Yang <Stanley.Yang@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
---
 .../gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c    | 20 +++++++++----------
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
index f316fb11b16d9e..3ef38b90fc3a83 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
@@ -52,12 +52,11 @@
 /* Bad GPU tag ‘BADG’ */
 #define EEPROM_TABLE_HDR_BAD 0x42414447
 
-/* Assume 2 Mbit size */
-#define EEPROM_SIZE_BYTES 256000
-#define EEPROM_PAGE__SIZE_BYTES 256
-#define EEPROM_HDR_START 0
-#define EEPROM_RECORD_START (EEPROM_HDR_START + EEPROM_TABLE_HEADER_SIZE)
-#define EEPROM_MAX_RECORD_NUM ((EEPROM_SIZE_BYTES - EEPROM_TABLE_HEADER_SIZE) / EEPROM_TABLE_RECORD_SIZE)
+/* Assume 2-Mbit size */
+#define EEPROM_SIZE_BYTES       (256 * 1024)
+#define EEPROM_HDR_START        0
+#define EEPROM_RECORD_START     (EEPROM_HDR_START + EEPROM_TABLE_HEADER_SIZE)
+#define EEPROM_MAX_RECORD_NUM   ((EEPROM_SIZE_BYTES - EEPROM_TABLE_HEADER_SIZE) / EEPROM_TABLE_RECORD_SIZE)
 
 #define to_amdgpu_device(x) (container_of(x, struct amdgpu_ras, eeprom_control))->adev
 
@@ -402,9 +401,8 @@ static uint32_t __correct_eeprom_dest_address(uint32_t curr_address)
 	uint32_t next_address = curr_address + EEPROM_TABLE_RECORD_SIZE;
 
 	/* When all EEPROM memory used jump back to 0 address */
-	if (next_address > EEPROM_SIZE_BYTES) {
-		DRM_INFO("Reached end of EEPROM memory, jumping to 0 "
-			 "and overriding old record");
+	if (next_address >= EEPROM_SIZE_BYTES) {
+		DRM_INFO("Reached end of EEPROM memory, wrap around to 0.");
 		return EEPROM_RECORD_START;
 	}
 
@@ -476,7 +474,9 @@ int amdgpu_ras_eeprom_process_recods(struct amdgpu_ras_eeprom_control *control,
 	}
 
 	/* In case of overflow just start from beginning to not lose newest records */
-	if (write && (control->next_addr + EEPROM_TABLE_RECORD_SIZE * num > EEPROM_SIZE_BYTES))
+	if (write &&
+	    (control->next_addr +
+	     EEPROM_TABLE_RECORD_SIZE * num >= EEPROM_SIZE_BYTES))
 		control->next_addr = EEPROM_RECORD_START;
 
 	/*
-- 
2.32.0


* [PATCH 24/40] drm/amdgpu: I2C class is HWMON
  2021-06-08 21:39 [PATCH 00/40] I2C fixes Luben Tuikov
                   ` (22 preceding siblings ...)
  2021-06-08 21:39 ` [PATCH 23/40] drm/amdgpu: Fix wrap-around bugs in RAS Luben Tuikov
@ 2021-06-08 21:39 ` Luben Tuikov
  2021-06-10 21:02   ` Alex Deucher
  2021-06-08 21:39 ` [PATCH 25/40] drm/amdgpu: RAS: EEPROM --> RAS Luben Tuikov
                   ` (15 subsequent siblings)
  39 siblings, 1 reply; 74+ messages in thread
From: Luben Tuikov @ 2021-06-08 21:39 UTC (permalink / raw)
  To: amd-gfx
  Cc: Andrey Grodzovsky, Lijo Lazar, Luben Tuikov, Stanley Yang,
	Alexander Deucher, Jean Delvare, Hawking Zhang

Set the auto-discoverable class of I2C bus to
HWMON. Remove SPD.

Cc: Jean Delvare <jdelvare@suse.de>
Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Cc: Lijo Lazar <Lijo.Lazar@amd.com>
Cc: Stanley Yang <Stanley.Yang@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/smu_v11_0_i2c.c              | 2 +-
 drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c       | 2 +-
 drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c         | 2 +-
 drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c | 2 +-
 4 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/smu_v11_0_i2c.c b/drivers/gpu/drm/amd/amdgpu/smu_v11_0_i2c.c
index b8d6d308fb06a0..e403ba556e5590 100644
--- a/drivers/gpu/drm/amd/amdgpu/smu_v11_0_i2c.c
+++ b/drivers/gpu/drm/amd/amdgpu/smu_v11_0_i2c.c
@@ -667,7 +667,7 @@ int smu_v11_0_i2c_control_init(struct i2c_adapter *control)
 
 	mutex_init(&adev->pm.smu_i2c_mutex);
 	control->owner = THIS_MODULE;
-	control->class = I2C_CLASS_SPD | I2C_CLASS_HWMON;
+	control->class = I2C_CLASS_HWMON;
 	control->dev.parent = &adev->pdev->dev;
 	control->algo = &smu_v11_0_i2c_algo;
 	snprintf(control->name, sizeof(control->name), "AMDGPU SMU");
diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c
index c2d6d7c8129593..974740ac72fded 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c
@@ -2016,7 +2016,7 @@ static int arcturus_i2c_control_init(struct smu_context *smu, struct i2c_adapter
 	int res;
 
 	control->owner = THIS_MODULE;
-	control->class = I2C_CLASS_SPD | I2C_CLASS_HWMON;
+	control->class = I2C_CLASS_HWMON;
 	control->dev.parent = &adev->pdev->dev;
 	control->algo = &arcturus_i2c_algo;
 	control->quirks = &arcturus_i2c_control_quirks;
diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
index 56000463f64e45..8ab06fa87edb04 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
@@ -2810,7 +2810,7 @@ static int navi10_i2c_control_init(struct smu_context *smu, struct i2c_adapter *
 	int res;
 
 	control->owner = THIS_MODULE;
-	control->class = I2C_CLASS_SPD | I2C_CLASS_HWMON;
+	control->class = I2C_CLASS_HWMON;
 	control->dev.parent = &adev->pdev->dev;
 	control->algo = &navi10_i2c_algo;
 	snprintf(control->name, sizeof(control->name), "AMDGPU SMU");
diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
index 86804f3b0a951b..91614ae186f7f5 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
@@ -3498,7 +3498,7 @@ static int sienna_cichlid_i2c_control_init(struct smu_context *smu, struct i2c_a
 	int res;
 
 	control->owner = THIS_MODULE;
-	control->class = I2C_CLASS_SPD | I2C_CLASS_HWMON;
+	control->class = I2C_CLASS_HWMON;
 	control->dev.parent = &adev->pdev->dev;
 	control->algo = &sienna_cichlid_i2c_algo;
 	snprintf(control->name, sizeof(control->name), "AMDGPU SMU");
-- 
2.32.0


* [PATCH 25/40] drm/amdgpu: RAS: EEPROM --> RAS
  2021-06-08 21:39 [PATCH 00/40] I2C fixes Luben Tuikov
                   ` (23 preceding siblings ...)
  2021-06-08 21:39 ` [PATCH 24/40] drm/amdgpu: I2C class is HWMON Luben Tuikov
@ 2021-06-08 21:39 ` Luben Tuikov
  2021-06-10 21:03   ` Alex Deucher
  2021-06-08 21:39 ` [PATCH 26/40] drm/amdgpu: Rename misspelled function Luben Tuikov
                   ` (14 subsequent siblings)
  39 siblings, 1 reply; 74+ messages in thread
From: Luben Tuikov @ 2021-06-08 21:39 UTC (permalink / raw)
  To: amd-gfx
  Cc: Andrey Grodzovsky, Lijo Lazar, Luben Tuikov, Stanley Yang,
	Alexander Deucher, Jean Delvare, Hawking Zhang

In amdgpu_ras_eeprom.c, the interface from RAS to
EEPROM, rename the macros from EEPROM_* to RAS_*
to indicate that the quantities and objects are
RAS-specific, not EEPROM-specific. We can shrink
the RAS table, or place it at a different offset
in the EEPROM, as needed in the future.

Remove the EEPROM_ADDRESS_SIZE macro definition
(equal to 2) from the file and from the buffer
size calculations, as that quantity is computed
and added on the stack in the lower layer,
amdgpu_eeprom_xfer().
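
For reference, the table geometry implied by the renamed macros can
be worked out in a small standalone sketch. This is an illustration,
not driver code: the SKETCH_ prefix and sketch_record_offset() are
hypothetical, while the sizes are the ones defined in the diff below.

```c
#include <stdint.h>

/* Same values as the RAS_* macros in the patch; names are illustrative. */
#define SKETCH_RAS_TBL_SIZE_BYTES    (256 * 1024)
#define SKETCH_RAS_HDR_START         0
#define SKETCH_RAS_TABLE_HEADER_SIZE 20
#define SKETCH_RAS_TABLE_RECORD_SIZE 24
#define SKETCH_RAS_RECORD_START \
	(SKETCH_RAS_HDR_START + SKETCH_RAS_TABLE_HEADER_SIZE)
#define SKETCH_RAS_MAX_RECORD_NUM \
	((SKETCH_RAS_TBL_SIZE_BYTES - SKETCH_RAS_TABLE_HEADER_SIZE) \
	 / SKETCH_RAS_TABLE_RECORD_SIZE)

/* Byte offset of record number idx, counted from the start of the table. */
static uint32_t sketch_record_offset(uint32_t idx)
{
	return SKETCH_RAS_RECORD_START + idx * SKETCH_RAS_TABLE_RECORD_SIZE;
}
```

With these values the table holds (262144 - 20) / 24 = 10921 whole
records, and the last one ends at byte 262124, leaving 20 bytes
unused at the end of the table.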

Cc: Jean Delvare <jdelvare@suse.de>
Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Cc: Lijo Lazar <Lijo.Lazar@amd.com>
Cc: Stanley Yang <Stanley.Yang@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
---
 .../gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c    | 103 +++++++++---------
 1 file changed, 50 insertions(+), 53 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
index 3ef38b90fc3a83..d3678706bb736d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
@@ -37,26 +37,25 @@
 /*
  * The 2 macros bellow represent the actual size in bytes that
  * those entities occupy in the EEPROM memory.
- * EEPROM_TABLE_RECORD_SIZE is different than sizeof(eeprom_table_record) which
+ * RAS_TABLE_RECORD_SIZE is different than sizeof(eeprom_table_record) which
  * uses uint64 to store 6b fields such as retired_page.
  */
-#define EEPROM_TABLE_HEADER_SIZE 20
-#define EEPROM_TABLE_RECORD_SIZE 24
-
-#define EEPROM_ADDRESS_SIZE 0x2
+#define RAS_TABLE_HEADER_SIZE   20
+#define RAS_TABLE_RECORD_SIZE   24
 
 /* Table hdr is 'AMDR' */
-#define EEPROM_TABLE_HDR_VAL 0x414d4452
-#define EEPROM_TABLE_VER 0x00010000
+#define RAS_TABLE_HDR_VAL       0x414d4452
+#define RAS_TABLE_VER           0x00010000
 
 /* Bad GPU tag ‘BADG’ */
-#define EEPROM_TABLE_HDR_BAD 0x42414447
+#define RAS_TABLE_HDR_BAD       0x42414447
 
-/* Assume 2-Mbit size */
-#define EEPROM_SIZE_BYTES       (256 * 1024)
-#define EEPROM_HDR_START        0
-#define EEPROM_RECORD_START     (EEPROM_HDR_START + EEPROM_TABLE_HEADER_SIZE)
-#define EEPROM_MAX_RECORD_NUM   ((EEPROM_SIZE_BYTES - EEPROM_TABLE_HEADER_SIZE) / EEPROM_TABLE_RECORD_SIZE)
+/* Assume 2-Mbit size EEPROM and take up the whole space. */
+#define RAS_TBL_SIZE_BYTES      (256 * 1024)
+#define RAS_HDR_START           0
+#define RAS_RECORD_START        (RAS_HDR_START + RAS_TABLE_HEADER_SIZE)
+#define RAS_MAX_RECORD_NUM      ((RAS_TBL_SIZE_BYTES - RAS_TABLE_HEADER_SIZE) \
+				 / RAS_TABLE_RECORD_SIZE)
 
 #define to_amdgpu_device(x) (container_of(x, struct amdgpu_ras, eeprom_control))->adev
 
@@ -153,8 +152,8 @@ static int __update_table_header(struct amdgpu_ras_eeprom_control *control,
 	/* i2c may be unstable in gpu reset */
 	down_read(&adev->reset_sem);
 	ret = amdgpu_eeprom_xfer(&adev->pm.smu_i2c,
-				 control->i2c_address + EEPROM_HDR_START,
-				 buff, EEPROM_TABLE_HEADER_SIZE, false);
+				 control->i2c_address + RAS_HDR_START,
+				 buff, RAS_TABLE_HEADER_SIZE, false);
 	up_read(&adev->reset_sem);
 
 	if (ret < 1)
@@ -236,11 +235,11 @@ static int amdgpu_ras_eeprom_correct_header_tag(
 				struct amdgpu_ras_eeprom_control *control,
 				uint32_t header)
 {
-	unsigned char buff[EEPROM_ADDRESS_SIZE + EEPROM_TABLE_HEADER_SIZE];
+	unsigned char buff[RAS_TABLE_HEADER_SIZE];
 	struct amdgpu_ras_eeprom_table_header *hdr = &control->tbl_hdr;
 	int ret = 0;
 
-	memset(buff, 0, EEPROM_ADDRESS_SIZE + EEPROM_TABLE_HEADER_SIZE);
+	memset(buff, 0, RAS_TABLE_HEADER_SIZE);
 
 	mutex_lock(&control->tbl_mutex);
 	hdr->header = header;
@@ -252,20 +251,20 @@ static int amdgpu_ras_eeprom_correct_header_tag(
 
 int amdgpu_ras_eeprom_reset_table(struct amdgpu_ras_eeprom_control *control)
 {
-	unsigned char buff[EEPROM_ADDRESS_SIZE + EEPROM_TABLE_HEADER_SIZE] = { 0 };
+	unsigned char buff[RAS_TABLE_HEADER_SIZE] = { 0 };
 	struct amdgpu_ras_eeprom_table_header *hdr = &control->tbl_hdr;
 	int ret = 0;
 
 	mutex_lock(&control->tbl_mutex);
 
-	hdr->header = EEPROM_TABLE_HDR_VAL;
-	hdr->version = EEPROM_TABLE_VER;
-	hdr->first_rec_offset = EEPROM_RECORD_START;
-	hdr->tbl_size = EEPROM_TABLE_HEADER_SIZE;
+	hdr->header = RAS_TABLE_HDR_VAL;
+	hdr->version = RAS_TABLE_VER;
+	hdr->first_rec_offset = RAS_RECORD_START;
+	hdr->tbl_size = RAS_TABLE_HEADER_SIZE;
 
 	control->tbl_byte_sum = 0;
 	__update_tbl_checksum(control, NULL, 0, 0);
-	control->next_addr = EEPROM_RECORD_START;
+	control->next_addr = RAS_RECORD_START;
 
 	ret = __update_table_header(control, buff);
 
@@ -280,7 +279,7 @@ int amdgpu_ras_eeprom_init(struct amdgpu_ras_eeprom_control *control,
 {
 	int ret = 0;
 	struct amdgpu_device *adev = to_amdgpu_device(control);
-	unsigned char buff[EEPROM_TABLE_HEADER_SIZE] = { 0 };
+	unsigned char buff[RAS_TABLE_HEADER_SIZE] = { 0 };
 	struct amdgpu_ras_eeprom_table_header *hdr = &control->tbl_hdr;
 	struct amdgpu_ras *ras = amdgpu_ras_get_context(adev);
 
@@ -300,8 +299,8 @@ int amdgpu_ras_eeprom_init(struct amdgpu_ras_eeprom_control *control,
 
 	/* Read/Create table header from EEPROM address 0 */
 	ret = amdgpu_eeprom_xfer(&adev->pm.smu_i2c,
-				 control->i2c_address + EEPROM_HDR_START,
-				 buff, EEPROM_TABLE_HEADER_SIZE, true);
+				 control->i2c_address + RAS_HDR_START,
+				 buff, RAS_TABLE_HEADER_SIZE, true);
 	if (ret < 1) {
 		DRM_ERROR("Failed to read EEPROM table header, ret:%d", ret);
 		return ret;
@@ -309,22 +308,22 @@ int amdgpu_ras_eeprom_init(struct amdgpu_ras_eeprom_control *control,
 
 	__decode_table_header_from_buff(hdr, &buff[2]);
 
-	if (hdr->header == EEPROM_TABLE_HDR_VAL) {
-		control->num_recs = (hdr->tbl_size - EEPROM_TABLE_HEADER_SIZE) /
-				    EEPROM_TABLE_RECORD_SIZE;
+	if (hdr->header == RAS_TABLE_HDR_VAL) {
+		control->num_recs = (hdr->tbl_size - RAS_TABLE_HEADER_SIZE) /
+				    RAS_TABLE_RECORD_SIZE;
 		control->tbl_byte_sum = __calc_hdr_byte_sum(control);
-		control->next_addr = EEPROM_RECORD_START;
+		control->next_addr = RAS_RECORD_START;
 
 		DRM_DEBUG_DRIVER("Found existing EEPROM table with %d records",
 				 control->num_recs);
 
-	} else if ((hdr->header == EEPROM_TABLE_HDR_BAD) &&
+	} else if ((hdr->header == RAS_TABLE_HDR_BAD) &&
 			(amdgpu_bad_page_threshold != 0)) {
 		if (ras->bad_page_cnt_threshold > control->num_recs) {
 			dev_info(adev->dev, "Using one valid bigger bad page "
 				"threshold and correcting eeprom header tag.\n");
 			ret = amdgpu_ras_eeprom_correct_header_tag(control,
-							EEPROM_TABLE_HDR_VAL);
+							RAS_TABLE_HDR_VAL);
 		} else {
 			*exceed_err_limit = true;
 			dev_err(adev->dev, "Exceeding the bad_page_threshold parameter, "
@@ -398,12 +397,12 @@ static void __decode_table_record_from_buff(struct amdgpu_ras_eeprom_control *co
  */
 static uint32_t __correct_eeprom_dest_address(uint32_t curr_address)
 {
-	uint32_t next_address = curr_address + EEPROM_TABLE_RECORD_SIZE;
+	u32 next_address = curr_address + RAS_TABLE_RECORD_SIZE;
 
 	/* When all EEPROM memory used jump back to 0 address */
-	if (next_address >= EEPROM_SIZE_BYTES) {
+	if (next_address >= RAS_TBL_SIZE_BYTES) {
 		DRM_INFO("Reached end of EEPROM memory, wrap around to 0.");
-		return EEPROM_RECORD_START;
+		return RAS_RECORD_START;
 	}
 
 	return curr_address;
@@ -411,7 +410,6 @@ static uint32_t __correct_eeprom_dest_address(uint32_t curr_address)
 
 bool amdgpu_ras_eeprom_check_err_threshold(struct amdgpu_device *adev)
 {
-
 	struct amdgpu_ras *con = amdgpu_ras_get_context(adev);
 
 	if (!__is_ras_eeprom_supported(adev))
@@ -424,7 +422,7 @@ bool amdgpu_ras_eeprom_check_err_threshold(struct amdgpu_device *adev)
 		if (!(con->features & BIT(AMDGPU_RAS_BLOCK__UMC)))
 			return false;
 
-	if (con->eeprom_control.tbl_hdr.header == EEPROM_TABLE_HDR_BAD) {
+	if (con->eeprom_control.tbl_hdr.header == RAS_TABLE_HDR_BAD) {
 		dev_warn(adev->dev, "This GPU is in BAD status.");
 		dev_warn(adev->dev, "Please retire it or setting one bigger "
 				"threshold value when reloading driver.\n");
@@ -447,8 +445,7 @@ int amdgpu_ras_eeprom_process_recods(struct amdgpu_ras_eeprom_control *control,
 	if (!__is_ras_eeprom_supported(adev))
 		return 0;
 
-	buffs = kcalloc(num, EEPROM_ADDRESS_SIZE + EEPROM_TABLE_RECORD_SIZE,
-			GFP_KERNEL);
+	buffs = kcalloc(num, RAS_TABLE_RECORD_SIZE, GFP_KERNEL);
 	if (!buffs)
 		return -ENOMEM;
 
@@ -470,14 +467,14 @@ int amdgpu_ras_eeprom_process_recods(struct amdgpu_ras_eeprom_control *control,
 		dev_warn(adev->dev,
 			"Saved bad pages(%d) reaches threshold value(%d).\n",
 			control->num_recs + num, ras->bad_page_cnt_threshold);
-		control->tbl_hdr.header = EEPROM_TABLE_HDR_BAD;
+		control->tbl_hdr.header = RAS_TABLE_HDR_BAD;
 	}
 
 	/* In case of overflow just start from beginning to not lose newest records */
 	if (write &&
 	    (control->next_addr +
-	     EEPROM_TABLE_RECORD_SIZE * num >= EEPROM_SIZE_BYTES))
-		control->next_addr = EEPROM_RECORD_START;
+	     RAS_TABLE_RECORD_SIZE * num >= RAS_TBL_SIZE_BYTES))
+		control->next_addr = RAS_RECORD_START;
 
 	/*
 	 * TODO Currently makes EEPROM writes for each record, this creates
@@ -485,7 +482,7 @@ int amdgpu_ras_eeprom_process_recods(struct amdgpu_ras_eeprom_control *control,
 	 * 256b
 	 */
 	for (i = 0; i < num; i++) {
-		buff = &buffs[i * EEPROM_TABLE_RECORD_SIZE];
+		buff = &buffs[i * RAS_TABLE_RECORD_SIZE];
 		record = &records[i];
 
 		control->next_addr = __correct_eeprom_dest_address(control->next_addr);
@@ -498,7 +495,7 @@ int amdgpu_ras_eeprom_process_recods(struct amdgpu_ras_eeprom_control *control,
 		down_read(&adev->reset_sem);
 		ret = amdgpu_eeprom_xfer(&adev->pm.smu_i2c,
 					 control->i2c_address + control->next_addr,
-					 buff, EEPROM_TABLE_RECORD_SIZE, !write);
+					 buff, RAS_TABLE_RECORD_SIZE, !write);
 		up_read(&adev->reset_sem);
 
 		if (ret < 1) {
@@ -511,12 +508,12 @@ int amdgpu_ras_eeprom_process_recods(struct amdgpu_ras_eeprom_control *control,
 		 * The destination EEPROM address might need to be corrected to account
 		 * for page or entire memory wrapping
 		 */
-		control->next_addr += EEPROM_TABLE_RECORD_SIZE;
+		control->next_addr += RAS_TABLE_RECORD_SIZE;
 	}
 
 	if (!write) {
 		for (i = 0; i < num; i++) {
-			buff = &buffs[i*EEPROM_TABLE_RECORD_SIZE];
+			buff = &buffs[i * RAS_TABLE_RECORD_SIZE];
 			record = &records[i];
 
 			__decode_table_record_from_buff(control, record, buff);
@@ -534,11 +531,11 @@ int amdgpu_ras_eeprom_process_recods(struct amdgpu_ras_eeprom_control *control,
 		 * TODO - Check the assumption is correct
 		 */
 		control->num_recs += num;
-		control->num_recs %= EEPROM_MAX_RECORD_NUM;
-		control->tbl_hdr.tbl_size += EEPROM_TABLE_RECORD_SIZE * num;
-		if (control->tbl_hdr.tbl_size > EEPROM_SIZE_BYTES)
-			control->tbl_hdr.tbl_size = EEPROM_TABLE_HEADER_SIZE +
-			control->num_recs * EEPROM_TABLE_RECORD_SIZE;
+		control->num_recs %= RAS_MAX_RECORD_NUM;
+		control->tbl_hdr.tbl_size += RAS_TABLE_RECORD_SIZE * num;
+		if (control->tbl_hdr.tbl_size > RAS_TBL_SIZE_BYTES)
+			control->tbl_hdr.tbl_size = RAS_TABLE_HEADER_SIZE +
+			control->num_recs * RAS_TABLE_RECORD_SIZE;
 
 		__update_tbl_checksum(control, records, num, old_hdr_byte_sum);
 
@@ -559,7 +556,7 @@ int amdgpu_ras_eeprom_process_recods(struct amdgpu_ras_eeprom_control *control,
 
 inline uint32_t amdgpu_ras_eeprom_get_record_max_length(void)
 {
-	return EEPROM_MAX_RECORD_NUM;
+	return RAS_MAX_RECORD_NUM;
 }
 
 /* Used for testing if bugs encountered */
@@ -581,7 +578,7 @@ void amdgpu_ras_eeprom_test(struct amdgpu_ras_eeprom_control *control)
 
 		memset(recs, 0, sizeof(*recs) * 1);
 
-		control->next_addr = EEPROM_RECORD_START;
+		control->next_addr = RAS_RECORD_START;
 
 		if (!amdgpu_ras_eeprom_process_recods(control, recs, false, 1)) {
 			for (i = 0; i < 1; i++)
-- 
2.32.0


* [PATCH 26/40] drm/amdgpu: Rename misspelled function
  2021-06-08 21:39 [PATCH 00/40] I2C fixes Luben Tuikov
                   ` (24 preceding siblings ...)
  2021-06-08 21:39 ` [PATCH 25/40] drm/amdgpu: RAS: EEPROM --> RAS Luben Tuikov
@ 2021-06-08 21:39 ` Luben Tuikov
  2021-06-10 21:04   ` Alex Deucher
  2021-06-08 21:39 ` [PATCH 27/40] drm/amdgpu: RAS xfer to read/write Luben Tuikov
                   ` (13 subsequent siblings)
  39 siblings, 1 reply; 74+ messages in thread
From: Luben Tuikov @ 2021-06-08 21:39 UTC (permalink / raw)
  To: amd-gfx
  Cc: Andrey Grodzovsky, Lijo Lazar, Luben Tuikov, Stanley Yang,
	Alexander Deucher, Jean Delvare, Hawking Zhang

Instead of fixing the spelling of
  amdgpu_ras_eeprom_process_recods(),
rename it to
  amdgpu_ras_eeprom_xfer(),
so that it looks similar to other I2C and protocol
transfer (read/write) functions.

The shorter name also keeps the column span
within reason.

Change the "num" function parameter from "int" to
"const u32" since it is the number of items
(records) to xfer, i.e. their count, which cannot
be a negative number.

Also swap the order of parameters, keeping the
pointer to records and their number next to each
other, while the direction now becomes the last
parameter.

Cc: Jean Delvare <jdelvare@suse.de>
Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Cc: Lijo Lazar <Lijo.Lazar@amd.com>
Cc: Stanley Yang <Stanley.Yang@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c        | 11 +++++------
 drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c | 10 +++++-----
 drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h |  7 +++----
 3 files changed, 13 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
index ec936cde272602..beaa1fee7f71f3 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
@@ -1817,10 +1817,10 @@ int amdgpu_ras_save_bad_pages(struct amdgpu_device *adev)
 	save_count = data->count - control->num_recs;
 	/* only new entries are saved */
 	if (save_count > 0) {
-		if (amdgpu_ras_eeprom_process_recods(control,
-							&data->bps[control->num_recs],
-							true,
-							save_count)) {
+		if (amdgpu_ras_eeprom_xfer(control,
+					   &data->bps[control->num_recs],
+					   save_count,
+					   true)) {
 			dev_err(adev->dev, "Failed to save EEPROM table data!");
 			return -EIO;
 		}
@@ -1850,8 +1850,7 @@ static int amdgpu_ras_load_bad_pages(struct amdgpu_device *adev)
 	if (!bps)
 		return -ENOMEM;
 
-	if (amdgpu_ras_eeprom_process_recods(control, bps, false,
-		control->num_recs)) {
+	if (amdgpu_ras_eeprom_xfer(control, bps, control->num_recs, false)) {
 		dev_err(adev->dev, "Failed to load EEPROM table records!");
 		ret = -EIO;
 		goto out;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
index d3678706bb736d..9e3fbc44b4bc4a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
@@ -432,9 +432,9 @@ bool amdgpu_ras_eeprom_check_err_threshold(struct amdgpu_device *adev)
 	return false;
 }
 
-int amdgpu_ras_eeprom_process_recods(struct amdgpu_ras_eeprom_control *control,
-				     struct eeprom_table_record *records,
-				     bool write, int num)
+int amdgpu_ras_eeprom_xfer(struct amdgpu_ras_eeprom_control *control,
+			   struct eeprom_table_record *records,
+			   const u32 num, bool write)
 {
 	int i, ret = 0;
 	unsigned char *buffs, *buff;
@@ -574,13 +574,13 @@ void amdgpu_ras_eeprom_test(struct amdgpu_ras_eeprom_control *control)
 		recs[i].retired_page = i;
 	}
 
-	if (!amdgpu_ras_eeprom_process_recods(control, recs, true, 1)) {
+	if (!amdgpu_ras_eeprom_xfer(control, recs, 1, true)) {
 
 		memset(recs, 0, sizeof(*recs) * 1);
 
 		control->next_addr = RAS_RECORD_START;
 
-		if (!amdgpu_ras_eeprom_process_recods(control, recs, false, 1)) {
+		if (!amdgpu_ras_eeprom_xfer(control, recs, 1, false)) {
 			for (i = 0; i < 1; i++)
 				DRM_INFO("rec.address :0x%llx, rec.retired_page :%llu",
 					 recs[i].address, recs[i].retired_page);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h
index 4c4c3d840a35c5..6a1bd527bce57a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h
@@ -82,10 +82,9 @@ int amdgpu_ras_eeprom_reset_table(struct amdgpu_ras_eeprom_control *control);
 
 bool amdgpu_ras_eeprom_check_err_threshold(struct amdgpu_device *adev);
 
-int amdgpu_ras_eeprom_process_recods(struct amdgpu_ras_eeprom_control *control,
-					    struct eeprom_table_record *records,
-					    bool write,
-					    int num);
+int amdgpu_ras_eeprom_xfer(struct amdgpu_ras_eeprom_control *control,
+			   struct eeprom_table_record *records,
+			   const u32 num, bool write);
 
 inline uint32_t amdgpu_ras_eeprom_get_record_max_length(void);
 
-- 
2.32.0


* [PATCH 27/40] drm/amdgpu: RAS xfer to read/write
  2021-06-08 21:39 [PATCH 00/40] I2C fixes Luben Tuikov
                   ` (25 preceding siblings ...)
  2021-06-08 21:39 ` [PATCH 26/40] drm/amdgpu: Rename misspelled function Luben Tuikov
@ 2021-06-08 21:39 ` Luben Tuikov
  2021-06-10 21:05   ` Alex Deucher
  2021-06-08 21:39 ` [PATCH 28/40] drm/amdgpu: EEPROM: add explicit read and write Luben Tuikov
                   ` (12 subsequent siblings)
  39 siblings, 1 reply; 74+ messages in thread
From: Luben Tuikov @ 2021-06-08 21:39 UTC (permalink / raw)
  To: amd-gfx
  Cc: Andrey Grodzovsky, Lijo Lazar, Luben Tuikov, Stanley Yang,
	Alexander Deucher, Jean Delvare, Hawking Zhang

Wrap amdgpu_ras_eeprom_xfer(..., bool write)
in amdgpu_ras_eeprom_read() and
amdgpu_ras_eeprom_write(), as that makes the code
easier to read and understand.
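
The pattern can be shown in isolation with a minimal standalone
sketch. Everything named sketch_* here is hypothetical and stands in
for the driver types and functions; the stub transfer only records
the requested direction so that the effect of each wrapper is
visible.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for struct eeprom_table_record; illustrative only. */
struct sketch_record {
	uint64_t address;
};

/* Direction requested by the most recent transfer. */
static bool sketch_last_was_write;

/* Single worker taking a direction flag, kept private to the file. */
static int sketch_xfer(struct sketch_record *recs, uint32_t num, bool write)
{
	(void)recs;
	(void)num;
	sketch_last_was_write = write;
	return 0;
}

/* Public entry points: the direction is spelled out in the name. */
int sketch_read(struct sketch_record *recs, uint32_t num)
{
	return sketch_xfer(recs, num, false);
}

int sketch_write(struct sketch_record *recs, uint32_t num)
{
	return sketch_xfer(recs, num, true);
}
```

A call site then reads sketch_write(recs, n) rather than
sketch_xfer(recs, n, true), so the direction no longer has to be
decoded from a bare boolean argument.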

Cc: Jean Delvare <jdelvare@suse.de>
Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Cc: Lijo Lazar <Lijo.Lazar@amd.com>
Cc: Stanley Yang <Stanley.Yang@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c       |  9 ++++---
 .../gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c    | 24 +++++++++++++++----
 .../gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h    |  8 ++++---
 3 files changed, 28 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
index beaa1fee7f71f3..e3ad081eddd40b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
@@ -1817,10 +1817,9 @@ int amdgpu_ras_save_bad_pages(struct amdgpu_device *adev)
 	save_count = data->count - control->num_recs;
 	/* only new entries are saved */
 	if (save_count > 0) {
-		if (amdgpu_ras_eeprom_xfer(control,
-					   &data->bps[control->num_recs],
-					   save_count,
-					   true)) {
+		if (amdgpu_ras_eeprom_write(control,
+					    &data->bps[control->num_recs],
+					    save_count)) {
 			dev_err(adev->dev, "Failed to save EEPROM table data!");
 			return -EIO;
 		}
@@ -1850,7 +1849,7 @@ static int amdgpu_ras_load_bad_pages(struct amdgpu_device *adev)
 	if (!bps)
 		return -ENOMEM;
 
-	if (amdgpu_ras_eeprom_xfer(control, bps, control->num_recs, false)) {
+	if (amdgpu_ras_eeprom_read(control, bps, control->num_recs)) {
 		dev_err(adev->dev, "Failed to load EEPROM table records!");
 		ret = -EIO;
 		goto out;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
index 9e3fbc44b4bc4a..550a31953d2da1 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
@@ -432,9 +432,9 @@ bool amdgpu_ras_eeprom_check_err_threshold(struct amdgpu_device *adev)
 	return false;
 }
 
-int amdgpu_ras_eeprom_xfer(struct amdgpu_ras_eeprom_control *control,
-			   struct eeprom_table_record *records,
-			   const u32 num, bool write)
+static int amdgpu_ras_eeprom_xfer(struct amdgpu_ras_eeprom_control *control,
+				  struct eeprom_table_record *records,
+				  const u32 num, bool write)
 {
 	int i, ret = 0;
 	unsigned char *buffs, *buff;
@@ -554,6 +554,20 @@ int amdgpu_ras_eeprom_xfer(struct amdgpu_ras_eeprom_control *control,
 	return ret == num ? 0 : -EIO;
 }
 
+int amdgpu_ras_eeprom_read(struct amdgpu_ras_eeprom_control *control,
+			   struct eeprom_table_record *records,
+			   const u32 num)
+{
+	return amdgpu_ras_eeprom_xfer(control, records, num, false);
+}
+
+int amdgpu_ras_eeprom_write(struct amdgpu_ras_eeprom_control *control,
+			    struct eeprom_table_record *records,
+			    const u32 num)
+{
+	return amdgpu_ras_eeprom_xfer(control, records, num, true);
+}
+
 inline uint32_t amdgpu_ras_eeprom_get_record_max_length(void)
 {
 	return RAS_MAX_RECORD_NUM;
@@ -574,13 +588,13 @@ void amdgpu_ras_eeprom_test(struct amdgpu_ras_eeprom_control *control)
 		recs[i].retired_page = i;
 	}
 
-	if (!amdgpu_ras_eeprom_xfer(control, recs, 1, true)) {
+	if (!amdgpu_ras_eeprom_write(control, recs, 1)) {
 
 		memset(recs, 0, sizeof(*recs) * 1);
 
 		control->next_addr = RAS_RECORD_START;
 
-		if (!amdgpu_ras_eeprom_xfer(control, recs, 1, false)) {
+		if (!amdgpu_ras_eeprom_read(control, recs, 1)) {
 			for (i = 0; i < 1; i++)
 				DRM_INFO("rec.address :0x%llx, rec.retired_page :%llu",
 					 recs[i].address, recs[i].retired_page);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h
index 6a1bd527bce57a..fa9c509a8e2f2b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h
@@ -82,9 +82,11 @@ int amdgpu_ras_eeprom_reset_table(struct amdgpu_ras_eeprom_control *control);
 
 bool amdgpu_ras_eeprom_check_err_threshold(struct amdgpu_device *adev);
 
-int amdgpu_ras_eeprom_xfer(struct amdgpu_ras_eeprom_control *control,
-			   struct eeprom_table_record *records,
-			   const u32 num, bool write);
+int amdgpu_ras_eeprom_read(struct amdgpu_ras_eeprom_control *control,
+			   struct eeprom_table_record *records, const u32 num);
+
+int amdgpu_ras_eeprom_write(struct amdgpu_ras_eeprom_control *control,
+			    struct eeprom_table_record *records, const u32 num);
 
 inline uint32_t amdgpu_ras_eeprom_get_record_max_length(void);
 
-- 
2.32.0


* [PATCH 28/40] drm/amdgpu: EEPROM: add explicit read and write
  2021-06-08 21:39 [PATCH 00/40] I2C fixes Luben Tuikov
                   ` (26 preceding siblings ...)
  2021-06-08 21:39 ` [PATCH 27/40] drm/amdgpu: RAS xfer to read/write Luben Tuikov
@ 2021-06-08 21:39 ` Luben Tuikov
  2021-06-10 21:06   ` Alex Deucher
  2021-06-08 21:39 ` [PATCH 29/40] drm/amd/pm: Extend the I2C quirk table Luben Tuikov
                   ` (11 subsequent siblings)
  39 siblings, 1 reply; 74+ messages in thread
From: Luben Tuikov @ 2021-06-08 21:39 UTC (permalink / raw)
  To: amd-gfx
  Cc: Andrey Grodzovsky, Lijo Lazar, Luben Tuikov, Stanley Yang,
	Alexander Deucher, Jean Delvare, Hawking Zhang

Add explicit amdgpu_eeprom_read() and
amdgpu_eeprom_write() for clarity.

Cc: Jean Delvare <jdelvare@suse.de>
Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Cc: Lijo Lazar <Lijo.Lazar@amd.com>
Cc: Stanley Yang <Stanley.Yang@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.h     | 16 ++++++++++++++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c |  5 ++---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c | 10 +++++-----
 3 files changed, 23 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.h
index 417472be2712e6..966b434f0de2b7 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.h
@@ -29,4 +29,20 @@
 int amdgpu_eeprom_xfer(struct i2c_adapter *i2c_adap, u32 eeprom_addr,
 		       u8 *eeprom_buf, u16 bytes, bool read);
 
+static inline int amdgpu_eeprom_read(struct i2c_adapter *i2c_adap,
+				     u32 eeprom_addr, u8 *eeprom_buf,
+				     u16 bytes)
+{
+	return amdgpu_eeprom_xfer(i2c_adap, eeprom_addr, eeprom_buf, bytes,
+				  true);
+}
+
+static inline int amdgpu_eeprom_write(struct i2c_adapter *i2c_adap,
+				      u32 eeprom_addr, u8 *eeprom_buf,
+				      u16 bytes)
+{
+	return amdgpu_eeprom_xfer(i2c_adap, eeprom_addr, eeprom_buf, bytes,
+				  false);
+}
+
 #endif
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c
index 69b9559f840ac3..7709caeb233d67 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c
@@ -66,7 +66,7 @@ static int amdgpu_fru_read_eeprom(struct amdgpu_device *adev, uint32_t addrptr,
 {
 	int ret, size;
 
-	ret = amdgpu_eeprom_xfer(&adev->pm.smu_i2c, addrptr, buff, 1, true);
+	ret = amdgpu_eeprom_read(&adev->pm.smu_i2c, addrptr, buff, 1);
 	if (ret < 1) {
 		DRM_WARN("FRU: Failed to get size field");
 		return ret;
@@ -77,8 +77,7 @@ static int amdgpu_fru_read_eeprom(struct amdgpu_device *adev, uint32_t addrptr,
 	 */
 	size = buff[0] - I2C_PRODUCT_INFO_OFFSET;
 
-	ret = amdgpu_eeprom_xfer(&adev->pm.smu_i2c, addrptr + 1, buff, size,
-				 true);
+	ret = amdgpu_eeprom_read(&adev->pm.smu_i2c, addrptr + 1, buff, size);
 	if (ret < 1) {
 		DRM_WARN("FRU: Failed to get data field");
 		return ret;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
index 550a31953d2da1..17cea35275e46c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
@@ -151,9 +151,9 @@ static int __update_table_header(struct amdgpu_ras_eeprom_control *control,
 
 	/* i2c may be unstable in gpu reset */
 	down_read(&adev->reset_sem);
-	ret = amdgpu_eeprom_xfer(&adev->pm.smu_i2c,
-				 control->i2c_address + RAS_HDR_START,
-				 buff, RAS_TABLE_HEADER_SIZE, false);
+	ret = amdgpu_eeprom_write(&adev->pm.smu_i2c,
+				  control->i2c_address + RAS_HDR_START,
+				  buff, RAS_TABLE_HEADER_SIZE);
 	up_read(&adev->reset_sem);
 
 	if (ret < 1)
@@ -298,9 +298,9 @@ int amdgpu_ras_eeprom_init(struct amdgpu_ras_eeprom_control *control,
 	mutex_init(&control->tbl_mutex);
 
 	/* Read/Create table header from EEPROM address 0 */
-	ret = amdgpu_eeprom_xfer(&adev->pm.smu_i2c,
+	ret = amdgpu_eeprom_read(&adev->pm.smu_i2c,
 				 control->i2c_address + RAS_HDR_START,
-				 buff, RAS_TABLE_HEADER_SIZE, true);
+				 buff, RAS_TABLE_HEADER_SIZE);
 	if (ret < 1) {
 		DRM_ERROR("Failed to read EEPROM table header, ret:%d", ret);
 		return ret;
-- 
2.32.0


* [PATCH 29/40] drm/amd/pm: Extend the I2C quirk table
  2021-06-08 21:39 [PATCH 00/40] I2C fixes Luben Tuikov
                   ` (27 preceding siblings ...)
  2021-06-08 21:39 ` [PATCH 28/40] drm/amdgpu: EEPROM: add explicit read and write Luben Tuikov
@ 2021-06-08 21:39 ` Luben Tuikov
  2021-06-10 21:07   ` Alex Deucher
  2021-06-08 21:39 ` [PATCH 30/40] drm/amd/pm: Simplify managed I2C transfer functions Luben Tuikov
                   ` (10 subsequent siblings)
  39 siblings, 1 reply; 74+ messages in thread
From: Luben Tuikov @ 2021-06-08 21:39 UTC (permalink / raw)
  To: amd-gfx
  Cc: Andrey Grodzovsky, Lijo Lazar, Luben Tuikov, Stanley Yang,
	Alexander Deucher, Jean Delvare, Hawking Zhang

Extend the I2C quirk table for SMU access
controlled I2C adapters. Let the kernel I2C layer
check that the messages all have the same address,
and that their combined size doesn't exceed the
maximum size of a SMU software I2C request.
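
For illustration only, and not part of this patch: a simplified
user-space model of the kind of check a quirk table lets the I2C
core perform on a two-message combined transfer. The struct and
function names (demo_msg, demo_quirks, demo_comb_ok) and the
value of DEMO_MAX_CMDS are assumptions made for the sketch; the
real check lives in the kernel I2C core and covers more cases.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_CMDS 24 /* assumed value, for illustration only */

struct demo_msg {
	uint16_t addr;
	uint16_t len;
};

struct demo_quirks {
	bool comb_same_addr;
	uint16_t max_comb_1st_msg_len;
	uint16_t max_comb_2nd_msg_len;
};

/* Returns true if a two-message combined transfer passes the limits. */
static bool demo_comb_ok(const struct demo_quirks *q,
			 const struct demo_msg *m)
{
	assert(q != NULL && m != NULL);
	if (q->comb_same_addr && m[0].addr != m[1].addr)
		return false;
	if (m[0].len > q->max_comb_1st_msg_len)
		return false;
	if (m[1].len > q->max_comb_2nd_msg_len)
		return false;
	return true;
}

/* The two limits add up to DEMO_MAX_CMDS, bounding the combined size. */
static const struct demo_quirks demo_q = {
	.comb_same_addr = true,
	.max_comb_1st_msg_len = 2,
	.max_comb_2nd_msg_len = DEMO_MAX_CMDS - 2,
};
```

Because the two per-message limits sum to the request size, a
transfer that passes this check cannot overflow the request.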

Suggested-by: Jean Delvare <jdelvare@suse.de>
Cc: Jean Delvare <jdelvare@suse.de>
Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Cc: Lijo Lazar <Lijo.Lazar@amd.com>
Cc: Stanley Yang <Stanley.Yang@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
---
 drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c       | 5 ++++-
 drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c         | 5 ++++-
 drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c | 5 ++++-
 3 files changed, 12 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c
index 974740ac72fded..de8d7513042966 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c
@@ -2006,8 +2006,11 @@ static const struct i2c_algorithm arcturus_i2c_algo = {
 
 
 static const struct i2c_adapter_quirks arcturus_i2c_control_quirks = {
-	.max_read_len = MAX_SW_I2C_COMMANDS,
+	.flags = I2C_AQ_COMB | I2C_AQ_COMB_SAME_ADDR,
+	.max_read_len  = MAX_SW_I2C_COMMANDS,
 	.max_write_len = MAX_SW_I2C_COMMANDS,
+	.max_comb_1st_msg_len = 2,
+	.max_comb_2nd_msg_len = MAX_SW_I2C_COMMANDS - 2,
 };
 
 static int arcturus_i2c_control_init(struct smu_context *smu, struct i2c_adapter *control)
diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
index 8ab06fa87edb04..1b8cd3746d0ebc 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
@@ -2800,8 +2800,11 @@ static const struct i2c_algorithm navi10_i2c_algo = {
 };
 
 static const struct i2c_adapter_quirks navi10_i2c_control_quirks = {
-	.max_read_len = MAX_SW_I2C_COMMANDS,
+	.flags = I2C_AQ_COMB | I2C_AQ_COMB_SAME_ADDR,
+	.max_read_len  = MAX_SW_I2C_COMMANDS,
 	.max_write_len = MAX_SW_I2C_COMMANDS,
+	.max_comb_1st_msg_len = 2,
+	.max_comb_2nd_msg_len = MAX_SW_I2C_COMMANDS - 2,
 };
 
 static int navi10_i2c_control_init(struct smu_context *smu, struct i2c_adapter *control)
diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
index 91614ae186f7f5..b38127f8009d3d 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
@@ -3488,8 +3488,11 @@ static const struct i2c_algorithm sienna_cichlid_i2c_algo = {
 };
 
 static const struct i2c_adapter_quirks sienna_cichlid_i2c_control_quirks = {
-	.max_read_len = MAX_SW_I2C_COMMANDS,
+	.flags = I2C_AQ_COMB | I2C_AQ_COMB_SAME_ADDR,
+	.max_read_len  = MAX_SW_I2C_COMMANDS,
 	.max_write_len = MAX_SW_I2C_COMMANDS,
+	.max_comb_1st_msg_len = 2,
+	.max_comb_2nd_msg_len = MAX_SW_I2C_COMMANDS - 2,
 };
 
 static int sienna_cichlid_i2c_control_init(struct smu_context *smu, struct i2c_adapter *control)
-- 
2.32.0


* [PATCH 30/40] drm/amd/pm: Simplify managed I2C transfer functions
  2021-06-08 21:39 [PATCH 00/40] I2C fixes Luben Tuikov
                   ` (28 preceding siblings ...)
  2021-06-08 21:39 ` [PATCH 29/40] drm/amd/pm: Extend the I2C quirk table Luben Tuikov
@ 2021-06-08 21:39 ` Luben Tuikov
  2021-06-10 21:08   ` Alex Deucher
  2021-06-08 21:39 ` [PATCH 31/40] drm/amdgpu: Fix width of I2C address Luben Tuikov
                   ` (9 subsequent siblings)
  39 siblings, 1 reply; 74+ messages in thread
From: Luben Tuikov @ 2021-06-08 21:39 UTC (permalink / raw)
  To: amd-gfx
  Cc: Andrey Grodzovsky, Lijo Lazar, Luben Tuikov, Stanley Yang,
	Alexander Deucher, Jean Delvare, Hawking Zhang

Now that we have an I2C quirk table for
SMU-managed I2C controllers, the I2C core does the
checks for us, so we don't need to do them, and so
simplify the managed I2C transfer functions.

Also, for Arcturus and Navi10, fix setting the
command type from "cmd->CmdConfig" to "cmd->Cmd".
The latter appears to take the I2C_CMD_...
enumeration as an integer, not as a bit-flag.

For Sienna, the "Cmd" field seems to have been
eliminated, and command type and flags all live in
the "CmdConfig" field--this is left untouched.

Fix: Detect and add changing of direction
bit-flag, as this is necessary for the SMU to
detect the direction change in the 1-d array of
data it gets.
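
For illustration only, and not part of this patch: a user-space
sketch of how a list of messages can be flattened into a 1-d
array of per-byte flags, marking a RESTART where the direction
changes and a STOP on the last byte. The names and flag values
(demo_msg, demo_flatten, DEMO_RESTART, DEMO_STOP, DEMO_MAX_CMDS)
are hypothetical, and a per-message explicit STOP request is left
out to keep the sketch short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_M_RD     0x1 /* message is a read */
#define DEMO_RESTART  0x1 /* hypothetical per-byte flag bits */
#define DEMO_STOP     0x2
#define DEMO_MAX_CMDS 24  /* assumed value, for illustration only */

struct demo_msg {
	uint16_t flags;
	uint16_t len;
};

/*
 * Fill cfg[] with one flag byte per transferred byte and return the
 * number of entries used, or -1 if the messages do not fit.
 */
static int demo_flatten(const struct demo_msg *msg, int num_msgs,
			uint8_t cfg[DEMO_MAX_CMDS])
{
	uint16_t dir;
	int i, j, c = 0;

	assert(msg != NULL && num_msgs > 0);
	dir = msg[0].flags & DEMO_M_RD;

	for (i = 0; i < num_msgs; i++) {
		for (j = 0; j < msg[i].len; j++, c++) {
			if (c >= DEMO_MAX_CMDS)
				return -1;
			cfg[c] = 0;
			if ((dir ^ msg[i].flags) & DEMO_M_RD) {
				/* Direction changed: mark a RESTART here. */
				dir = msg[i].flags & DEMO_M_RD;
				cfg[c] |= DEMO_RESTART;
			}
			if (i == num_msgs - 1 && j == msg[i].len - 1) {
				/* Last byte of the transaction: STOP only. */
				cfg[c] &= (uint8_t)~DEMO_RESTART;
				cfg[c] |= DEMO_STOP;
			}
		}
	}
	return c;
}
```

For a 2-byte write followed by a 3-byte read, only the first byte
of the read carries the RESTART flag and only the final byte
carries STOP.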

Cc: Jean Delvare <jdelvare@suse.de>
Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Cc: Lijo Lazar <Lijo.Lazar@amd.com>
Cc: Stanley Yang <Stanley.Yang@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
---
 .../gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c | 78 ++++++++-----------
 .../gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c   | 78 ++++++++-----------
 .../amd/pm/swsmu/smu11/sienna_cichlid_ppt.c   | 76 ++++++++----------
 3 files changed, 95 insertions(+), 137 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c
index de8d7513042966..0db79a5236e1f1 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c
@@ -1907,31 +1907,14 @@ static int arcturus_dpm_set_vcn_enable(struct smu_context *smu, bool enable)
 }
 
 static int arcturus_i2c_xfer(struct i2c_adapter *i2c_adap,
-			     struct i2c_msg *msgs, int num)
+			     struct i2c_msg *msg, int num_msgs)
 {
 	struct amdgpu_device *adev = to_amdgpu_device(i2c_adap);
 	struct smu_table_context *smu_table = &adev->smu.smu_table;
 	struct smu_table *table = &smu_table->driver_table;
 	SwI2cRequest_t *req, *res = (SwI2cRequest_t *)table->cpu_addr;
-	short available_bytes = MAX_SW_I2C_COMMANDS;
-	int i, j, r, c, num_done = 0;
-	u8 slave;
-
-	/* only support a single slave addr per transaction */
-	slave = msgs[0].addr;
-	for (i = 0; i < num; i++) {
-		if (slave != msgs[i].addr)
-			return -EINVAL;
-
-		available_bytes -= msgs[i].len;
-		if (available_bytes >= 0) {
-			num_done++;
-		} else {
-			/* This message and all the follwing won't be processed */
-			available_bytes += msgs[i].len;
-			break;
-		}
-	}
+	int i, j, r, c;
+	u16 dir;
 
 	req = kzalloc(sizeof(*req), GFP_KERNEL);
 	if (!req)
@@ -1939,33 +1922,38 @@ static int arcturus_i2c_xfer(struct i2c_adapter *i2c_adap,
 
 	req->I2CcontrollerPort = 1;
 	req->I2CSpeed = I2C_SPEED_FAST_400K;
-	req->SlaveAddress = slave << 1; /* 8 bit addresses */
-	req->NumCmds = MAX_SW_I2C_COMMANDS - available_bytes;;
-
-	c = 0;
-	for (i = 0; i < num_done; i++) {
-		struct i2c_msg *msg = &msgs[i];
+	req->SlaveAddress = msg[0].addr << 1; /* wants an 8-bit address */
+	dir = msg[0].flags & I2C_M_RD;
 
-		for (j = 0; j < msg->len; j++) {
-			SwI2cCmd_t *cmd = &req->SwI2cCmds[c++];
+	for (c = i = 0; i < num_msgs; i++) {
+		for (j = 0; j < msg[i].len; j++, c++) {
+			SwI2cCmd_t *cmd = &req->SwI2cCmds[c];
 
 			if (!(msg[i].flags & I2C_M_RD)) {
 				/* write */
-				cmd->CmdConfig |= I2C_CMD_WRITE;
-				cmd->RegisterAddr = msg->buf[j];
+				cmd->Cmd = I2C_CMD_WRITE;
+				cmd->RegisterAddr = msg[i].buf[j];
+			}
+
+			if ((dir ^ msg[i].flags) & I2C_M_RD) {
+				/* The direction changes.
+				 */
+				dir = msg[i].flags & I2C_M_RD;
+				cmd->CmdConfig |= CMDCONFIG_RESTART_MASK;
 			}
 
+			req->NumCmds++;
+
 			/*
 			 * Insert STOP if we are at the last byte of either last
 			 * message for the transaction or the client explicitly
 			 * requires a STOP at this particular message.
 			 */
-			if ((j == msg->len -1 ) &&
-			    ((i == num_done - 1) || (msg[i].flags & I2C_M_STOP)))
+			if ((j == msg[i].len - 1) &&
+			    ((i == num_msgs - 1) || (msg[i].flags & I2C_M_STOP))) {
+				cmd->CmdConfig &= ~CMDCONFIG_RESTART_MASK;
 				cmd->CmdConfig |= CMDCONFIG_STOP_MASK;
-
-			if ((j == 0) && !(msg[i].flags & I2C_M_NOSTART))
-				cmd->CmdConfig |= CMDCONFIG_RESTART_BIT;
+			}
 		}
 	}
 	mutex_lock(&adev->smu.mutex);
@@ -1974,22 +1962,20 @@ static int arcturus_i2c_xfer(struct i2c_adapter *i2c_adap,
 	if (r)
 		goto fail;
 
-	c = 0;
-	for (i = 0; i < num_done; i++) {
-		struct i2c_msg *msg = &msgs[i];
-
-		for (j = 0; j < msg->len; j++) {
-			SwI2cCmd_t *cmd = &res->SwI2cCmds[c++];
+	for (c = i = 0; i < num_msgs; i++) {
+		if (!(msg[i].flags & I2C_M_RD)) {
+			c += msg[i].len;
+			continue;
+		}
+		for (j = 0; j < msg[i].len; j++, c++) {
+			SwI2cCmd_t *cmd = &res->SwI2cCmds[c];
 
-			if (msg[i].flags & I2C_M_RD)
-				msg->buf[j] = cmd->Data;
+			msg[i].buf[j] = cmd->Data;
 		}
 	}
-	r = num_done;
-
+	r = num_msgs;
 fail:
 	kfree(req);
-
 	return r;
 }
 
diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
index 1b8cd3746d0ebc..2acf54967c6ab1 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
@@ -2702,31 +2702,14 @@ static ssize_t navi10_get_legacy_gpu_metrics(struct smu_context *smu,
 }
 
 static int navi10_i2c_xfer(struct i2c_adapter *i2c_adap,
-			   struct i2c_msg *msgs, int num)
+			   struct i2c_msg *msg, int num_msgs)
 {
 	struct amdgpu_device *adev = to_amdgpu_device(i2c_adap);
 	struct smu_table_context *smu_table = &adev->smu.smu_table;
 	struct smu_table *table = &smu_table->driver_table;
 	SwI2cRequest_t *req, *res = (SwI2cRequest_t *)table->cpu_addr;
-	short available_bytes = MAX_SW_I2C_COMMANDS;
-	int i, j, r, c, num_done = 0;
-	u8 slave;
-
-	/* only support a single slave addr per transaction */
-	slave = msgs[0].addr;
-	for (i = 0; i < num; i++) {
-		if (slave != msgs[i].addr)
-			return -EINVAL;
-
-		available_bytes -= msgs[i].len;
-		if (available_bytes >= 0) {
-			num_done++;
-		} else {
-			/* This message and all the follwing won't be processed */
-			available_bytes += msgs[i].len;
-			break;
-		}
-	}
+	int i, j, r, c;
+	u16 dir;
 
 	req = kzalloc(sizeof(*req), GFP_KERNEL);
 	if (!req)
@@ -2734,33 +2717,38 @@ static int navi10_i2c_xfer(struct i2c_adapter *i2c_adap,
 
 	req->I2CcontrollerPort = 1;
 	req->I2CSpeed = I2C_SPEED_FAST_400K;
-	req->SlaveAddress = slave << 1; /* 8 bit addresses */
-	req->NumCmds = MAX_SW_I2C_COMMANDS - available_bytes;;
+	req->SlaveAddress = msg[0].addr << 1; /* wants an 8-bit address */
+	dir = msg[0].flags & I2C_M_RD;
 
-	c = 0;
-	for (i = 0; i < num_done; i++) {
-		struct i2c_msg *msg = &msgs[i];
-
-		for (j = 0; j < msg->len; j++) {
-			SwI2cCmd_t *cmd = &req->SwI2cCmds[c++];
+	for (c = i = 0; i < num_msgs; i++) {
+		for (j = 0; j < msg[i].len; j++, c++) {
+			SwI2cCmd_t *cmd = &req->SwI2cCmds[c];
 
 			if (!(msg[i].flags & I2C_M_RD)) {
 				/* write */
-				cmd->CmdConfig |= I2C_CMD_WRITE;
-				cmd->RegisterAddr = msg->buf[j];
+				cmd->Cmd = I2C_CMD_WRITE;
+				cmd->RegisterAddr = msg[i].buf[j];
+			}
+
+			if ((dir ^ msg[i].flags) & I2C_M_RD) {
+				/* The direction changes.
+				 */
+				dir = msg[i].flags & I2C_M_RD;
+				cmd->CmdConfig |= CMDCONFIG_RESTART_MASK;
 			}
 
+			req->NumCmds++;
+
 			/*
 			 * Insert STOP if we are at the last byte of either last
 			 * message for the transaction or the client explicitly
 			 * requires a STOP at this particular message.
 			 */
-			if ((j == msg->len -1 ) &&
-			    ((i == num_done - 1) || (msg[i].flags & I2C_M_STOP)))
+			if ((j == msg[i].len - 1) &&
+			    ((i == num_msgs - 1) || (msg[i].flags & I2C_M_STOP))) {
+				cmd->CmdConfig &= ~CMDCONFIG_RESTART_MASK;
 				cmd->CmdConfig |= CMDCONFIG_STOP_MASK;
-
-			if ((j == 0) && !(msg[i].flags & I2C_M_NOSTART))
-				cmd->CmdConfig |= CMDCONFIG_RESTART_BIT;
+			}
 		}
 	}
 	mutex_lock(&adev->smu.mutex);
@@ -2769,22 +2757,20 @@ static int navi10_i2c_xfer(struct i2c_adapter *i2c_adap,
 	if (r)
 		goto fail;
 
-	c = 0;
-	for (i = 0; i < num_done; i++) {
-		struct i2c_msg *msg = &msgs[i];
-
-		for (j = 0; j < msg->len; j++) {
-			SwI2cCmd_t *cmd = &res->SwI2cCmds[c++];
+	for (c = i = 0; i < num_msgs; i++) {
+		if (!(msg[i].flags & I2C_M_RD)) {
+			c += msg[i].len;
+			continue;
+		}
+		for (j = 0; j < msg[i].len; j++, c++) {
+			SwI2cCmd_t *cmd = &res->SwI2cCmds[c];
 
-			if (msg[i].flags & I2C_M_RD)
-				msg->buf[j] = cmd->Data;
+			msg[i].buf[j] = cmd->Data;
 		}
 	}
-	r = num_done;
-
+	r = num_msgs;
 fail:
 	kfree(req);
-
 	return r;
 }
 
diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
index b38127f8009d3d..44ca3b3f83f4d9 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
@@ -3390,31 +3390,14 @@ static void sienna_cichlid_dump_pptable(struct smu_context *smu)
 }
 
 static int sienna_cichlid_i2c_xfer(struct i2c_adapter *i2c_adap,
-				   struct i2c_msg *msgs, int num)
+				   struct i2c_msg *msg, int num_msgs)
 {
 	struct amdgpu_device *adev = to_amdgpu_device(i2c_adap);
 	struct smu_table_context *smu_table = &adev->smu.smu_table;
 	struct smu_table *table = &smu_table->driver_table;
 	SwI2cRequest_t *req, *res = (SwI2cRequest_t *)table->cpu_addr;
-	short available_bytes = MAX_SW_I2C_COMMANDS;
-	int i, j, r, c, num_done = 0;
-	u8 slave;
-
-	/* only support a single slave addr per transaction */
-	slave = msgs[0].addr;
-	for (i = 0; i < num; i++) {
-		if (slave != msgs[i].addr)
-			return -EINVAL;
-
-		available_bytes -= msgs[i].len;
-		if (available_bytes >= 0) {
-			num_done++;
-		} else {
-			/* This message and all the follwing won't be processed */
-			available_bytes += msgs[i].len;
-			break;
-		}
-	}
+	int i, j, r, c;
+	u16 dir;
 
 	req = kzalloc(sizeof(*req), GFP_KERNEL);
 	if (!req)
@@ -3422,33 +3405,38 @@ static int sienna_cichlid_i2c_xfer(struct i2c_adapter *i2c_adap,
 
 	req->I2CcontrollerPort = 1;
 	req->I2CSpeed = I2C_SPEED_FAST_400K;
-	req->SlaveAddress = slave << 1; /* 8 bit addresses */
-	req->NumCmds = MAX_SW_I2C_COMMANDS - available_bytes;;
+	req->SlaveAddress = msg[0].addr << 1; /* wants an 8-bit address */
+	dir = msg[0].flags & I2C_M_RD;
 
-	c = 0;
-	for (i = 0; i < num_done; i++) {
-		struct i2c_msg *msg = &msgs[i];
-
-		for (j = 0; j < msg->len; j++) {
-			SwI2cCmd_t *cmd = &req->SwI2cCmds[c++];
+	for (c = i = 0; i < num_msgs; i++) {
+		for (j = 0; j < msg[i].len; j++, c++) {
+			SwI2cCmd_t *cmd = &req->SwI2cCmds[c];
 
 			if (!(msg[i].flags & I2C_M_RD)) {
 				/* write */
 				cmd->CmdConfig |= CMDCONFIG_READWRITE_MASK;
-				cmd->ReadWriteData = msg->buf[j];
+				cmd->ReadWriteData = msg[i].buf[j];
+			}
+
+			if ((dir ^ msg[i].flags) & I2C_M_RD) {
+				/* The direction changes.
+				 */
+				dir = msg[i].flags & I2C_M_RD;
+				cmd->CmdConfig |= CMDCONFIG_RESTART_MASK;
 			}
 
+			req->NumCmds++;
+
 			/*
 			 * Insert STOP if we are at the last byte of either last
 			 * message for the transaction or the client explicitly
 			 * requires a STOP at this particular message.
 			 */
-			if ((j == msg->len -1 ) &&
-			    ((i == num_done - 1) || (msg[i].flags & I2C_M_STOP)))
+			if ((j == msg[i].len - 1) &&
+			    ((i == num_msgs - 1) || (msg[i].flags & I2C_M_STOP))) {
+				cmd->CmdConfig &= ~CMDCONFIG_RESTART_MASK;
 				cmd->CmdConfig |= CMDCONFIG_STOP_MASK;
-
-			if ((j == 0) && !(msg[i].flags & I2C_M_NOSTART))
-				cmd->CmdConfig |= CMDCONFIG_RESTART_BIT;
+			}
 		}
 	}
 	mutex_lock(&adev->smu.mutex);
@@ -3457,22 +3445,20 @@ static int sienna_cichlid_i2c_xfer(struct i2c_adapter *i2c_adap,
 	if (r)
 		goto fail;
 
-	c = 0;
-	for (i = 0; i < num_done; i++) {
-		struct i2c_msg *msg = &msgs[i];
-
-		for (j = 0; j < msg->len; j++) {
-			SwI2cCmd_t *cmd = &res->SwI2cCmds[c++];
+	for (c = i = 0; i < num_msgs; i++) {
+		if (!(msg[i].flags & I2C_M_RD)) {
+			c += msg[i].len;
+			continue;
+		}
+		for (j = 0; j < msg[i].len; j++, c++) {
+			SwI2cCmd_t *cmd = &res->SwI2cCmds[c];
 
-			if (msg[i].flags & I2C_M_RD)
-				msg->buf[j] = cmd->ReadWriteData;
+			msg[i].buf[j] = cmd->ReadWriteData;
 		}
 	}
-	r = num_done;
-
+	r = num_msgs;
 fail:
 	kfree(req);
-
 	return r;
 }
 
-- 
2.32.0


* [PATCH 31/40] drm/amdgpu: Fix width of I2C address
  2021-06-08 21:39 [PATCH 00/40] I2C fixes Luben Tuikov
                   ` (29 preceding siblings ...)
  2021-06-08 21:39 ` [PATCH 30/40] drm/amd/pm: Simplify managed I2C transfer functions Luben Tuikov
@ 2021-06-08 21:39 ` Luben Tuikov
  2021-06-10 21:09   ` Alex Deucher
  2021-06-08 21:39 ` [PATCH 32/40] drm/amdgpu: Return result fix in RAS Luben Tuikov
                   ` (8 subsequent siblings)
  39 siblings, 1 reply; 74+ messages in thread
From: Luben Tuikov @ 2021-06-08 21:39 UTC (permalink / raw)
  To: amd-gfx
  Cc: Andrey Grodzovsky, Lijo Lazar, Luben Tuikov, Stanley Yang,
	Alexander Deucher, Jean Delvare, Hawking Zhang

The I2C address is kept as a 16-bit quantity in
the kernel. The IC_TAR::IC_TAR field is 10 bits
wide.

Fix the width of the I2C address for Vega20 from 8
bits to 16 bits to accommodate the full spectrum
of I2C address space.
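
For illustration only, and not part of this patch: a small
user-space sketch of why the mask width matters. The helper names
(demo_tar_8bit, demo_tar_10bit, demo_truncated) are hypothetical;
only the two mask values come from the change itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Old behaviour: only the low 8 bits of the address survive. */
static uint32_t demo_tar_8bit(uint16_t address)
{
	return address & 0xFF;
}

/* New behaviour: keep the 10 bits the target field can hold. */
static uint32_t demo_tar_10bit(uint16_t address)
{
	return address & 0x3FF;
}

/* True when the 8-bit mask would have altered this address. */
static bool demo_truncated(uint16_t address)
{
	assert(address <= 0x3FF); /* sketch only covers 10-bit input */
	return demo_tar_8bit(address) != demo_tar_10bit(address);
}
```

With the 8-bit mask, a 10-bit address such as 0x150 would be
programmed as 0x50, which is a different device address.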

Cc: Jean Delvare <jdelvare@suse.de>
Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Cc: Lijo Lazar <Lijo.Lazar@amd.com>
Cc: Stanley Yang <Stanley.Yang@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/smu_v11_0_i2c.c | 19 +++++++++++--------
 1 file changed, 11 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/smu_v11_0_i2c.c b/drivers/gpu/drm/amd/amdgpu/smu_v11_0_i2c.c
index e403ba556e5590..65035256756679 100644
--- a/drivers/gpu/drm/amd/amdgpu/smu_v11_0_i2c.c
+++ b/drivers/gpu/drm/amd/amdgpu/smu_v11_0_i2c.c
@@ -111,12 +111,15 @@ static void smu_v11_0_i2c_set_clock(struct i2c_adapter *control)
 	WREG32_SOC15(SMUIO, 0, mmCKSVII2C_IC_SDA_HOLD, 20);
 }
 
-static void smu_v11_0_i2c_set_address(struct i2c_adapter *control, uint8_t address)
+static void smu_v11_0_i2c_set_address(struct i2c_adapter *control, u16 address)
 {
 	struct amdgpu_device *adev = to_amdgpu_device(control);
 
-	/* We take 7-bit addresses raw */
-	WREG32_SOC15(SMUIO, 0, mmCKSVII2C_IC_TAR, (address & 0xFF));
+	/* The IC_TAR::IC_TAR field is 10 bits wide.
+	 * It takes a 7-bit or 10-bit address as an address,
+	 * i.e. no read/write bit--no wire format, just the address.
+	 */
+	WREG32_SOC15(SMUIO, 0, mmCKSVII2C_IC_TAR, address & 0x3FF);
 }
 
 static uint32_t smu_v11_0_i2c_poll_tx_status(struct i2c_adapter *control)
@@ -215,8 +218,8 @@ static uint32_t smu_v11_0_i2c_poll_rx_status(struct i2c_adapter *control)
  * Returns 0 on success or error.
  */
 static uint32_t smu_v11_0_i2c_transmit(struct i2c_adapter *control,
-				  uint8_t address, uint8_t *data,
-				  uint32_t numbytes, uint32_t i2c_flag)
+				       u16 address, u8 *data,
+				       u32 numbytes, u32 i2c_flag)
 {
 	struct amdgpu_device *adev = to_amdgpu_device(control);
 	uint32_t bytes_sent, reg, ret = 0;
@@ -225,7 +228,7 @@ static uint32_t smu_v11_0_i2c_transmit(struct i2c_adapter *control,
 	bytes_sent = 0;
 
 	DRM_DEBUG_DRIVER("I2C_Transmit(), address = %x, bytes = %d , data: ",
-		 (uint16_t)address, numbytes);
+			 address, numbytes);
 
 	if (drm_debug_enabled(DRM_UT_DRIVER)) {
 		print_hex_dump(KERN_INFO, "data: ", DUMP_PREFIX_NONE,
@@ -318,8 +321,8 @@ static uint32_t smu_v11_0_i2c_transmit(struct i2c_adapter *control,
  * Returns 0 on success or error.
  */
 static uint32_t smu_v11_0_i2c_receive(struct i2c_adapter *control,
-				 uint8_t address, uint8_t *data,
-				 uint32_t numbytes, uint8_t i2c_flag)
+				      u16 address, u8 *data,
+				      u32 numbytes, u32 i2c_flag)
 {
 	struct amdgpu_device *adev = to_amdgpu_device(control);
 	uint32_t bytes_received, ret = I2C_OK;
-- 
2.32.0


* [PATCH 32/40] drm/amdgpu: Return result fix in RAS
  2021-06-08 21:39 [PATCH 00/40] I2C fixes Luben Tuikov
                   ` (30 preceding siblings ...)
  2021-06-08 21:39 ` [PATCH 31/40] drm/amdgpu: Fix width of I2C address Luben Tuikov
@ 2021-06-08 21:39 ` Luben Tuikov
  2021-06-10 21:11   ` Alex Deucher
  2021-06-08 21:39 ` [PATCH 33/40] drm/amd/pm: Fix a bug in i2c_xfer Luben Tuikov
                   ` (7 subsequent siblings)
  39 siblings, 1 reply; 74+ messages in thread
From: Luben Tuikov @ 2021-06-08 21:39 UTC (permalink / raw)
  To: amd-gfx
  Cc: Andrey Grodzovsky, Lijo Lazar, Luben Tuikov, Stanley Yang,
	Alexander Deucher, Jean Delvare, Hawking Zhang

The low-level EEPROM write method doesn't return
1, but the number of bytes written. Thus, do not
compare to 1; instead, compare to greater than 0
for success.

Other cleanup: if the lower layers returned
-errno, then return that, as opposed to
overwriting the error code with a
one-size-fits-all -EINVAL. For instance, some
return -EAGAIN.
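
For illustration only, and not part of this patch: a user-space
sketch of the return-value convention described above. The
functions demo_low_write() and demo_reset_table() are hypothetical
and only model "bytes written on success, negative errno on
failure"; the size limit that triggers -EAGAIN is made up.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical low-level write: returns bytes written or a negative errno. */
static int demo_low_write(const unsigned char *buf, int len)
{
	assert(buf != NULL);
	if (len <= 0)
		return -EINVAL;
	if (len > 8)
		return -EAGAIN; /* pretend the bus is busy for large writes */
	return len;
}

/* Caller: success is "> 0", and a negative errno is passed through. */
static int demo_reset_table(const unsigned char *hdr, int len)
{
	int ret = demo_low_write(hdr, len);

	if (ret > 0)
		return 0;
	return ret;
}
```

A caller testing "ret == 1" would treat a successful 4-byte write
as a failure, and one returning a fixed error code would hide
whether the lower layer reported -EAGAIN or -EINVAL.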

Cc: Jean Delvare <jdelvare@suse.de>
Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Cc: Lijo Lazar <Lijo.Lazar@amd.com>
Cc: Stanley Yang <Stanley.Yang@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.c    |  3 +--
 drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c       | 22 +++++++++++--------
 .../gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c    |  2 +-
 drivers/gpu/drm/amd/amdgpu/smu_v11_0_i2c.c    |  3 +--
 4 files changed, 16 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.c
index a5a87affedabf1..a4815af111ed12 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.c
@@ -105,8 +105,7 @@ static int __amdgpu_eeprom_xfer(struct i2c_adapter *i2c_adap, u32 eeprom_addr,
 	int r;
 	u16 len;
 
-	r = 0;
-	for ( ; buf_size > 0;
+	for (r = 0; buf_size > 0;
 	      buf_size -= len, eeprom_addr += len, eeprom_buf += len) {
 		/* Set the EEPROM address we want to write to/read from.
 		 */
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
index e3ad081eddd40b..66c96c65e7eeb9 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
@@ -355,8 +355,9 @@ static int amdgpu_ras_debugfs_ctrl_parse_data(struct file *f,
  *	to see which blocks support RAS on a particular asic.
  *
  */
-static ssize_t amdgpu_ras_debugfs_ctrl_write(struct file *f, const char __user *buf,
-		size_t size, loff_t *pos)
+static ssize_t amdgpu_ras_debugfs_ctrl_write(struct file *f,
+					     const char __user *buf,
+					     size_t size, loff_t *pos)
 {
 	struct amdgpu_device *adev = (struct amdgpu_device *)file_inode(f)->i_private;
 	struct ras_debug_if data;
@@ -370,7 +371,7 @@ static ssize_t amdgpu_ras_debugfs_ctrl_write(struct file *f, const char __user *
 
 	ret = amdgpu_ras_debugfs_ctrl_parse_data(f, buf, size, pos, &data);
 	if (ret)
-		return -EINVAL;
+		return ret;
 
 	if (data.op == 3) {
 		ret = amdgpu_reserve_page_direct(adev, data.inject.address);
@@ -439,21 +440,24 @@ static ssize_t amdgpu_ras_debugfs_ctrl_write(struct file *f, const char __user *
  * will reset EEPROM table to 0 entries.
  *
  */
-static ssize_t amdgpu_ras_debugfs_eeprom_write(struct file *f, const char __user *buf,
-		size_t size, loff_t *pos)
+static ssize_t amdgpu_ras_debugfs_eeprom_write(struct file *f,
+					       const char __user *buf,
+					       size_t size, loff_t *pos)
 {
 	struct amdgpu_device *adev =
 		(struct amdgpu_device *)file_inode(f)->i_private;
 	int ret;
 
 	ret = amdgpu_ras_eeprom_reset_table(
-			&(amdgpu_ras_get_context(adev)->eeprom_control));
+		&(amdgpu_ras_get_context(adev)->eeprom_control));
 
-	if (ret == 1) {
+	if (ret > 0) {
+		/* Something was written to EEPROM.
+		 */
 		amdgpu_ras_get_context(adev)->flags = RAS_DEFAULT_FLAGS;
 		return size;
 	} else {
-		return -EIO;
+		return ret;
 	}
 }
 
@@ -1991,7 +1995,7 @@ int amdgpu_ras_recovery_init(struct amdgpu_device *adev)
 	kfree(*data);
 	con->eh_data = NULL;
 out:
-	dev_warn(adev->dev, "Failed to initialize ras recovery!\n");
+	dev_warn(adev->dev, "Failed to initialize ras recovery! (%d)\n", ret);
 
 	/*
 	 * Except error threshold exceeding case, other failure cases in this
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
index 17cea35275e46c..dc48c556398039 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
@@ -335,7 +335,7 @@ int amdgpu_ras_eeprom_init(struct amdgpu_ras_eeprom_control *control,
 		ret = amdgpu_ras_eeprom_reset_table(control);
 	}
 
-	return ret == 1 ? 0 : -EIO;
+	return ret > 0 ? 0 : -EIO;
 }
 
 static void __encode_table_record_to_buff(struct amdgpu_ras_eeprom_control *control,
diff --git a/drivers/gpu/drm/amd/amdgpu/smu_v11_0_i2c.c b/drivers/gpu/drm/amd/amdgpu/smu_v11_0_i2c.c
index 65035256756679..7f48ee020bc03e 100644
--- a/drivers/gpu/drm/amd/amdgpu/smu_v11_0_i2c.c
+++ b/drivers/gpu/drm/amd/amdgpu/smu_v11_0_i2c.c
@@ -222,7 +222,7 @@ static uint32_t smu_v11_0_i2c_transmit(struct i2c_adapter *control,
 				       u32 numbytes, u32 i2c_flag)
 {
 	struct amdgpu_device *adev = to_amdgpu_device(control);
-	uint32_t bytes_sent, reg, ret = 0;
+	u32 bytes_sent, reg, ret = I2C_OK;
 	unsigned long  timeout_counter;
 
 	bytes_sent = 0;
@@ -290,7 +290,6 @@ static uint32_t smu_v11_0_i2c_transmit(struct i2c_adapter *control,
 	}
 
 	ret = smu_v11_0_i2c_poll_tx_status(control);
-
 Err:
 	/* Any error, no point in proceeding */
 	if (ret != I2C_OK) {
-- 
2.32.0

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


* [PATCH 33/40] drm/amd/pm: Fix a bug in i2c_xfer
  2021-06-08 21:39 [PATCH 00/40] I2C fixes Luben Tuikov
                   ` (31 preceding siblings ...)
  2021-06-08 21:39 ` [PATCH 32/40] drm/amdgpu: Return result fix in RAS Luben Tuikov
@ 2021-06-08 21:39 ` Luben Tuikov
  2021-06-10 21:12   ` Alex Deucher
  2021-06-08 21:39 ` [PATCH 34/40] drm/amdgpu: Fix amdgpu_ras_eeprom_init() Luben Tuikov
                   ` (6 subsequent siblings)
  39 siblings, 1 reply; 74+ messages in thread
From: Luben Tuikov @ 2021-06-08 21:39 UTC (permalink / raw)
  To: amd-gfx; +Cc: Alex Deucher, Luben Tuikov

"req" is now a pointer, i.e. it is no longer
allocated on the stack, thus taking its address
and passing that is a bug.

This commit fixes this bug.
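
To illustrate the class of bug, here is a minimal user-space sketch.
struct demo_req, demo_update_table() and demo_xfer() are hypothetical
names used only for this example; they are not the SMU interfaces.
Once the request is heap-allocated, "&req" is the address of the
local pointer variable, not of the request data, so a consumer that
copies from it would read the wrong bytes.

```c
#include <stdlib.h>
#include <string.h>

struct demo_req {
	unsigned char bytes[8];
};

/* Hypothetical consumer: copies one request's worth of bytes from
 * the address it is given.
 */
static void demo_update_table(unsigned char *dst, const void *data)
{
	memcpy(dst, data, sizeof(struct demo_req));
}

static int demo_xfer(unsigned char *dst)
{
	struct demo_req *req = calloc(1, sizeof(*req));

	if (!req)
		return -1;

	memset(req->bytes, 0xAB, sizeof(req->bytes));

	/* Correct: pass the pointer itself. Passing &req would hand over
	 * the address of the local pointer variable instead of the
	 * request it points to.
	 */
	demo_update_table(dst, req);

	free(req);
	return 0;
}
```

Because the consumer takes a void pointer, the compiler accepts both
forms without a warning, which is how such a bug can go unnoticed.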

Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
---
 drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c       | 2 +-
 drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c         | 2 +-
 drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c
index 0db79a5236e1f1..7d9a2946806f58 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c
@@ -1957,7 +1957,7 @@ static int arcturus_i2c_xfer(struct i2c_adapter *i2c_adap,
 		}
 	}
 	mutex_lock(&adev->smu.mutex);
-	r = smu_cmn_update_table(&adev->smu, SMU_TABLE_I2C_COMMANDS, 0, &req, true);
+	r = smu_cmn_update_table(&adev->smu, SMU_TABLE_I2C_COMMANDS, 0, req, true);
 	mutex_unlock(&adev->smu.mutex);
 	if (r)
 		goto fail;
diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
index 2acf54967c6ab1..0568cbfb023459 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
@@ -2752,7 +2752,7 @@ static int navi10_i2c_xfer(struct i2c_adapter *i2c_adap,
 		}
 	}
 	mutex_lock(&adev->smu.mutex);
-	r = smu_cmn_update_table(&adev->smu, SMU_TABLE_I2C_COMMANDS, 0, &req, true);
+	r = smu_cmn_update_table(&adev->smu, SMU_TABLE_I2C_COMMANDS, 0, req, true);
 	mutex_unlock(&adev->smu.mutex);
 	if (r)
 		goto fail;
diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
index 44ca3b3f83f4d9..091b3339faadb9 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
@@ -3440,7 +3440,7 @@ static int sienna_cichlid_i2c_xfer(struct i2c_adapter *i2c_adap,
 		}
 	}
 	mutex_lock(&adev->smu.mutex);
-	r = smu_cmn_update_table(&adev->smu, SMU_TABLE_I2C_COMMANDS, 0, &req, true);
+	r = smu_cmn_update_table(&adev->smu, SMU_TABLE_I2C_COMMANDS, 0, req, true);
 	mutex_unlock(&adev->smu.mutex);
 	if (r)
 		goto fail;
-- 
2.32.0


* [PATCH 34/40] drm/amdgpu: Fix amdgpu_ras_eeprom_init()
  2021-06-08 21:39 [PATCH 00/40] I2C fixes Luben Tuikov
                   ` (32 preceding siblings ...)
  2021-06-08 21:39 ` [PATCH 33/40] drm/amd/pm: Fix a bug in i2c_xfer Luben Tuikov
@ 2021-06-08 21:39 ` Luben Tuikov
  2021-06-10 21:12   ` Alex Deucher
  2021-06-08 21:39 ` [PATCH 35/40] drm/amdgpu: Simplify RAS EEPROM checksum calculations Luben Tuikov
                   ` (5 subsequent siblings)
  39 siblings, 1 reply; 74+ messages in thread
From: Luben Tuikov @ 2021-06-08 21:39 UTC (permalink / raw)
  To: amd-gfx; +Cc: Alexander Deucher, Andrey Grodzovsky, Luben Tuikov

No need to account for the 2 bytes of EEPROM
address--this is now well abstracted away by
the fixes to the lower layers.

Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
index dc48c556398039..7d0f9e1e62dc4f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
@@ -306,7 +306,7 @@ int amdgpu_ras_eeprom_init(struct amdgpu_ras_eeprom_control *control,
 		return ret;
 	}
 
-	__decode_table_header_from_buff(hdr, &buff[2]);
+	__decode_table_header_from_buff(hdr, buff);
 
 	if (hdr->header == RAS_TABLE_HDR_VAL) {
 		control->num_recs = (hdr->tbl_size - RAS_TABLE_HEADER_SIZE) /
-- 
2.32.0


* [PATCH 35/40] drm/amdgpu: Simplify RAS EEPROM checksum calculations
  2021-06-08 21:39 [PATCH 00/40] I2C fixes Luben Tuikov
                   ` (33 preceding siblings ...)
  2021-06-08 21:39 ` [PATCH 34/40] drm/amdgpu: Fix amdgpu_ras_eeprom_init() Luben Tuikov
@ 2021-06-08 21:39 ` Luben Tuikov
  2021-06-11 17:07   ` Alex Deucher
  2021-06-08 21:39 ` [PATCH 36/40] drm/amdgpu: Use explicit cardinality for clarity Luben Tuikov
                   ` (4 subsequent siblings)
  39 siblings, 1 reply; 74+ messages in thread
From: Luben Tuikov @ 2021-06-08 21:39 UTC (permalink / raw)
  To: amd-gfx; +Cc: Alexander Deucher, Andrey Grodzovsky, Luben Tuikov

Rename update_table_header() to
write_table_header() as this function is actually
writing it to EEPROM.

Use kernel types; use u8 to carry around the
checksum, in order to take advantage of 8-bit
arithmetic, which wraps modulo 256.

Tidy up to 80 columns.

When updating the checksum, just recalculate the
whole thing.
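
A minimal user-space sketch of the 8-bit checksum idea, using
uint8_t in place of the kernel's u8. The demo_* names are
hypothetical. The checksum is the two's complement of the byte sum,
so the sum of all bytes plus the checksum is 0 modulo 256.

```c
#include <stddef.h>
#include <stdint.h>

/* Sum of all bytes; the uint8_t accumulator wraps modulo 256. */
static uint8_t demo_byte_sum(const uint8_t *p, size_t n)
{
	uint8_t sum = 0;

	while (n--)
		sum += *p++;
	return sum;
}

/* Checksum is the value that brings the byte sum back to 0 mod 256. */
static uint8_t demo_checksum(const uint8_t *p, size_t n)
{
	return (uint8_t)(0u - demo_byte_sum(p, n));
}

/* Returns 1 if data plus checksum sum to 0 modulo 256, else 0. */
static int demo_verify(const uint8_t *p, size_t n, uint8_t checksum)
{
	return (uint8_t)(demo_byte_sum(p, n) + checksum) == 0;
}
```

One consequence of this form: when the byte sum is 0, the checksum
is 0 and fits in a byte, whereas "256 - (sum % 256)" would give 256.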

Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
---
 .../gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c    | 98 +++++++++----------
 .../gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h    |  2 +-
 2 files changed, 50 insertions(+), 50 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
index 7d0f9e1e62dc4f..54ef31594accd9 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
@@ -141,8 +141,8 @@ static void __decode_table_header_from_buff(struct amdgpu_ras_eeprom_table_heade
 	hdr->checksum	      = le32_to_cpu(pp[4]);
 }
 
-static int __update_table_header(struct amdgpu_ras_eeprom_control *control,
-				 unsigned char *buff)
+static int __write_table_header(struct amdgpu_ras_eeprom_control *control,
+				unsigned char *buff)
 {
 	int ret = 0;
 	struct amdgpu_device *adev = to_amdgpu_device(control);
@@ -162,69 +162,74 @@ static int __update_table_header(struct amdgpu_ras_eeprom_control *control,
 	return ret;
 }
 
-static uint32_t  __calc_hdr_byte_sum(struct amdgpu_ras_eeprom_control *control)
+static u8 __calc_hdr_byte_sum(const struct amdgpu_ras_eeprom_control *control)
 {
 	int i;
-	uint32_t tbl_sum = 0;
+	u8 hdr_sum = 0;
+	u8  *p;
+	size_t sz;
 
 	/* Header checksum, skip checksum field in the calculation */
-	for (i = 0; i < sizeof(control->tbl_hdr) - sizeof(control->tbl_hdr.checksum); i++)
-		tbl_sum += *(((unsigned char *)&control->tbl_hdr) + i);
+	sz = sizeof(control->tbl_hdr) - sizeof(control->tbl_hdr.checksum);
+	p = (u8 *) &control->tbl_hdr;
+	for (i = 0; i < sz; i++, p++)
+		hdr_sum += *p;
 
-	return tbl_sum;
+	return hdr_sum;
 }
 
-static uint32_t  __calc_recs_byte_sum(struct eeprom_table_record *records,
-				      int num)
+static u8 __calc_recs_byte_sum(const struct eeprom_table_record *record,
+			       const int num)
 {
 	int i, j;
-	uint32_t tbl_sum = 0;
+	u8  tbl_sum = 0;
+
+	if (!record)
+		return 0;
 
 	/* Records checksum */
 	for (i = 0; i < num; i++) {
-		struct eeprom_table_record *record = &records[i];
+		u8 *p = (u8 *) &record[i];
 
-		for (j = 0; j < sizeof(*record); j++) {
-			tbl_sum += *(((unsigned char *)record) + j);
-		}
+		for (j = 0; j < sizeof(*record); j++, p++)
+			tbl_sum += *p;
 	}
 
 	return tbl_sum;
 }
 
-static inline uint32_t  __calc_tbl_byte_sum(struct amdgpu_ras_eeprom_control *control,
-				  struct eeprom_table_record *records, int num)
+static inline u8
+__calc_tbl_byte_sum(struct amdgpu_ras_eeprom_control *control,
+		    struct eeprom_table_record *records, int num)
 {
-	return __calc_hdr_byte_sum(control) + __calc_recs_byte_sum(records, num);
+	return __calc_hdr_byte_sum(control) +
+		__calc_recs_byte_sum(records, num);
 }
 
-/* Checksum = 256 -((sum of all table entries) mod 256) */
 static void __update_tbl_checksum(struct amdgpu_ras_eeprom_control *control,
-				  struct eeprom_table_record *records, int num,
-				  uint32_t old_hdr_byte_sum)
+				  struct eeprom_table_record *records, int num)
 {
-	/*
-	 * This will update the table sum with new records.
-	 *
-	 * TODO: What happens when the EEPROM table is to be wrapped around
-	 * and old records from start will get overridden.
-	 */
-
-	/* need to recalculate updated header byte sum */
-	control->tbl_byte_sum -= old_hdr_byte_sum;
-	control->tbl_byte_sum += __calc_tbl_byte_sum(control, records, num);
+	u8 v;
 
-	control->tbl_hdr.checksum = 256 - (control->tbl_byte_sum % 256);
+	control->tbl_byte_sum = __calc_tbl_byte_sum(control, records, num);
+	/* Avoid 32-bit sign extension. */
+	v = -control->tbl_byte_sum;
+	control->tbl_hdr.checksum = v;
 }
 
-/* table sum mod 256 + checksum must equals 256 */
-static bool __validate_tbl_checksum(struct amdgpu_ras_eeprom_control *control,
-			    struct eeprom_table_record *records, int num)
+static bool __verify_tbl_checksum(struct amdgpu_ras_eeprom_control *control,
+				  struct eeprom_table_record *records,
+				  int num)
 {
+	u8 result;
+
 	control->tbl_byte_sum = __calc_tbl_byte_sum(control, records, num);
 
-	if (control->tbl_hdr.checksum + (control->tbl_byte_sum % 256) != 256) {
-		DRM_WARN("Checksum mismatch, checksum: %u ", control->tbl_hdr.checksum);
+	result = (u8)control->tbl_hdr.checksum + control->tbl_byte_sum;
+	if (result) {
+		DRM_WARN("RAS table checksum mismatch: stored:0x%02X wants:0x%02hhX",
+			 control->tbl_hdr.checksum,
+			 -control->tbl_byte_sum);
 		return false;
 	}
 
@@ -232,8 +237,8 @@ static bool __validate_tbl_checksum(struct amdgpu_ras_eeprom_control *control,
 }
 
 static int amdgpu_ras_eeprom_correct_header_tag(
-				struct amdgpu_ras_eeprom_control *control,
-				uint32_t header)
+	struct amdgpu_ras_eeprom_control *control,
+	uint32_t header)
 {
 	unsigned char buff[RAS_TABLE_HEADER_SIZE];
 	struct amdgpu_ras_eeprom_table_header *hdr = &control->tbl_hdr;
@@ -243,7 +248,7 @@ static int amdgpu_ras_eeprom_correct_header_tag(
 
 	mutex_lock(&control->tbl_mutex);
 	hdr->header = header;
-	ret = __update_table_header(control, buff);
+	ret = __write_table_header(control, buff);
 	mutex_unlock(&control->tbl_mutex);
 
 	return ret;
@@ -262,11 +267,9 @@ int amdgpu_ras_eeprom_reset_table(struct amdgpu_ras_eeprom_control *control)
 	hdr->first_rec_offset = RAS_RECORD_START;
 	hdr->tbl_size = RAS_TABLE_HEADER_SIZE;
 
-	control->tbl_byte_sum = 0;
-	__update_tbl_checksum(control, NULL, 0, 0);
+	__update_tbl_checksum(control, NULL, 0);
 	control->next_addr = RAS_RECORD_START;
-
-	ret = __update_table_header(control, buff);
+	ret = __write_table_header(control, buff);
 
 	mutex_unlock(&control->tbl_mutex);
 
@@ -521,8 +524,6 @@ static int amdgpu_ras_eeprom_xfer(struct amdgpu_ras_eeprom_control *control,
 	}
 
 	if (write) {
-		uint32_t old_hdr_byte_sum = __calc_hdr_byte_sum(control);
-
 		/*
 		 * Update table header with size and CRC and account for table
 		 * wrap around where the assumption is that we treat it as empty
@@ -537,10 +538,9 @@ static int amdgpu_ras_eeprom_xfer(struct amdgpu_ras_eeprom_control *control,
 			control->tbl_hdr.tbl_size = RAS_TABLE_HEADER_SIZE +
 			control->num_recs * RAS_TABLE_RECORD_SIZE;
 
-		__update_tbl_checksum(control, records, num, old_hdr_byte_sum);
-
-		__update_table_header(control, buffs);
-	} else if (!__validate_tbl_checksum(control, records, num)) {
+		__update_tbl_checksum(control, records, num);
+		__write_table_header(control, buffs);
+	} else if (!__verify_tbl_checksum(control, records, num)) {
 		DRM_WARN("EEPROM Table checksum mismatch!");
 		/* TODO Uncomment when EEPROM read/write is relliable */
 		/* ret = -EIO; */
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h
index fa9c509a8e2f2b..4906ed9fb8cdd3 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h
@@ -48,7 +48,7 @@ struct amdgpu_ras_eeprom_control {
 	uint32_t next_addr;
 	unsigned int num_recs;
 	struct mutex tbl_mutex;
-	uint32_t tbl_byte_sum;
+	u8 tbl_byte_sum;
 };
 
 /*
-- 
2.32.0


* [PATCH 36/40] drm/amdgpu: Use explicit cardinality for clarity
  2021-06-08 21:39 [PATCH 00/40] I2C fixes Luben Tuikov
                   ` (34 preceding siblings ...)
  2021-06-08 21:39 ` [PATCH 35/40] drm/amdgpu: Simplify RAS EEPROM checksum calculations Luben Tuikov
@ 2021-06-08 21:39 ` Luben Tuikov
  2021-06-10 21:17   ` Alex Deucher
  2021-06-08 21:39 ` [PATCH 37/40] drm/amdgpu: Optimizations to EEPROM RAS table I/O Luben Tuikov
                   ` (3 subsequent siblings)
  39 siblings, 1 reply; 74+ messages in thread
From: Luben Tuikov @ 2021-06-08 21:39 UTC (permalink / raw)
  To: amd-gfx
  Cc: Alexander Deucher, Luben Tuikov, John Clements, Guchun Chen,
	Hawking Zhang

RAS_MAX_RECORD_NUM may mean the maximum record
number, as in the maximum house number on your
street, or it may mean the maximum number of
records, as in the count of records, which is also
a number. To make clear whether the number is
ordinal (index) or cardinal (count), rename this
macro to RAS_MAX_RECORD_COUNT.

This makes it easy to understand what it refers
to when we compute quantities such as how many
records we have left in the table, especially
with so many other numbers, quantities and
numerical macros around.

Also rename the long
amdgpu_ras_eeprom_get_record_max_length() to the
more succinct and clear
amdgpu_ras_eeprom_max_record_count().

When computing the threshold, which also deals
with counts, i.e. "how many", use the cardinal
"max_eeprom_records_count" rather than the
quantitative "max_eeprom_records_len".

Simplify the logic here and there, as well.
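
The simplified threshold logic can be sketched in user space roughly
as follows. demo_threshold() and DEMO_BAD_PAGE_COVER are hypothetical
names for this example. The sketch compares in 64 bits, whereas the
patch takes the lower 32 bits of the quotient first, so behaviour
could differ for extremely large VRAM sizes.

```c
#include <stdint.h>

/* One bad page per 100 MiB of VRAM, as in the comment in the patch. */
#define DEMO_BAD_PAGE_COVER	(100ULL * 1024 * 1024)

/* param < 0 means "auto": derive the count from the VRAM size.
 * Either way, the result never exceeds max_count.
 */
static uint32_t demo_threshold(int param, uint64_t vram_bytes,
			       uint32_t max_count)
{
	if (param < 0) {
		uint64_t auto_count = vram_bytes / DEMO_BAD_PAGE_COVER;

		return auto_count < max_count ? (uint32_t)auto_count
					      : max_count;
	}

	return (uint32_t)param < max_count ? (uint32_t)param : max_count;
}
```

Here both the derived value and the user-supplied value are counts
of records, which is why the "count" naming reads more naturally
than "length".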

Cc: Guchun Chen <guchun.chen@amd.com>
Cc: John Clements <john.clements@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c       |  9 ++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c       | 50 ++++++++-----------
 .../gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c    |  8 +--
 .../gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h    |  2 +-
 4 files changed, 30 insertions(+), 39 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 3de1accb060e37..0203f654576bcc 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -853,11 +853,10 @@ MODULE_PARM_DESC(reset_method, "GPU reset method (-1 = auto (default), 0 = legac
 module_param_named(reset_method, amdgpu_reset_method, int, 0444);
 
 /**
- * DOC: bad_page_threshold (int)
- * Bad page threshold is to specify the threshold value of faulty pages
- * detected by RAS ECC, that may result in GPU entering bad status if total
- * faulty pages by ECC exceed threshold value and leave it for user's further
- * check.
+ * DOC: bad_page_threshold (int) Bad page threshold is specifies the
+ * threshold value of faulty pages detected by RAS ECC, which may
+ * result in the GPU entering bad status when the number of total
+ * faulty pages by ECC exceeds the threshold value.
  */
 MODULE_PARM_DESC(bad_page_threshold, "Bad page threshold(-1 = auto(default value), 0 = disable bad page retirement)");
 module_param_named(bad_page_threshold, amdgpu_bad_page_threshold, int, 0444);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
index 66c96c65e7eeb9..95ab400b641af0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
@@ -71,8 +71,8 @@ const char *ras_block_string[] = {
 /* inject address is 52 bits */
 #define	RAS_UMC_INJECT_ADDR_LIMIT	(0x1ULL << 52)
 
-/* typical ECC bad page rate(1 bad page per 100MB VRAM) */
-#define RAS_BAD_PAGE_RATE		(100 * 1024 * 1024ULL)
+/* typical ECC bad page rate is 1 bad page per 100MB VRAM */
+#define RAS_BAD_PAGE_COVER              (100 * 1024 * 1024ULL)
 
 enum amdgpu_ras_retire_page_reservation {
 	AMDGPU_RAS_RETIRE_PAGE_RESERVED,
@@ -1841,27 +1841,24 @@ int amdgpu_ras_save_bad_pages(struct amdgpu_device *adev)
 static int amdgpu_ras_load_bad_pages(struct amdgpu_device *adev)
 {
 	struct amdgpu_ras_eeprom_control *control =
-					&adev->psp.ras.ras->eeprom_control;
-	struct eeprom_table_record *bps = NULL;
-	int ret = 0;
+		&adev->psp.ras.ras->eeprom_control;
+	struct eeprom_table_record *bps;
+	int ret;
 
 	/* no bad page record, skip eeprom access */
-	if (!control->num_recs || (amdgpu_bad_page_threshold == 0))
-		return ret;
+	if (control->num_recs == 0 || amdgpu_bad_page_threshold == 0)
+		return 0;
 
 	bps = kcalloc(control->num_recs, sizeof(*bps), GFP_KERNEL);
 	if (!bps)
 		return -ENOMEM;
 
-	if (amdgpu_ras_eeprom_read(control, bps, control->num_recs)) {
+	ret = amdgpu_ras_eeprom_read(control, bps, control->num_recs);
+	if (ret)
 		dev_err(adev->dev, "Failed to load EEPROM table records!");
-		ret = -EIO;
-		goto out;
-	}
-
-	ret = amdgpu_ras_add_bad_pages(adev, bps, control->num_recs);
+	else
+		ret = amdgpu_ras_add_bad_pages(adev, bps, control->num_recs);
 
-out:
 	kfree(bps);
 	return ret;
 }
@@ -1901,11 +1898,9 @@ static bool amdgpu_ras_check_bad_page(struct amdgpu_device *adev,
 }
 
 static void amdgpu_ras_validate_threshold(struct amdgpu_device *adev,
-					uint32_t max_length)
+					  uint32_t max_count)
 {
 	struct amdgpu_ras *con = amdgpu_ras_get_context(adev);
-	int tmp_threshold = amdgpu_bad_page_threshold;
-	u64 val;
 
 	/*
 	 * Justification of value bad_page_cnt_threshold in ras structure
@@ -1926,18 +1921,15 @@ static void amdgpu_ras_validate_threshold(struct amdgpu_device *adev,
 	 *      take no effect.
 	 */
 
-	if (tmp_threshold < -1)
-		tmp_threshold = -1;
-	else if (tmp_threshold > max_length)
-		tmp_threshold = max_length;
+	if (amdgpu_bad_page_threshold < 0) {
+		u64 val = adev->gmc.mc_vram_size;
 
-	if (tmp_threshold == -1) {
-		val = adev->gmc.mc_vram_size;
-		do_div(val, RAS_BAD_PAGE_RATE);
+		do_div(val, RAS_BAD_PAGE_COVER);
 		con->bad_page_cnt_threshold = min(lower_32_bits(val),
-						max_length);
+						  max_count);
 	} else {
-		con->bad_page_cnt_threshold = tmp_threshold;
+		con->bad_page_cnt_threshold = min_t(int, max_count,
+						    amdgpu_bad_page_threshold);
 	}
 }
 
@@ -1945,7 +1937,7 @@ int amdgpu_ras_recovery_init(struct amdgpu_device *adev)
 {
 	struct amdgpu_ras *con = amdgpu_ras_get_context(adev);
 	struct ras_err_handler_data **data;
-	uint32_t max_eeprom_records_len = 0;
+	u32  max_eeprom_records_count = 0;
 	bool exc_err_limit = false;
 	int ret;
 
@@ -1965,8 +1957,8 @@ int amdgpu_ras_recovery_init(struct amdgpu_device *adev)
 	atomic_set(&con->in_recovery, 0);
 	con->adev = adev;
 
-	max_eeprom_records_len = amdgpu_ras_eeprom_get_record_max_length();
-	amdgpu_ras_validate_threshold(adev, max_eeprom_records_len);
+	max_eeprom_records_count = amdgpu_ras_eeprom_max_record_count();
+	amdgpu_ras_validate_threshold(adev, max_eeprom_records_count);
 
 	/* Todo: During test the SMU might fail to read the eeprom through I2C
 	 * when the GPU is pending on XGMI reset during probe time
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
index 54ef31594accd9..21e1e59e4857ff 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
@@ -54,7 +54,7 @@
 #define RAS_TBL_SIZE_BYTES      (256 * 1024)
 #define RAS_HDR_START           0
 #define RAS_RECORD_START        (RAS_HDR_START + RAS_TABLE_HEADER_SIZE)
-#define RAS_MAX_RECORD_NUM      ((RAS_TBL_SIZE_BYTES - RAS_TABLE_HEADER_SIZE) \
+#define RAS_MAX_RECORD_COUNT    ((RAS_TBL_SIZE_BYTES - RAS_TABLE_HEADER_SIZE) \
 				 / RAS_TABLE_RECORD_SIZE)
 
 #define to_amdgpu_device(x) (container_of(x, struct amdgpu_ras, eeprom_control))->adev
@@ -532,7 +532,7 @@ static int amdgpu_ras_eeprom_xfer(struct amdgpu_ras_eeprom_control *control,
 		 * TODO - Check the assumption is correct
 		 */
 		control->num_recs += num;
-		control->num_recs %= RAS_MAX_RECORD_NUM;
+		control->num_recs %= RAS_MAX_RECORD_COUNT;
 		control->tbl_hdr.tbl_size += RAS_TABLE_RECORD_SIZE * num;
 		if (control->tbl_hdr.tbl_size > RAS_TBL_SIZE_BYTES)
 			control->tbl_hdr.tbl_size = RAS_TABLE_HEADER_SIZE +
@@ -568,9 +568,9 @@ int amdgpu_ras_eeprom_write(struct amdgpu_ras_eeprom_control *control,
 	return amdgpu_ras_eeprom_xfer(control, records, num, true);
 }
 
-inline uint32_t amdgpu_ras_eeprom_get_record_max_length(void)
+inline uint32_t amdgpu_ras_eeprom_max_record_count(void)
 {
-	return RAS_MAX_RECORD_NUM;
+	return RAS_MAX_RECORD_COUNT;
 }
 
 /* Used for testing if bugs encountered */
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h
index 4906ed9fb8cdd3..504729b8053759 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h
@@ -88,7 +88,7 @@ int amdgpu_ras_eeprom_read(struct amdgpu_ras_eeprom_control *control,
 int amdgpu_ras_eeprom_write(struct amdgpu_ras_eeprom_control *control,
 			    struct eeprom_table_record *records, const u32 num);
 
-inline uint32_t amdgpu_ras_eeprom_get_record_max_length(void);
+inline uint32_t amdgpu_ras_eeprom_max_record_count(void);
 
 void amdgpu_ras_eeprom_test(struct amdgpu_ras_eeprom_control *control);
 
-- 
2.32.0


* [PATCH 37/40] drm/amdgpu: Optimizations to EEPROM RAS table I/O
  2021-06-08 21:39 [PATCH 00/40] I2C fixes Luben Tuikov
                   ` (35 preceding siblings ...)
  2021-06-08 21:39 ` [PATCH 36/40] drm/amdgpu: Use explicit cardinality for clarity Luben Tuikov
@ 2021-06-08 21:39 ` Luben Tuikov
  2021-06-08 21:39 ` [PATCH 38/40] drm/amdgpu: RAS EEPROM table is now in debugfs Luben Tuikov
                   ` (2 subsequent siblings)
  39 siblings, 0 replies; 74+ messages in thread
From: Luben Tuikov @ 2021-06-08 21:39 UTC (permalink / raw)
  To: amd-gfx; +Cc: Alexander Deucher, Andrey Grodzovsky, Luben Tuikov

Read and write the table in one go, using a
separate stage to decode or encode the data, as
opposed to decoding/encoding on the fly, which
keeps the I2C bus busy. Use a single read/write
to transfer the table, or at most two if the range
of records we're reading/writing wraps around.

Check the checksum of a table in EEPROM on init.

When the table header signature is updated because
the threshold was increased on boot, also update
the checksum at that time.

Split functionality between read and write, which
simplifies the code and exposes areas of
optimization and complexity.
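
The "one transfer, or at most two on wrap-around" idea can be
sketched as below. This is a user-space illustration only; struct
demo_span and demo_split() are hypothetical and do not reflect the
driver's actual data structures. Given a start index, a record count
and the table capacity, it yields one contiguous span, or two when
the range crosses the end of the table.

```c
#include <stdint.h>

struct demo_span {
	uint32_t first;	/* index of the first record in the span */
	uint32_t count;	/* number of contiguous records */
};

/* Split an access of num records starting at index start, in a
 * circular table of max records, into contiguous spans.
 * Returns the number of spans filled in (1 or 2), or 0 on bad input.
 */
static int demo_split(uint32_t start, uint32_t num, uint32_t max,
		      struct demo_span out[2])
{
	uint32_t tail;

	if (max == 0 || num == 0 || num > max || start >= max)
		return 0;

	tail = max - start;	/* records available before the end */
	if (num <= tail) {
		out[0].first = start;
		out[0].count = num;
		return 1;
	}

	out[0].first = start;
	out[0].count = tail;
	out[1].first = 0;	/* wrap to the first record slot */
	out[1].count = num - tail;
	return 2;
}
```

Each span would then map to a single bus transfer, with encoding or
decoding of the records done on the in-memory buffer before or after
the transfer.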

Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.c    |  20 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.h    |  23 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c       |  20 +-
 .../gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c    | 903 +++++++++++-------
 .../gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h    |  53 +-
 5 files changed, 655 insertions(+), 364 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.c
index a4815af111ed12..4c3c65a5acae9b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.c
@@ -176,8 +176,8 @@ static int __amdgpu_eeprom_xfer(struct i2c_adapter *i2c_adap, u32 eeprom_addr,
  *
  * Returns the number of bytes read/written; -errno on error.
  */
-int amdgpu_eeprom_xfer(struct i2c_adapter *i2c_adap, u32 eeprom_addr,
-		       u8 *eeprom_buf, u16 buf_size, bool read)
+static int amdgpu_eeprom_xfer(struct i2c_adapter *i2c_adap, u32 eeprom_addr,
+			      u8 *eeprom_buf, u16 buf_size, bool read)
 {
 	const struct i2c_adapter_quirks *quirks = i2c_adap->quirks;
 	u16 limit;
@@ -221,3 +221,19 @@ int amdgpu_eeprom_xfer(struct i2c_adapter *i2c_adap, u32 eeprom_addr,
 		return res;
 	}
 }
+
+int amdgpu_eeprom_read(struct i2c_adapter *i2c_adap,
+		       u32 eeprom_addr, u8 *eeprom_buf,
+		       u16 bytes)
+{
+	return amdgpu_eeprom_xfer(i2c_adap, eeprom_addr, eeprom_buf, bytes,
+				  true);
+}
+
+int amdgpu_eeprom_write(struct i2c_adapter *i2c_adap,
+			u32 eeprom_addr, u8 *eeprom_buf,
+			u16 bytes)
+{
+	return amdgpu_eeprom_xfer(i2c_adap, eeprom_addr, eeprom_buf, bytes,
+				  false);
+}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.h
index 966b434f0de2b7..6935adb2be1f1c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.h
@@ -26,23 +26,12 @@
 
 #include <linux/i2c.h>
 
-int amdgpu_eeprom_xfer(struct i2c_adapter *i2c_adap, u32 eeprom_addr,
-		       u8 *eeprom_buf, u16 bytes, bool read);
+int amdgpu_eeprom_read(struct i2c_adapter *i2c_adap,
+		       u32 eeprom_addr, u8 *eeprom_buf,
+		       u16 bytes);
 
-static inline int amdgpu_eeprom_read(struct i2c_adapter *i2c_adap,
-				     u32 eeprom_addr, u8 *eeprom_buf,
-				     u16 bytes)
-{
-	return amdgpu_eeprom_xfer(i2c_adap, eeprom_addr, eeprom_buf, bytes,
-				  true);
-}
-
-static inline int amdgpu_eeprom_write(struct i2c_adapter *i2c_adap,
-				      u32 eeprom_addr, u8 *eeprom_buf,
-				      u16 bytes)
-{
-	return amdgpu_eeprom_xfer(i2c_adap, eeprom_addr, eeprom_buf, bytes,
-				  false);
-}
+int amdgpu_eeprom_write(struct i2c_adapter *i2c_adap,
+			u32 eeprom_addr, u8 *eeprom_buf,
+			u16 bytes);
 
 #endif
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
index 95ab400b641af0..1424f2cc2076c1 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
@@ -451,7 +451,7 @@ static ssize_t amdgpu_ras_debugfs_eeprom_write(struct file *f,
 	ret = amdgpu_ras_eeprom_reset_table(
 		&(amdgpu_ras_get_context(adev)->eeprom_control));
 
-	if (ret > 0) {
+	if (!ret) {
 		/* Something was written to EEPROM.
 		 */
 		amdgpu_ras_get_context(adev)->flags = RAS_DEFAULT_FLAGS;
@@ -1818,12 +1818,12 @@ int amdgpu_ras_save_bad_pages(struct amdgpu_device *adev)
 
 	control = &con->eeprom_control;
 	data = con->eh_data;
-	save_count = data->count - control->num_recs;
+	save_count = data->count - control->ras_num_recs;
 	/* only new entries are saved */
 	if (save_count > 0) {
-		if (amdgpu_ras_eeprom_write(control,
-					    &data->bps[control->num_recs],
-					    save_count)) {
+		if (amdgpu_ras_eeprom_append(control,
+					     &data->bps[control->ras_num_recs],
+					     save_count)) {
 			dev_err(adev->dev, "Failed to save EEPROM table data!");
 			return -EIO;
 		}
@@ -1846,18 +1846,18 @@ static int amdgpu_ras_load_bad_pages(struct amdgpu_device *adev)
 	int ret;
 
 	/* no bad page record, skip eeprom access */
-	if (control->num_recs == 0 || amdgpu_bad_page_threshold == 0)
+	if (control->ras_num_recs == 0 || amdgpu_bad_page_threshold == 0)
 		return 0;
 
-	bps = kcalloc(control->num_recs, sizeof(*bps), GFP_KERNEL);
+	bps = kcalloc(control->ras_num_recs, sizeof(*bps), GFP_KERNEL);
 	if (!bps)
 		return -ENOMEM;
 
-	ret = amdgpu_ras_eeprom_read(control, bps, control->num_recs);
+	ret = amdgpu_ras_eeprom_read(control, bps, control->ras_num_recs);
 	if (ret)
 		dev_err(adev->dev, "Failed to load EEPROM table records!");
 	else
-		ret = amdgpu_ras_add_bad_pages(adev, bps, control->num_recs);
+		ret = amdgpu_ras_add_bad_pages(adev, bps, control->ras_num_recs);
 
 	kfree(bps);
 	return ret;
@@ -1974,7 +1974,7 @@ int amdgpu_ras_recovery_init(struct amdgpu_device *adev)
 	if (exc_err_limit || ret)
 		goto free;
 
-	if (con->eeprom_control.num_recs) {
+	if (con->eeprom_control.ras_num_recs) {
 		ret = amdgpu_ras_load_bad_pages(adev);
 		if (ret)
 			goto free;
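Note on the read path: amdgpu_ras_eeprom_read(), called above with
control->ras_num_recs, treats the record area as a circular table that
starts at the first record index (ras_fri). A read is split into a run
from ras_fri towards the end of the table and, on wrap-around, a second
run from index 0. The sketch below mirrors that g0/g1 computation in
plain C; struct ring_split and split_read() are hypothetical names used
only for illustration, and the preconditions in the assert are
assumptions of this sketch:

```c
#include <assert.h>
#include <stdint.h>

struct ring_split {
	uint32_t g0;	/* records to read starting at index fri */
	uint32_t g1;	/* records to read starting at index 0 */
};

/* Hypothetical helper: fri is the first record index, num the number of
 * records to read, max the table capacity in records.
 */
static struct ring_split split_read(uint32_t fri, uint32_t num, uint32_t max)
{
	struct ring_split s;
	uint32_t last = fri + num - 1;	/* index of last record, pre-modulo */

	assert(max > 0 && fri < max && num >= 1 && num <= max);

	if (last < max) {
		/* No wrap-around: one contiguous read. */
		s.g0 = num;
		s.g1 = 0;
	} else {
		/* Wrap-around: read to the end, then from the start. */
		s.g0 = max - fri;
		s.g1 = last % max + 1;
	}
	return s;
}
```

In both branches g0 + g1 equals num, so the second read can land at
buf + g0 * RAS_TABLE_RECORD_SIZE as the patch does.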
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
index 21e1e59e4857ff..dc4a845a32404c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
@@ -52,11 +52,27 @@
 
 /* Assume 2-Mbit size EEPROM and take up the whole space. */
 #define RAS_TBL_SIZE_BYTES      (256 * 1024)
-#define RAS_HDR_START           0
+#define RAS_TABLE_START         0
+#define RAS_HDR_START           RAS_TABLE_START
 #define RAS_RECORD_START        (RAS_HDR_START + RAS_TABLE_HEADER_SIZE)
 #define RAS_MAX_RECORD_COUNT    ((RAS_TBL_SIZE_BYTES - RAS_TABLE_HEADER_SIZE) \
 				 / RAS_TABLE_RECORD_SIZE)
 
+/* Given a zero-based index of an EEPROM RAS record, yields the EEPROM
+ * offset off of RAS_TABLE_START.  That is, this is something you can
+ * add to control->i2c_address, and then tell I2C layer to read
+ * from/write to there. _N is the so called absolute index,
+ * because it starts right after the table header.
+ */
+#define RAS_INDEX_TO_OFFSET(_C, _N) ((_C)->ras_record_offset + \
+				     (_N) * RAS_TABLE_RECORD_SIZE)
+
+#define RAS_OFFSET_TO_INDEX(_C, _O) (((_O) - \
+				      (_C)->ras_record_offset) / RAS_TABLE_RECORD_SIZE)
+
+#define RAS_NUM_RECS(_tbl_hdr)  (((_tbl_hdr)->tbl_size - \
+				  RAS_TABLE_HEADER_SIZE) / RAS_TABLE_RECORD_SIZE)
+
 #define to_amdgpu_device(x) (container_of(x, struct amdgpu_ras, eeprom_control))->adev
 
 static bool __is_ras_eeprom_supported(struct amdgpu_device *adev)
@@ -117,10 +133,11 @@ static bool __get_eeprom_i2c_addr(struct amdgpu_device *adev,
 	return true;
 }
 
-static void __encode_table_header_to_buff(struct amdgpu_ras_eeprom_table_header *hdr,
-					  unsigned char *buff)
+static void
+__encode_table_header_to_buf(struct amdgpu_ras_eeprom_table_header *hdr,
+			     unsigned char *buf)
 {
-	uint32_t *pp = (uint32_t *) buff;
+	u32 *pp = (u32 *)buf;
 
 	pp[0] = cpu_to_le32(hdr->header);
 	pp[1] = cpu_to_le32(hdr->version);
@@ -129,10 +146,11 @@ static void __encode_table_header_to_buff(struct amdgpu_ras_eeprom_table_header
 	pp[4] = cpu_to_le32(hdr->checksum);
 }
 
-static void __decode_table_header_from_buff(struct amdgpu_ras_eeprom_table_header *hdr,
-					  unsigned char *buff)
+static void
+__decode_table_header_from_buf(struct amdgpu_ras_eeprom_table_header *hdr,
+			       unsigned char *buf)
 {
-	uint32_t *pp = (uint32_t *)buff;
+	u32 *pp = (u32 *)buf;
 
 	hdr->header	      = le32_to_cpu(pp[0]);
 	hdr->version	      = le32_to_cpu(pp[1]);
@@ -141,276 +159,166 @@ static void __decode_table_header_from_buff(struct amdgpu_ras_eeprom_table_heade
 	hdr->checksum	      = le32_to_cpu(pp[4]);
 }
 
-static int __write_table_header(struct amdgpu_ras_eeprom_control *control,
-				unsigned char *buff)
+static int __write_table_header(struct amdgpu_ras_eeprom_control *control)
 {
-	int ret = 0;
+	u8 buf[RAS_TABLE_HEADER_SIZE];
 	struct amdgpu_device *adev = to_amdgpu_device(control);
+	int res;
 
-	__encode_table_header_to_buff(&control->tbl_hdr, buff);
+	memset(buf, 0, sizeof(buf));
+	__encode_table_header_to_buf(&control->tbl_hdr, buf);
 
 	/* i2c may be unstable in gpu reset */
 	down_read(&adev->reset_sem);
-	ret = amdgpu_eeprom_write(&adev->pm.smu_i2c,
-				  control->i2c_address + RAS_HDR_START,
-				  buff, RAS_TABLE_HEADER_SIZE);
+	res = amdgpu_eeprom_write(&adev->pm.smu_i2c,
+				  control->i2c_address +
+				  control->ras_header_offset,
+				  buf, RAS_TABLE_HEADER_SIZE);
 	up_read(&adev->reset_sem);
 
-	if (ret < 1)
-		DRM_ERROR("Failed to write EEPROM table header, ret:%d", ret);
+	if (res < 0) {
+		DRM_ERROR("Failed to write EEPROM table header:%d", res);
+	} else if (res < RAS_TABLE_HEADER_SIZE) {
+		DRM_ERROR("Short write:%d out of %d\n",
+			  res, RAS_TABLE_HEADER_SIZE);
+		res = -EIO;
+	} else {
+		res = 0;
+	}
 
-	return ret;
+	return res;
 }
 
 static u8 __calc_hdr_byte_sum(const struct amdgpu_ras_eeprom_control *control)
 {
-	int i;
-	u8 hdr_sum = 0;
-	u8  *p;
+	int ii;
+	u8  *pp, csum;
 	size_t sz;
 
 	/* Header checksum, skip checksum field in the calculation */
 	sz = sizeof(control->tbl_hdr) - sizeof(control->tbl_hdr.checksum);
-	p = (u8 *) &control->tbl_hdr;
-	for (i = 0; i < sz; i++, p++)
-		hdr_sum += *p;
-
-	return hdr_sum;
-}
-
-static u8 __calc_recs_byte_sum(const struct eeprom_table_record *record,
-			       const int num)
-{
-	int i, j;
-	u8  tbl_sum = 0;
-
-	if (!record)
-		return 0;
-
-	/* Records checksum */
-	for (i = 0; i < num; i++) {
-		u8 *p = (u8 *) &record[i];
-
-		for (j = 0; j < sizeof(*record); j++, p++)
-			tbl_sum += *p;
-	}
-
-	return tbl_sum;
-}
-
-static inline u8
-__calc_tbl_byte_sum(struct amdgpu_ras_eeprom_control *control,
-		    struct eeprom_table_record *records, int num)
-{
-	return __calc_hdr_byte_sum(control) +
-		__calc_recs_byte_sum(records, num);
-}
-
-static void __update_tbl_checksum(struct amdgpu_ras_eeprom_control *control,
-				  struct eeprom_table_record *records, int num)
-{
-	u8 v;
-
-	control->tbl_byte_sum = __calc_tbl_byte_sum(control, records, num);
-	/* Avoid 32-bit sign extension. */
-	v = -control->tbl_byte_sum;
-	control->tbl_hdr.checksum = v;
-}
-
-static bool __verify_tbl_checksum(struct amdgpu_ras_eeprom_control *control,
-				  struct eeprom_table_record *records,
-				  int num)
-{
-	u8 result;
-
-	control->tbl_byte_sum = __calc_tbl_byte_sum(control, records, num);
+	pp = (u8 *) &control->tbl_hdr;
+	csum = 0;
+	for (ii = 0; ii < sz; ii++, pp++)
+		csum += *pp;
 
-	result = (u8)control->tbl_hdr.checksum + control->tbl_byte_sum;
-	if (result) {
-		DRM_WARN("RAS table checksum mismatch: stored:0x%02X wants:0x%02hhX",
-			 control->tbl_hdr.checksum,
-			 -control->tbl_byte_sum);
-		return false;
-	}
-
-	return true;
+	return csum;
 }
 
 static int amdgpu_ras_eeprom_correct_header_tag(
 	struct amdgpu_ras_eeprom_control *control,
 	uint32_t header)
 {
-	unsigned char buff[RAS_TABLE_HEADER_SIZE];
 	struct amdgpu_ras_eeprom_table_header *hdr = &control->tbl_hdr;
-	int ret = 0;
-
-	memset(buff, 0, RAS_TABLE_HEADER_SIZE);
-
-	mutex_lock(&control->tbl_mutex);
+	u8 *hh;
+	int res;
+	u8 csum;
+
+	csum = -hdr->checksum;
+
+	hh = (void *) &hdr->header;
+	csum -= (hh[0] + hh[1] + hh[2] + hh[3]);
+	hh = (void *) &header;
+	csum += hh[0] + hh[1] + hh[2] + hh[3];
+	csum = -csum;
+	mutex_lock(&control->ras_tbl_mutex);
 	hdr->header = header;
-	ret = __write_table_header(control, buff);
-	mutex_unlock(&control->tbl_mutex);
+	hdr->checksum = csum;
+	res = __write_table_header(control);
+	mutex_unlock(&control->ras_tbl_mutex);
 
-	return ret;
+	return res;
 }
 
+/**
+ * amdgpu_ras_eeprom_reset_table -- Reset the RAS EEPROM table
+ * @control: pointer to control structure
+ *
+ * Reset the contents of the header of the RAS EEPROM table.
+ * Return 0 on success, -errno on error.
+ */
 int amdgpu_ras_eeprom_reset_table(struct amdgpu_ras_eeprom_control *control)
 {
-	unsigned char buff[RAS_TABLE_HEADER_SIZE] = { 0 };
 	struct amdgpu_ras_eeprom_table_header *hdr = &control->tbl_hdr;
-	int ret = 0;
+	u8 csum;
+	int res;
 
-	mutex_lock(&control->tbl_mutex);
+	mutex_lock(&control->ras_tbl_mutex);
 
 	hdr->header = RAS_TABLE_HDR_VAL;
 	hdr->version = RAS_TABLE_VER;
 	hdr->first_rec_offset = RAS_RECORD_START;
 	hdr->tbl_size = RAS_TABLE_HEADER_SIZE;
 
-	__update_tbl_checksum(control, NULL, 0);
-	control->next_addr = RAS_RECORD_START;
-	ret = __write_table_header(control, buff);
+	csum = __calc_hdr_byte_sum(control);
+	csum = -csum;
+	hdr->checksum = csum;
+	res = __write_table_header(control);
 
-	mutex_unlock(&control->tbl_mutex);
+	control->ras_num_recs = 0;
+	control->ras_fri = 0;
 
-	return ret;
+	mutex_unlock(&control->ras_tbl_mutex);
 
+	return res;
 }
 
-int amdgpu_ras_eeprom_init(struct amdgpu_ras_eeprom_control *control,
-			   bool *exceed_err_limit)
-{
-	int ret = 0;
-	struct amdgpu_device *adev = to_amdgpu_device(control);
-	unsigned char buff[RAS_TABLE_HEADER_SIZE] = { 0 };
-	struct amdgpu_ras_eeprom_table_header *hdr = &control->tbl_hdr;
-	struct amdgpu_ras *ras = amdgpu_ras_get_context(adev);
-
-	*exceed_err_limit = false;
-
-	if (!__is_ras_eeprom_supported(adev))
-		return 0;
-
-	/* Verify i2c adapter is initialized */
-	if (!adev->pm.smu_i2c.algo)
-		return -ENOENT;
-
-	if (!__get_eeprom_i2c_addr(adev, control))
-		return -EINVAL;
-
-	mutex_init(&control->tbl_mutex);
-
-	/* Read/Create table header from EEPROM address 0 */
-	ret = amdgpu_eeprom_read(&adev->pm.smu_i2c,
-				 control->i2c_address + RAS_HDR_START,
-				 buff, RAS_TABLE_HEADER_SIZE);
-	if (ret < 1) {
-		DRM_ERROR("Failed to read EEPROM table header, ret:%d", ret);
-		return ret;
-	}
-
-	__decode_table_header_from_buff(hdr, buff);
-
-	if (hdr->header == RAS_TABLE_HDR_VAL) {
-		control->num_recs = (hdr->tbl_size - RAS_TABLE_HEADER_SIZE) /
-				    RAS_TABLE_RECORD_SIZE;
-		control->tbl_byte_sum = __calc_hdr_byte_sum(control);
-		control->next_addr = RAS_RECORD_START;
-
-		DRM_DEBUG_DRIVER("Found existing EEPROM table with %d records",
-				 control->num_recs);
-
-	} else if ((hdr->header == RAS_TABLE_HDR_BAD) &&
-			(amdgpu_bad_page_threshold != 0)) {
-		if (ras->bad_page_cnt_threshold > control->num_recs) {
-			dev_info(adev->dev, "Using one valid bigger bad page "
-				"threshold and correcting eeprom header tag.\n");
-			ret = amdgpu_ras_eeprom_correct_header_tag(control,
-							RAS_TABLE_HDR_VAL);
-		} else {
-			*exceed_err_limit = true;
-			dev_err(adev->dev, "Exceeding the bad_page_threshold parameter, "
-				"disabling the GPU.\n");
-		}
-	} else {
-		DRM_INFO("Creating new EEPROM table");
-
-		ret = amdgpu_ras_eeprom_reset_table(control);
-	}
-
-	return ret > 0 ? 0 : -EIO;
-}
-
-static void __encode_table_record_to_buff(struct amdgpu_ras_eeprom_control *control,
-					  struct eeprom_table_record *record,
-					  unsigned char *buff)
+static void
+__encode_table_record_to_buf(struct amdgpu_ras_eeprom_control *control,
+			     struct eeprom_table_record *record,
+			     unsigned char *buf)
 {
 	__le64 tmp = 0;
 	int i = 0;
 
 	/* Next are all record fields according to EEPROM page spec in LE foramt */
-	buff[i++] = record->err_type;
+	buf[i++] = record->err_type;
 
-	buff[i++] = record->bank;
+	buf[i++] = record->bank;
 
 	tmp = cpu_to_le64(record->ts);
-	memcpy(buff + i, &tmp, 8);
+	memcpy(buf + i, &tmp, 8);
 	i += 8;
 
 	tmp = cpu_to_le64((record->offset & 0xffffffffffff));
-	memcpy(buff + i, &tmp, 6);
+	memcpy(buf + i, &tmp, 6);
 	i += 6;
 
-	buff[i++] = record->mem_channel;
-	buff[i++] = record->mcumc_id;
+	buf[i++] = record->mem_channel;
+	buf[i++] = record->mcumc_id;
 
 	tmp = cpu_to_le64((record->retired_page & 0xffffffffffff));
-	memcpy(buff + i, &tmp, 6);
+	memcpy(buf + i, &tmp, 6);
 }
 
-static void __decode_table_record_from_buff(struct amdgpu_ras_eeprom_control *control,
-					    struct eeprom_table_record *record,
-					    unsigned char *buff)
+static void
+__decode_table_record_from_buf(struct amdgpu_ras_eeprom_control *control,
+			       struct eeprom_table_record *record,
+			       unsigned char *buf)
 {
 	__le64 tmp = 0;
 	int i =  0;
 
 	/* Next are all record fields according to EEPROM page spec in LE foramt */
-	record->err_type = buff[i++];
+	record->err_type = buf[i++];
 
-	record->bank = buff[i++];
+	record->bank = buf[i++];
 
-	memcpy(&tmp, buff + i, 8);
+	memcpy(&tmp, buf + i, 8);
 	record->ts = le64_to_cpu(tmp);
 	i += 8;
 
-	memcpy(&tmp, buff + i, 6);
+	memcpy(&tmp, buf + i, 6);
 	record->offset = (le64_to_cpu(tmp) & 0xffffffffffff);
 	i += 6;
 
-	record->mem_channel = buff[i++];
-	record->mcumc_id = buff[i++];
+	record->mem_channel = buf[i++];
+	record->mcumc_id = buf[i++];
 
-	memcpy(&tmp, buff + i,  6);
+	memcpy(&tmp, buf + i,  6);
 	record->retired_page = (le64_to_cpu(tmp) & 0xffffffffffff);
 }
 
-/*
- * When reaching end of EEPROM memory jump back to 0 record address
- */
-static uint32_t __correct_eeprom_dest_address(uint32_t curr_address)
-{
-	u32 next_address = curr_address + RAS_TABLE_RECORD_SIZE;
-
-	/* When all EEPROM memory used jump back to 0 address */
-	if (next_address >= RAS_TBL_SIZE_BYTES) {
-		DRM_INFO("Reached end of EEPROM memory, wrap around to 0.");
-		return RAS_RECORD_START;
-	}
-
-	return curr_address;
-}
-
 bool amdgpu_ras_eeprom_check_err_threshold(struct amdgpu_device *adev)
 {
 	struct amdgpu_ras *con = amdgpu_ras_get_context(adev);
@@ -427,145 +335,398 @@ bool amdgpu_ras_eeprom_check_err_threshold(struct amdgpu_device *adev)
 
 	if (con->eeprom_control.tbl_hdr.header == RAS_TABLE_HDR_BAD) {
 		dev_warn(adev->dev, "This GPU is in BAD status.");
-		dev_warn(adev->dev, "Please retire it or setting one bigger "
-				"threshold value when reloading driver.\n");
+		dev_warn(adev->dev, "Please retire it or set a larger "
+			 "threshold value when reloading driver.\n");
 		return true;
 	}
 
 	return false;
 }
 
-static int amdgpu_ras_eeprom_xfer(struct amdgpu_ras_eeprom_control *control,
-				  struct eeprom_table_record *records,
-				  const u32 num, bool write)
+/**
+ * __amdgpu_ras_eeprom_write -- write indexed from buffer to EEPROM
+ * @control: pointer to control structure
+ * @buf: pointer to buffer containing data to write
+ * @fri: start writing at this index
+ * @num: number of records to write
+ *
+ * The caller must hold the table mutex in @control.
+ * Return 0 on success, -errno otherwise.
+ */
+static int __amdgpu_ras_eeprom_write(struct amdgpu_ras_eeprom_control *control,
+				     u8 *buf, const u32 fri, const u32 num)
 {
-	int i, ret = 0;
-	unsigned char *buffs, *buff;
-	struct eeprom_table_record *record;
 	struct amdgpu_device *adev = to_amdgpu_device(control);
-	struct amdgpu_ras *ras = amdgpu_ras_get_context(adev);
+	u32 buf_size;
+	int res;
 
-	if (!__is_ras_eeprom_supported(adev))
-		return 0;
+	/* i2c may be unstable in gpu reset */
+	down_read(&adev->reset_sem);
+	buf_size = num * RAS_TABLE_RECORD_SIZE;
+	res = amdgpu_eeprom_write(&adev->pm.smu_i2c,
+				  control->i2c_address +
+				  RAS_INDEX_TO_OFFSET(control, fri),
+				  buf, buf_size);
+	up_read(&adev->reset_sem);
+	if (res < 0) {
+		DRM_ERROR("Writing %d EEPROM table records error:%d",
+			  num, res);
+	} else if (res < buf_size) {
+		/* Short write, return error.
+		 */
+		DRM_ERROR("Wrote %d records out of %d",
+			  res / RAS_TABLE_RECORD_SIZE, num);
+		res = -EIO;
+	} else {
+		res = 0;
+	}
+
+	return res;
+}
+
+static int
+amdgpu_ras_eeprom_append_table(struct amdgpu_ras_eeprom_control *control,
+			       struct eeprom_table_record *record,
+			       const u32 num)
+{
+	u32 a, b, i;
+	u8 *buf, *pp;
+	int res;
 
-	buffs = kcalloc(num, RAS_TABLE_RECORD_SIZE, GFP_KERNEL);
-	if (!buffs)
+	buf = kcalloc(num, RAS_TABLE_RECORD_SIZE, GFP_KERNEL);
+	if (!buf)
 		return -ENOMEM;
 
-	mutex_lock(&control->tbl_mutex);
+	/* Encode all of them in one go.
+	 */
+	pp = buf;
+	for (i = 0; i < num; i++, pp += RAS_TABLE_RECORD_SIZE)
+		__encode_table_record_to_buf(control, &record[i], pp);
+
+	/* a, first record index to write into.
+	 * b, last record index to write into.
+	 * a = first index to read (fri) + number of records in the table,
+	 * b = a + @num - 1.
+	 * Let N = control->ras_max_record_count, then we have,
+	 * case 0: 0 <= a <= b < N,
+	 *   just append @num records starting at a;
+	 * case 1: 0 <= a < N <= b,
+	 *   append (N - a) records starting at a, and
+	 *   append the remainder,  b % N + 1, starting at 0.
+	 * case 2: 0 <= fri < N <= a <= b, then modulo N we get two subcases,
+	 * case 2a: 0 <= a <= b < N
+	 *   append num records starting at a; and fix fri if b overwrote it,
+	 *   and since a <= b, if b overwrote it then a must've also,
+	 *   and if b didn't overwrite it, then a didn't either.
+	 * case 2b: 0 <= b < a < N
+	 *   write num records starting at a, which wraps around 0=N
+	 *   and overwrite fri unconditionally. Now from case 2a,
+	 *   this means that b eclipsed fri to overwrite it and wrap
+	 *   around 0 again, i.e. b = 2N+r pre modulo N, so we unconditionally
+	 *   set fri = b + 1 (mod N).
+	 * Now, since fri is updated in every case, except the trivial case 0,
+	 * the number of records present in the table after writing, is,
+	 * num_recs - 1 = b - fri (mod N), and we take the positive value,
+	 * by adding an arbitrary multiple of N before taking the modulo N
+	 * as shown below.
+	 */
+	a = control->ras_fri + control->ras_num_recs;
+	b = a + num  - 1;
+	if (b < control->ras_max_record_count) {
+		res = __amdgpu_ras_eeprom_write(control, buf, a, num);
+	} else if (a < control->ras_max_record_count) {
+		u32 g0, g1;
+
+		g0 = control->ras_max_record_count - a;
+		g1 = b % control->ras_max_record_count + 1;
+		res = __amdgpu_ras_eeprom_write(control, buf, a, g0);
+		if (res)
+			goto Out;
+		res = __amdgpu_ras_eeprom_write(control,
+						buf + g0 * RAS_TABLE_RECORD_SIZE,
+						0, g1);
+		if (res)
+			goto Out;
+		if (g1 > control->ras_fri)
+			control->ras_fri = g1 % control->ras_max_record_count;
+	} else {
+		a %= control->ras_max_record_count;
+		b %= control->ras_max_record_count;
+
+		if (a <= b) {
+			/* Note that, b - a + 1 = num. */
+			res = __amdgpu_ras_eeprom_write(control, buf, a, num);
+			if (res)
+				goto Out;
+			if (b >= control->ras_fri)
+				control->ras_fri = (b + 1) % control->ras_max_record_count;
+		} else {
+			u32 g0, g1;
+
+			/* b < a, which means, we write from
+			 * a to the end of the table, and from
+			 * the start of the table to b.
+			 */
+			g0 = control->ras_max_record_count - a;
+			g1 = b + 1;
+			res = __amdgpu_ras_eeprom_write(control, buf, a, g0);
+			if (res)
+				goto Out;
+			res = __amdgpu_ras_eeprom_write(control,
+							buf + g0 * RAS_TABLE_RECORD_SIZE,
+							0, g1);
+			if (res)
+				goto Out;
+			control->ras_fri = g1 % control->ras_max_record_count;
+		}
+	}
+	control->ras_num_recs = 1 + (control->ras_max_record_count + b
+				     - control->ras_fri)
+		% control->ras_max_record_count;
+Out:
+	kfree(buf);
+	return res;
+}
+
+static int
+amdgpu_ras_eeprom_update_header(struct amdgpu_ras_eeprom_control *control)
+{
+	struct amdgpu_device *adev = to_amdgpu_device(control);
+	struct amdgpu_ras *ras = amdgpu_ras_get_context(adev);
+	u8 *buf, *pp, csum;
+	u32 buf_size;
+	int res;
 
-	/*
-	 * If saved bad pages number exceeds the bad page threshold for
-	 * the whole VRAM, update table header to mark the BAD GPU tag
-	 * and schedule one ras recovery after eeprom write is done,
-	 * this can avoid the missing for latest records.
-	 *
-	 * This new header will be picked up and checked in the bootup
-	 * by ras recovery, which may break bootup process to notify
-	 * user this GPU is in bad state and to retire such GPU for
-	 * further check.
+	/* Mark the header bad if the record count reaches the threshold.
 	 */
-	if (write && (amdgpu_bad_page_threshold != 0) &&
-		((control->num_recs + num) >= ras->bad_page_cnt_threshold)) {
+	if (amdgpu_bad_page_threshold != 0 &&
+	    control->ras_num_recs >= ras->bad_page_cnt_threshold) {
 		dev_warn(adev->dev,
-			"Saved bad pages(%d) reaches threshold value(%d).\n",
-			control->num_recs + num, ras->bad_page_cnt_threshold);
+			"Saved bad pages %d reaches threshold value %d\n",
+			control->ras_num_recs, ras->bad_page_cnt_threshold);
 		control->tbl_hdr.header = RAS_TABLE_HDR_BAD;
 	}
 
-	/* In case of overflow just start from beginning to not lose newest records */
-	if (write &&
-	    (control->next_addr +
-	     RAS_TABLE_RECORD_SIZE * num >= RAS_TBL_SIZE_BYTES))
-		control->next_addr = RAS_RECORD_START;
+	control->tbl_hdr.version = RAS_TABLE_VER;
+	control->tbl_hdr.first_rec_offset = RAS_INDEX_TO_OFFSET(control, control->ras_fri);
+	control->tbl_hdr.tbl_size = RAS_TABLE_HEADER_SIZE + control->ras_num_recs * RAS_TABLE_RECORD_SIZE;
+	control->tbl_hdr.checksum = 0;
+
+	buf_size = control->ras_num_recs * RAS_TABLE_RECORD_SIZE;
+	buf = kcalloc(control->ras_num_recs, RAS_TABLE_RECORD_SIZE, GFP_KERNEL);
+	if (!buf) {
+		DRM_ERROR("allocating memory for table of size %d bytes failed\n",
+			  control->tbl_hdr.tbl_size);
+		res = -ENOMEM;
+		goto Out;
+	}
 
-	/*
-	 * TODO Currently makes EEPROM writes for each record, this creates
-	 * internal fragmentation. Optimized the code to do full page write of
-	 * 256b
+	down_read(&adev->reset_sem);
+	res = amdgpu_eeprom_read(&adev->pm.smu_i2c,
+				 control->i2c_address +
+				 control->ras_record_offset,
+				 buf, buf_size);
+	up_read(&adev->reset_sem);
+	if (res < 0) {
+		DRM_ERROR("EEPROM failed reading records:%d\n",
+			  res);
+		goto Out;
+	} else if (res < buf_size) {
+		DRM_ERROR("EEPROM read %d out of %d bytes\n",
+			  res, buf_size);
+		res = -EIO;
+		goto Out;
+	}
+
+	/* Recalc the checksum.
 	 */
-	for (i = 0; i < num; i++) {
-		buff = &buffs[i * RAS_TABLE_RECORD_SIZE];
-		record = &records[i];
+	csum = 0;
+	for (pp = buf; pp < buf + buf_size; pp++)
+		csum += *pp;
+
+	csum += __calc_hdr_byte_sum(control);
+	/* avoid sign extension when assigning to "checksum" */
+	csum = -csum;
+	control->tbl_hdr.checksum = csum;
+	res = __write_table_header(control);
+Out:
+	kfree(buf);
+	return res;
+}
 
-		control->next_addr = __correct_eeprom_dest_address(control->next_addr);
+/**
+ * amdgpu_ras_eeprom_append -- append records to the EEPROM RAS table
+ * @control: pointer to control structure
+ * @record: array of records to append
+ * @num: number of records in @record array
+ *
+ * Append @num records to the table, calculate the checksum and write
+ * the table back to EEPROM. The number of records that can be
+ * appended in one call is between 1 and control->ras_max_record_count,
+ * regardless of how many records are already stored in the table.
+ *
+ * Return 0 on success or if EEPROM is not supported, -errno on error.
+ */
+int amdgpu_ras_eeprom_append(struct amdgpu_ras_eeprom_control *control,
+			     struct eeprom_table_record *record,
+			     const u32 num)
+{
+	struct amdgpu_device *adev = to_amdgpu_device(control);
+	int res;
 
-		/* EEPROM table content is stored in LE format */
-		if (write)
-			__encode_table_record_to_buff(control, record, buff);
+	if (!__is_ras_eeprom_supported(adev))
+		return 0;
 
-		/* i2c may be unstable in gpu reset */
-		down_read(&adev->reset_sem);
-		ret = amdgpu_eeprom_xfer(&adev->pm.smu_i2c,
-					 control->i2c_address + control->next_addr,
-					 buff, RAS_TABLE_RECORD_SIZE, !write);
-		up_read(&adev->reset_sem);
+	if (num == 0) {
+		DRM_ERROR("will not append 0 records\n");
+		return -EINVAL;
+	} else if (num > control->ras_max_record_count) {
+		DRM_ERROR("cannot append %d records, more than the table size %d\n",
+			  num, control->ras_max_record_count);
+		return -EINVAL;
+	}
 
-		if (ret < 1) {
-			DRM_ERROR("Failed to process EEPROM table records, ret:%d", ret);
+	mutex_lock(&control->ras_tbl_mutex);
 
-			/* TODO Restore prev next EEPROM address ? */
-			goto free_buff;
-		}
-		/*
-		 * The destination EEPROM address might need to be corrected to account
-		 * for page or entire memory wrapping
-		 */
-		control->next_addr += RAS_TABLE_RECORD_SIZE;
-	}
+	res = amdgpu_ras_eeprom_append_table(control, record, num);
+	if (!res)
+		res = amdgpu_ras_eeprom_update_header(control);
 
-	if (!write) {
-		for (i = 0; i < num; i++) {
-			buff = &buffs[i * RAS_TABLE_RECORD_SIZE];
-			record = &records[i];
+	mutex_unlock(&control->ras_tbl_mutex);
+	return res;
+}
 
-			__decode_table_record_from_buff(control, record, buff);
-		}
-	}
+/**
+ * __amdgpu_ras_eeprom_read -- read indexed from EEPROM into buffer
+ * @control: pointer to control structure
+ * @buf: pointer to buffer to read into
+ * @fri: first record index, start reading at this index, absolute index
+ * @num: number of records to read
+ *
+ * The caller must hold the table mutex in @control.
+ * Return 0 on success, -errno otherwise.
+ */
+static int __amdgpu_ras_eeprom_read(struct amdgpu_ras_eeprom_control *control,
+				    u8 *buf, const u32 fri, const u32 num)
+{
+	struct amdgpu_device *adev = to_amdgpu_device(control);
+	u32 buf_size;
+	int res;
 
-	if (write) {
-		/*
-		 * Update table header with size and CRC and account for table
-		 * wrap around where the assumption is that we treat it as empty
-		 * table
-		 *
-		 * TODO - Check the assumption is correct
+	/* i2c may be unstable in gpu reset */
+	down_read(&adev->reset_sem);
+	buf_size = num * RAS_TABLE_RECORD_SIZE;
+	res = amdgpu_eeprom_read(&adev->pm.smu_i2c,
+				 control->i2c_address +
+				 RAS_INDEX_TO_OFFSET(control, fri),
+				 buf, buf_size);
+	up_read(&adev->reset_sem);
+	if (res < 0) {
+		DRM_ERROR("Reading %d EEPROM table records error:%d",
+			  num, res);
+	} else if (res < buf_size) {
+		/* Short read, return error.
 		 */
-		control->num_recs += num;
-		control->num_recs %= RAS_MAX_RECORD_COUNT;
-		control->tbl_hdr.tbl_size += RAS_TABLE_RECORD_SIZE * num;
-		if (control->tbl_hdr.tbl_size > RAS_TBL_SIZE_BYTES)
-			control->tbl_hdr.tbl_size = RAS_TABLE_HEADER_SIZE +
-			control->num_recs * RAS_TABLE_RECORD_SIZE;
-
-		__update_tbl_checksum(control, records, num);
-		__write_table_header(control, buffs);
-	} else if (!__verify_tbl_checksum(control, records, num)) {
-		DRM_WARN("EEPROM Table checksum mismatch!");
-		/* TODO Uncomment when EEPROM read/write is relliable */
-		/* ret = -EIO; */
+		DRM_ERROR("Read %d records out of %d",
+			  res / RAS_TABLE_RECORD_SIZE, num);
+		res = -EIO;
+	} else {
+		res = 0;
 	}
 
-free_buff:
-	kfree(buffs);
-
-	mutex_unlock(&control->tbl_mutex);
-
-	return ret == num ? 0 : -EIO;
+	return res;
 }
 
+/**
+ * amdgpu_ras_eeprom_read -- read EEPROM
+ * @control: pointer to control structure
+ * @record: array of records to read into
+ * @num: number of records in @record
+ *
+ * Reads num records from the RAS table in EEPROM and
+ * writes the data into @record array.
+ *
+ * Returns 0 on success, -errno on error.
+ */
 int amdgpu_ras_eeprom_read(struct amdgpu_ras_eeprom_control *control,
-			   struct eeprom_table_record *records,
+			   struct eeprom_table_record *record,
 			   const u32 num)
 {
-	return amdgpu_ras_eeprom_xfer(control, records, num, false);
-}
+	struct amdgpu_device *adev = to_amdgpu_device(control);
+	int i, res;
+	u8 *buf, *pp;
+	u32 g0, g1;
 
-int amdgpu_ras_eeprom_write(struct amdgpu_ras_eeprom_control *control,
-			    struct eeprom_table_record *records,
-			    const u32 num)
-{
-	return amdgpu_ras_eeprom_xfer(control, records, num, true);
+	if (!__is_ras_eeprom_supported(adev))
+		return 0;
+
+	if (num == 0) {
+		DRM_ERROR("will not read 0 records\n");
+		return -EINVAL;
+	} else if (num > control->ras_num_recs) {
+		DRM_ERROR("too many records to read:%d available:%d\n",
+			  num, control->ras_num_recs);
+		return -EINVAL;
+	}
+
+	buf = kcalloc(num, RAS_TABLE_RECORD_SIZE, GFP_KERNEL);
+	if (!buf)
+		return -ENOMEM;
+
+	/* Determine how many records to read, from the first record
+	 * index, fri, to the end of the table, and from the beginning
+	 * of the table, such that the total number of records is
+	 * @num, and we handle wrap around when fri > 0 and
+	 * fri + num > RAS_MAX_RECORD_COUNT.
+	 *
+	 * First we compute the index of the last element
+	 * which would be fetched from each region,
+	 * g0 is in [fri, fri + num - 1], and
+	 * g1 is in [0, RAS_MAX_RECORD_COUNT - 1].
+	 * Then, if g0 < RAS_MAX_RECORD_COUNT, the index of
+	 * the last element to fetch, we set g0 to _the number_
+	 * of elements to fetch, @num, since we know that the last
+	 * index to be fetched does not exceed the table.
+	 *
+	 * If, however, g0 >= RAS_MAX_RECORD_COUNT, then
+	 * we set g0 to the number of elements to read
+	 * until the end of the table, and g1 to the number of
+	 * elements to read from the beginning of the table.
+	 */
+	g0 = control->ras_fri + num - 1;
+	g1 = g0 % control->ras_max_record_count;
+	if (g0 < control->ras_max_record_count) {
+		g0 = num;
+		g1 = 0;
+	} else {
+		g0 = control->ras_max_record_count - control->ras_fri;
+		g1 += 1;
+	}
+
+	mutex_lock(&control->ras_tbl_mutex);
+	res = __amdgpu_ras_eeprom_read(control, buf, control->ras_fri, g0);
+	if (res)
+		goto Out;
+	if (g1) {
+		res = __amdgpu_ras_eeprom_read(control,
+					       buf + g0 * RAS_TABLE_RECORD_SIZE,
+					       0, g1);
+		if (res)
+			goto Out;
+	}
+
+	res = 0;
+
+	/* Read up everything? Then transform.
+	 */
+	pp = buf;
+	for (i = 0; i < num; i++, pp += RAS_TABLE_RECORD_SIZE)
+		__decode_table_record_from_buf(control, &record[i], pp);
+Out:
+	kfree(buf);
+	mutex_unlock(&control->ras_tbl_mutex);
+
+	return res;
 }
 
 inline uint32_t amdgpu_ras_eeprom_max_record_count(void)
@@ -573,35 +734,131 @@ inline uint32_t amdgpu_ras_eeprom_max_record_count(void)
 	return RAS_MAX_RECORD_COUNT;
 }
 
-/* Used for testing if bugs encountered */
-#if 0
-void amdgpu_ras_eeprom_test(struct amdgpu_ras_eeprom_control *control)
+/**
+ * __verify_ras_table_checksum -- verify the RAS EEPROM table checksum
+ * @control: pointer to control structure
+ *
+ * Check the checksum of the RAS table stored in EEPROM.
+ *
+ * Return 0 if the checksum is correct,
+ * positive if it is not correct, and
+ * -errno on I/O error.
+ */
+static int __verify_ras_table_checksum(struct amdgpu_ras_eeprom_control *control)
 {
-	int i;
-	struct eeprom_table_record *recs = kcalloc(1, sizeof(*recs), GFP_KERNEL);
+	struct amdgpu_device *adev = to_amdgpu_device(control);
+	int res;
+	u8  csum, *buf, *pp;
+	u32 buf_size;
+
+	buf_size = RAS_TABLE_HEADER_SIZE +
+		control->ras_num_recs * RAS_TABLE_RECORD_SIZE;
+	buf = kzalloc(buf_size, GFP_KERNEL);
+	if (!buf) {
+		DRM_ERROR("Out of memory checking RAS table checksum.\n");
+		return -ENOMEM;
+	}
 
-	if (!recs)
-		return;
+	res = amdgpu_eeprom_read(&adev->pm.smu_i2c,
+				 control->i2c_address +
+				 control->ras_header_offset,
+				 buf, buf_size);
+	if (res < buf_size) {
+		DRM_ERROR("Partial read for checksum, res:%d\n", res);
+		/* On partial reads, return -EIO.
+		 */
+		if (res >= 0)
+			res = -EIO;
+		goto Out;
+	}
+
+	csum = 0;
+	for (pp = buf; pp < buf + buf_size; pp++)
+		csum += *pp;
+Out:
+	kfree(buf);
+	return res < 0 ? res : csum;
+}
 
-	for (i = 0; i < 1 ; i++) {
-		recs[i].address = 0xdeadbeef;
-		recs[i].retired_page = i;
+int amdgpu_ras_eeprom_init(struct amdgpu_ras_eeprom_control *control,
+			   bool *exceed_err_limit)
+{
+	struct amdgpu_device *adev = to_amdgpu_device(control);
+	unsigned char buf[RAS_TABLE_HEADER_SIZE] = { 0 };
+	struct amdgpu_ras_eeprom_table_header *hdr = &control->tbl_hdr;
+	struct amdgpu_ras *ras = amdgpu_ras_get_context(adev);
+	int res;
+
+	*exceed_err_limit = false;
+
+	if (!__is_ras_eeprom_supported(adev))
+		return 0;
+
+	/* Verify i2c adapter is initialized */
+	if (!adev->pm.smu_i2c.algo)
+		return -ENOENT;
+
+	if (!__get_eeprom_i2c_addr(adev, control))
+		return -EINVAL;
+
+	control->ras_header_offset = RAS_HDR_START;
+	control->ras_record_offset = RAS_RECORD_START;
+	control->ras_max_record_count  = RAS_MAX_RECORD_COUNT;
+	mutex_init(&control->ras_tbl_mutex);
+
+	/* Read the table header from EEPROM address */
+	res = amdgpu_eeprom_read(&adev->pm.smu_i2c,
+				 control->i2c_address + control->ras_header_offset,
+				 buf, RAS_TABLE_HEADER_SIZE);
+	if (res < RAS_TABLE_HEADER_SIZE) {
+		DRM_ERROR("Failed to read EEPROM table header, res:%d", res);
+		return res >= 0 ? -EIO : res;
 	}
 
-	if (!amdgpu_ras_eeprom_write(control, recs, 1)) {
+	__decode_table_header_from_buf(hdr, buf);
 
-		memset(recs, 0, sizeof(*recs) * 1);
+	control->ras_num_recs = RAS_NUM_RECS(hdr);
+	control->ras_fri = RAS_OFFSET_TO_INDEX(control, hdr->first_rec_offset);
 
-		control->next_addr = RAS_RECORD_START;
+	if (hdr->header == RAS_TABLE_HDR_VAL) {
+		DRM_DEBUG_DRIVER("Found existing EEPROM table with %d records",
+				 control->ras_num_recs);
+		res = __verify_ras_table_checksum(control);
+		if (res)
+			DRM_ERROR("RAS table incorrect checksum or error:%d\n",
+				  res);
+	} else if (hdr->header == RAS_TABLE_HDR_BAD &&
+		   amdgpu_bad_page_threshold != 0) {
+		res = __verify_ras_table_checksum(control);
+		if (res)
+			DRM_ERROR("RAS Table incorrect checksum or error:%d\n",
+				  res);
+		if (ras->bad_page_cnt_threshold > control->ras_num_recs) {
+			/* This means that the threshold was increased since
+			 * the last time the system was booted, and now,
+			 * ras->bad_page_cnt_threshold - control->ras_num_recs > 0,
+			 * so that at least one more record can be saved,
+			 * before the page count threshold is reached.
+			 */
+			dev_info(adev->dev,
+				 "records:%d threshold:%d, resetting "
+				 "RAS table header signature",
+				 control->ras_num_recs,
+				 ras->bad_page_cnt_threshold);
+			res = amdgpu_ras_eeprom_correct_header_tag(control,
+								   RAS_TABLE_HDR_VAL);
+		} else {
+			*exceed_err_limit = true;
+			dev_err(adev->dev,
+				"RAS records:%d exceed threshold:%d, "
+				"maybe retire this GPU?",
+				control->ras_num_recs, ras->bad_page_cnt_threshold);
+		}
+	} else {
+		DRM_INFO("Creating a new EEPROM table");
 
-		if (!amdgpu_ras_eeprom_read(control, recs)) {
-			for (i = 0; i < 1; i++)
-				DRM_INFO("rec.address :0x%llx, rec.retired_page :%llu",
-					 recs[i].address, recs[i].retired_page);
-		} else
-			DRM_ERROR("Failed in reading from table");
+		res = amdgpu_ras_eeprom_reset_table(control);
+	}
 
-	} else
-		DRM_ERROR("Failed in writing to table");
+	return res < 0 ? res : 0;
 }
-#endif
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h
index 504729b8053759..edb0195ea2eb8c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h
@@ -28,7 +28,7 @@
 
 struct amdgpu_device;
 
-enum amdgpu_ras_eeprom_err_type{
+enum amdgpu_ras_eeprom_err_type {
 	AMDGPU_RAS_EEPROM_ERR_PLACE_HOLDER,
 	AMDGPU_RAS_EEPROM_ERR_RECOVERABLE,
 	AMDGPU_RAS_EEPROM_ERR_NON_RECOVERABLE
@@ -40,15 +40,45 @@ struct amdgpu_ras_eeprom_table_header {
 	uint32_t first_rec_offset;
 	uint32_t tbl_size;
 	uint32_t checksum;
-}__attribute__((__packed__));
+} __packed;
 
 struct amdgpu_ras_eeprom_control {
 	struct amdgpu_ras_eeprom_table_header tbl_hdr;
-	u32 i2c_address; /* Base I2C 19-bit memory address */
-	uint32_t next_addr;
-	unsigned int num_recs;
-	struct mutex tbl_mutex;
-	u8 tbl_byte_sum;
+
+	/* Base I2C EEPROM 19-bit memory address,
+	 * where the table is located. For more information,
+	 * see top of amdgpu_eeprom.c.
+	 */
+	u32 i2c_address;
+
+	/* The byte offset off of @i2c_address
+	 * where the table header is found,
+	 * and where the records start--always
+	 * right after the header.
+	 */
+	u32 ras_header_offset;
+	u32 ras_record_offset;
+
+	/* Number of records in the table.
+	 */
+	u32 ras_num_recs;
+
+	/* First record index to read, 0-based.
+	 * Range is [0, num_recs-1]. This is
+	 * an absolute index, starting right after
+	 * the table header.
+	 */
+	u32 ras_fri;
+
+	/* Maximum possible number of records
+	 * we could store, i.e. the maximum capacity
+	 * of the table.
+	 */
+	u32 ras_max_record_count;
+
+	/* Protect table access via this mutex.
+	 */
+	struct mutex ras_tbl_mutex;
 };
 
 /*
@@ -77,7 +107,8 @@ struct eeprom_table_record {
 }__attribute__((__packed__));
 
 int amdgpu_ras_eeprom_init(struct amdgpu_ras_eeprom_control *control,
-			bool *exceed_err_limit);
+			   bool *exceed_err_limit);
+
 int amdgpu_ras_eeprom_reset_table(struct amdgpu_ras_eeprom_control *control);
 
 bool amdgpu_ras_eeprom_check_err_threshold(struct amdgpu_device *adev);
@@ -85,11 +116,9 @@ bool amdgpu_ras_eeprom_check_err_threshold(struct amdgpu_device *adev);
 int amdgpu_ras_eeprom_read(struct amdgpu_ras_eeprom_control *control,
 			   struct eeprom_table_record *records, const u32 num);
 
-int amdgpu_ras_eeprom_write(struct amdgpu_ras_eeprom_control *control,
-			    struct eeprom_table_record *records, const u32 num);
+int amdgpu_ras_eeprom_append(struct amdgpu_ras_eeprom_control *control,
+			     struct eeprom_table_record *records, const u32 num);
 
 inline uint32_t amdgpu_ras_eeprom_max_record_count(void);
 
-void amdgpu_ras_eeprom_test(struct amdgpu_ras_eeprom_control *control);
-
 #endif // _AMDGPU_RAS_EEPROM_H
-- 
2.32.0
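
The checksum verification in the patch above sums every byte of the
table and treats a non-zero result as an error. Below is a minimal
userspace sketch of that idea, assuming the sum is kept in an 8-bit
accumulator and that a valid table sums to zero. The names
table_byte_sum() and checksum_fixup() are hypothetical and are not
part of the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sum all bytes of the table modulo 256 (uint8_t arithmetic wraps). */
static uint8_t table_byte_sum(const uint8_t *buf, size_t len)
{
	uint8_t csum = 0;
	size_t i;

	assert(buf != NULL || len == 0);
	for (i = 0; i < len; i++)
		csum = (uint8_t)(csum + buf[i]);
	return csum;
}

/* Hypothetical helper: the byte to append so the total sum becomes 0. */
static uint8_t checksum_fixup(const uint8_t *buf, size_t len)
{
	return (uint8_t)(0u - table_byte_sum(buf, len));
}
```

Under these assumptions, a writer stores checksum_fixup() of the
table contents, and a reader accepts the table only when
table_byte_sum() over everything, checksum included, is 0.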

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH 38/40] drm/amdgpu: RAS EEPROM table is now in debugfs
  2021-06-08 21:39 [PATCH 00/40] I2C fixes Luben Tuikov
                   ` (36 preceding siblings ...)
  2021-06-08 21:39 ` [PATCH 37/40] drm/amdgpu: Optimizations to EEPROM RAS table I/O Luben Tuikov
@ 2021-06-08 21:39 ` Luben Tuikov
  2021-06-11 17:16   ` Alex Deucher
  2021-06-08 21:39 ` [PATCH 39/40] drm/amdgpu: Fix koops when accessing RAS EEPROM Luben Tuikov
  2021-06-08 21:39 ` [PATCH 40/40] drm/amdgpu: Use a single loop Luben Tuikov
  39 siblings, 1 reply; 74+ messages in thread
From: Luben Tuikov @ 2021-06-08 21:39 UTC (permalink / raw)
  To: amd-gfx
  Cc: Andrey Grodzovsky, Xinhui Pan, Luben Tuikov, Alexander Deucher,
	John Clements, Hawking Zhang

Add "ras_eeprom_size" file in debugfs, which
reports the maximum size allocated to the RAS
table in EEPROM, as the number of bytes and the
number of records it could store. For instance,

$cat /sys/kernel/debug/dri/0/ras/ras_eeprom_size
262144 bytes or 10921 records
$_
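
The two numbers in that output are related by simple arithmetic. The
sketch below reproduces it, assuming a 20-byte table header and
24-byte records; both sizes are inferred from the table dump in this
message (first record at 0x14, second at 0x2C) and are not quoted
from the driver source. The name max_record_count() is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define TBL_SIZE_BYTES   262144u  /* value printed by ras_eeprom_size */
#define TBL_HEADER_SIZE  20u      /* assumed: first record sits at 0x14 */
#define TBL_RECORD_SIZE  24u      /* assumed: 0x2C - 0x14 */

/* Whole records that fit after the header; any remainder is unused. */
static uint32_t max_record_count(uint32_t tbl_bytes)
{
	assert(tbl_bytes >= TBL_HEADER_SIZE);
	return (tbl_bytes - TBL_HEADER_SIZE) / TBL_RECORD_SIZE;
}
```

With these assumed sizes, (262144 - 20) / 24 rounds down to 10921,
which matches the record count printed above.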

Add "ras_eeprom_table" file in debugfs, which
dumps the RAS table stored in EEPROM, in a formatted
way. For instance,

$cat ras_eeprom_table
 Signature    Version  FirstOffs       Size   Checksum
0x414D4452 0x00010000 0x00000014 0x000000EC 0x000000DA
Index  Offset ErrType Bank/CU          TimeStamp      Offs/Addr MemChl MCUMCID    RetiredPage
    0 0x00014      ue    0x00 0x00000000607608DC 0x000000000000   0x00    0x00 0x000000000000
    1 0x0002C      ue    0x00 0x00000000607608DC 0x000000001000   0x00    0x00 0x000000000001
    2 0x00044      ue    0x00 0x00000000607608DC 0x000000002000   0x00    0x00 0x000000000002
    3 0x0005C      ue    0x00 0x00000000607608DC 0x000000003000   0x00    0x00 0x000000000003
    4 0x00074      ue    0x00 0x00000000607608DC 0x000000004000   0x00    0x00 0x000000000004
    5 0x0008C      ue    0x00 0x00000000607608DC 0x000000005000   0x00    0x00 0x000000000005
    6 0x000A4      ue    0x00 0x00000000607608DC 0x000000006000   0x00    0x00 0x000000000006
    7 0x000BC      ue    0x00 0x00000000607608DC 0x000000007000   0x00    0x00 0x000000000007
    8 0x000D4      ue    0x00 0x00000000607608DD 0x000000008000   0x00    0x00 0x000000000008
$_
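
The Index and Offset columns can be reproduced by hand. The sketch
below assumes a 20-byte header and 24-byte records, as inferred from
the dump above, and mirrors the wrap-around mapping that the
RAS_RI_TO_AI() macro in this patch performs. The function names are
hypothetical, and the offset formula assumes the table header sits
at offset 0.

```c
#include <assert.h>
#include <stdint.h>

#define HDR_SIZE 20u  /* assumed; FirstOffs is 0x14 in the dump */
#define REC_SIZE 24u  /* assumed; 0x2C - 0x14 */

/* Number of records implied by the header's Size field. */
static uint32_t num_recs(uint32_t tbl_size)
{
	assert(tbl_size >= HDR_SIZE);
	return (tbl_size - HDR_SIZE) / REC_SIZE;
}

/* Relative index (0 = first record to read) to absolute slot,
 * wrapping around at the table capacity.
 */
static uint32_t ri_to_ai(uint32_t ri, uint32_t fri, uint32_t max_count)
{
	assert(max_count > 0);
	return (ri + fri) % max_count;
}

/* Byte offset of an absolute slot, as shown in the Offset column. */
static uint32_t ai_to_offset(uint32_t ai)
{
	return HDR_SIZE + ai * REC_SIZE;
}
```

For the dump above, Size 0xEC gives (236 - 20) / 24 = 9 records, and
record 8 lands at 20 + 8 * 24 = 0xD4. If the first record index were
10920 in a 10921-slot table, relative index 3 would wrap to absolute
slot 2.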

Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Cc: John Clements <john.clements@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Cc: Xinhui Pan <xinhui.pan@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c       |  12 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h       |   1 +
 .../gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c    | 241 +++++++++++++++++-
 .../gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h    |  10 +-
 4 files changed, 252 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
index 1424f2cc2076c1..d791a360a92366 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
@@ -404,9 +404,9 @@ static ssize_t amdgpu_ras_debugfs_ctrl_write(struct file *f,
 		/* umc ce/ue error injection for a bad page is not allowed */
 		if ((data.head.block == AMDGPU_RAS_BLOCK__UMC) &&
 		    amdgpu_ras_check_bad_page(adev, data.inject.address)) {
-			dev_warn(adev->dev, "RAS WARN: 0x%llx has been marked "
-					"as bad before error injection!\n",
-					data.inject.address);
+			dev_warn(adev->dev, "RAS WARN: inject: 0x%llx has "
+				 "already been marked as bad!\n",
+				 data.inject.address);
 			break;
 		}
 
@@ -1301,6 +1301,12 @@ static struct dentry *amdgpu_ras_debugfs_create_ctrl_node(struct amdgpu_device *
 			   &con->bad_page_cnt_threshold);
 	debugfs_create_x32("ras_hw_enabled", 0444, dir, &adev->ras_hw_enabled);
 	debugfs_create_x32("ras_enabled", 0444, dir, &adev->ras_enabled);
+	debugfs_create_file("ras_eeprom_size", S_IRUGO, dir, adev,
+			    &amdgpu_ras_debugfs_eeprom_size_ops);
+	con->de_ras_eeprom_table = debugfs_create_file("ras_eeprom_table",
+						       S_IRUGO, dir, adev,
+						       &amdgpu_ras_debugfs_eeprom_table_ops);
+	amdgpu_ras_debugfs_set_ret_size(&con->eeprom_control);
 
 	/*
 	 * After one uncorrectable error happens, usually GPU recovery will
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h
index 256cea5d34f2b6..283afd791db107 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h
@@ -318,6 +318,7 @@ struct amdgpu_ras {
 	/* sysfs */
 	struct device_attribute features_attr;
 	struct bin_attribute badpages_attr;
+	struct dentry *de_ras_eeprom_table;
 	/* block array */
 	struct ras_manager *objs;
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
index dc4a845a32404c..677e379f5fb5e9 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
@@ -27,6 +27,8 @@
 #include <linux/bits.h>
 #include "atom.h"
 #include "amdgpu_eeprom.h"
+#include <linux/debugfs.h>
+#include <linux/uaccess.h>
 
 #define EEPROM_I2C_MADDR_VEGA20         0x0
 #define EEPROM_I2C_MADDR_ARCTURUS       0x40000
@@ -70,6 +72,13 @@
 #define RAS_OFFSET_TO_INDEX(_C, _O) (((_O) - \
 				      (_C)->ras_record_offset) / RAS_TABLE_RECORD_SIZE)
 
+/* Given a 0-based relative record index, 0, 1, 2, ..., etc., off
+ * of "fri", return the absolute record index off of the end of
+ * the table header.
+ */
+#define RAS_RI_TO_AI(_C, _I) (((_I) + (_C)->ras_fri) % \
+			      (_C)->ras_max_record_count)
+
 #define RAS_NUM_RECS(_tbl_hdr)  (((_tbl_hdr)->tbl_size - \
 				  RAS_TABLE_HEADER_SIZE) / RAS_TABLE_RECORD_SIZE)
 
@@ -77,13 +86,10 @@
 
 static bool __is_ras_eeprom_supported(struct amdgpu_device *adev)
 {
-	if ((adev->asic_type == CHIP_VEGA20) ||
-	    (adev->asic_type == CHIP_ARCTURUS) ||
-	    (adev->asic_type == CHIP_SIENNA_CICHLID) ||
-	    (adev->asic_type == CHIP_ALDEBARAN))
-		return true;
-
-	return false;
+	return  adev->asic_type == CHIP_VEGA20 ||
+		adev->asic_type == CHIP_ARCTURUS ||
+		adev->asic_type == CHIP_SIENNA_CICHLID ||
+		adev->asic_type == CHIP_ALDEBARAN;
 }
 
 static bool __get_eeprom_i2c_addr_arct(struct amdgpu_device *adev,
@@ -258,6 +264,8 @@ int amdgpu_ras_eeprom_reset_table(struct amdgpu_ras_eeprom_control *control)
 	control->ras_num_recs = 0;
 	control->ras_fri = 0;
 
+	amdgpu_ras_debugfs_set_ret_size(control);
+
 	mutex_unlock(&control->ras_tbl_mutex);
 
 	return res;
@@ -591,6 +599,8 @@ int amdgpu_ras_eeprom_append(struct amdgpu_ras_eeprom_control *control,
 	res = amdgpu_ras_eeprom_append_table(control, record, num);
 	if (!res)
 		res = amdgpu_ras_eeprom_update_header(control);
+	if (!res)
+		amdgpu_ras_debugfs_set_ret_size(control);
 
 	mutex_unlock(&control->ras_tbl_mutex);
 	return res;
@@ -734,6 +744,223 @@ inline uint32_t amdgpu_ras_eeprom_max_record_count(void)
 	return RAS_MAX_RECORD_COUNT;
 }
 
+static ssize_t
+amdgpu_ras_debugfs_eeprom_size_read(struct file *f, char __user *buf,
+				    size_t size, loff_t *pos)
+{
+	struct amdgpu_device *adev = (struct amdgpu_device *)file_inode(f)->i_private;
+	struct amdgpu_ras *ras = amdgpu_ras_get_context(adev);
+	struct amdgpu_ras_eeprom_control *control = ras ? &ras->eeprom_control : NULL;
+	u8 data[50];
+	int res;
+
+	if (!size)
+		return size;
+
+	if (!ras || !control) {
+		res = snprintf(data, sizeof(data), "Not supported\n");
+	} else {
+		res = snprintf(data, sizeof(data), "%d bytes or %d records\n",
+			       RAS_TBL_SIZE_BYTES, control->ras_max_record_count);
+	}
+
+	if (*pos >= res)
+		return 0;
+
+	res -= *pos;
+	res = min_t(size_t, res, size);
+
+	if (copy_to_user(buf, &data[*pos], res))
+		return -EINVAL;
+
+	*pos += res;
+
+	return res;
+}
+
+const struct file_operations amdgpu_ras_debugfs_eeprom_size_ops = {
+	.owner = THIS_MODULE,
+	.read = amdgpu_ras_debugfs_eeprom_size_read,
+	.write = NULL,
+	.llseek = default_llseek,
+};
+
+static const char *tbl_hdr_str = " Signature    Version  FirstOffs       Size   Checksum\n";
+static const char *tbl_hdr_fmt = "0x%08X 0x%08X 0x%08X 0x%08X 0x%08X\n";
+#define tbl_hdr_fmt_size (5 * (2+8) + 4 + 1)
+static const char *rec_hdr_str = "Index  Offset ErrType Bank/CU          TimeStamp      Offs/Addr MemChl MCUMCID    RetiredPage\n";
+static const char *rec_hdr_fmt = "%5d 0x%05X %7s    0x%02X 0x%016llX 0x%012llX   0x%02X    0x%02X 0x%012llX\n";
+#define rec_hdr_fmt_size (5 + 1 + 7 + 1 + 7 + 1 + 7 + 1 + 18 + 1 + 14 + 1 + 6 + 1 + 7 + 1 + 14 + 1)
+
+static const char *record_err_type_str[AMDGPU_RAS_EEPROM_ERR_COUNT] = {
+	"ignore",
+	"re",
+	"ue",
+};
+
+static loff_t amdgpu_ras_debugfs_table_size(struct amdgpu_ras_eeprom_control *control)
+{
+	return strlen(tbl_hdr_str) + tbl_hdr_fmt_size +
+		strlen(rec_hdr_str) + rec_hdr_fmt_size * control->ras_num_recs;
+}
+
+void amdgpu_ras_debugfs_set_ret_size(struct amdgpu_ras_eeprom_control *control)
+{
+	struct amdgpu_ras *ras = container_of(control, struct amdgpu_ras,
+					      eeprom_control);
+	struct dentry *de = ras->de_ras_eeprom_table;
+
+	if (de)
+		d_inode(de)->i_size = amdgpu_ras_debugfs_table_size(control);
+}
+
+static ssize_t amdgpu_ras_debugfs_table_read(struct file *f, char __user *buf,
+					     size_t size, loff_t *pos)
+{
+	struct amdgpu_device *adev = (struct amdgpu_device *)file_inode(f)->i_private;
+	struct amdgpu_ras *ras = amdgpu_ras_get_context(adev);
+	struct amdgpu_ras_eeprom_control *control = &ras->eeprom_control;
+	const size_t orig_size = size;
+	int res = -EINVAL;
+	size_t data_len;
+
+	mutex_lock(&control->ras_tbl_mutex);
+
+	/* We want data_len - *pos > 0, which means there are
+	 * bytes to be printed from data.
+	 */
+	data_len = strlen(tbl_hdr_str);
+	if (*pos < data_len) {
+		data_len -= *pos;
+		data_len = min_t(size_t, data_len, size);
+		if (copy_to_user(buf, &tbl_hdr_str[*pos], data_len))
+			goto Out;
+		buf += data_len;
+		size -= data_len;
+		*pos += data_len;
+	}
+
+	data_len = strlen(tbl_hdr_str) + tbl_hdr_fmt_size;
+	if (*pos < data_len && size > 0) {
+		u8 data[tbl_hdr_fmt_size + 1];
+		loff_t lpos;
+
+		snprintf(data, sizeof(data), tbl_hdr_fmt,
+			 control->tbl_hdr.header,
+			 control->tbl_hdr.version,
+			 control->tbl_hdr.first_rec_offset,
+			 control->tbl_hdr.tbl_size,
+			 control->tbl_hdr.checksum);
+
+		data_len -= *pos;
+		data_len = min_t(size_t, data_len, size);
+		lpos = *pos - strlen(tbl_hdr_str);
+		if (copy_to_user(buf, &data[lpos], data_len))
+			goto Out;
+		buf += data_len;
+		size -= data_len;
+		*pos += data_len;
+	}
+
+	data_len = strlen(tbl_hdr_str) + tbl_hdr_fmt_size + strlen(rec_hdr_str);
+	if (*pos < data_len && size > 0) {
+		loff_t lpos;
+
+		data_len -= *pos;
+		data_len = min_t(size_t, data_len, size);
+		lpos = *pos - strlen(tbl_hdr_str) - tbl_hdr_fmt_size;
+		if (copy_to_user(buf, &rec_hdr_str[lpos], data_len))
+			goto Out;
+		buf += data_len;
+		size -= data_len;
+		*pos += data_len;
+	}
+
+	data_len = amdgpu_ras_debugfs_table_size(control);
+	if (*pos < data_len && size > 0) {
+		u8 dare[RAS_TABLE_RECORD_SIZE];
+		u8 data[rec_hdr_fmt_size + 1];
+		/* Find the starting record index
+		 */
+		int s = (*pos - strlen(tbl_hdr_str) - tbl_hdr_fmt_size -
+			 strlen(rec_hdr_str)) / rec_hdr_fmt_size;
+		int r = (*pos - strlen(tbl_hdr_str) - tbl_hdr_fmt_size -
+			 strlen(rec_hdr_str)) % rec_hdr_fmt_size;
+		struct eeprom_table_record record;
+
+		for ( ; size > 0 && s < control->ras_num_recs; s++) {
+			u32 ai = RAS_RI_TO_AI(control, s);
+			/* Read a single record
+			 */
+			res = __amdgpu_ras_eeprom_read(control, dare, ai, 1);
+			if (res)
+				goto Out;
+			__decode_table_record_from_buf(control, &record, dare);
+			snprintf(data, sizeof(data), rec_hdr_fmt,
+				 s,
+				 RAS_INDEX_TO_OFFSET(control, ai),
+				 record_err_type_str[record.err_type],
+				 record.bank,
+				 record.ts,
+				 record.offset,
+				 record.mem_channel,
+				 record.mcumc_id,
+				 record.retired_page);
+
+			data_len = min_t(size_t, rec_hdr_fmt_size - r, size);
+			if (copy_to_user(buf, &data[r], data_len))
+				return -EINVAL;
+			buf += data_len;
+			size -= data_len;
+			*pos += data_len;
+			r = 0;
+		}
+	}
+	res = 0;
+Out:
+	mutex_unlock(&control->ras_tbl_mutex);
+	return res < 0 ? res : orig_size - size;
+}
+
+static ssize_t
+amdgpu_ras_debugfs_eeprom_table_read(struct file *f, char __user *buf,
+				     size_t size, loff_t *pos)
+{
+	struct amdgpu_device *adev = (struct amdgpu_device *)file_inode(f)->i_private;
+	struct amdgpu_ras *ras = amdgpu_ras_get_context(adev);
+	struct amdgpu_ras_eeprom_control *control = ras ? &ras->eeprom_control : NULL;
+	u8 data[81];
+	int res;
+
+	if (!size)
+		return size;
+
+	if (!ras || !control) {
+		res = snprintf(data, sizeof(data), "Not supported\n");
+		if (*pos >= res)
+			return 0;
+
+		res -= *pos;
+		res = min_t(size_t, res, size);
+
+		if (copy_to_user(buf, &data[*pos], res))
+			return -EINVAL;
+
+		*pos += res;
+
+		return res;
+	} else {
+		return amdgpu_ras_debugfs_table_read(f, buf, size, pos);
+	}
+}
+
+const struct file_operations amdgpu_ras_debugfs_eeprom_table_ops = {
+	.owner = THIS_MODULE,
+	.read = amdgpu_ras_debugfs_eeprom_table_read,
+	.write = NULL,
+	.llseek = default_llseek,
+};
+
 /**
  * __verify_ras_table_checksum -- verify the RAS EEPROM table checksum
  * @control: pointer to control structure
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h
index edb0195ea2eb8c..430e08ab3313a2 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h
@@ -29,9 +29,10 @@
 struct amdgpu_device;
 
 enum amdgpu_ras_eeprom_err_type {
-	AMDGPU_RAS_EEPROM_ERR_PLACE_HOLDER,
+	AMDGPU_RAS_EEPROM_ERR_NA,
 	AMDGPU_RAS_EEPROM_ERR_RECOVERABLE,
-	AMDGPU_RAS_EEPROM_ERR_NON_RECOVERABLE
+	AMDGPU_RAS_EEPROM_ERR_NON_RECOVERABLE,
+	AMDGPU_RAS_EEPROM_ERR_COUNT,
 };
 
 struct amdgpu_ras_eeprom_table_header {
@@ -121,4 +122,9 @@ int amdgpu_ras_eeprom_append(struct amdgpu_ras_eeprom_control *control,
 
 inline uint32_t amdgpu_ras_eeprom_max_record_count(void);
 
+void amdgpu_ras_debugfs_set_ret_size(struct amdgpu_ras_eeprom_control *control);
+
+extern const struct file_operations amdgpu_ras_debugfs_eeprom_size_ops;
+extern const struct file_operations amdgpu_ras_debugfs_eeprom_table_ops;
+
 #endif // _AMDGPU_RAS_EEPROM_H
-- 
2.32.0
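
Both read handlers in the patch above repeat the same pattern: clamp
the request to what is left past *pos, copy, then advance *pos. A
minimal userspace sketch of that pattern follows, with memcpy()
standing in for copy_to_user(). The name read_from_buffer() is
hypothetical; the kernel's simple_read_from_buffer() helper provides
comparable behavior for user-space destinations.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy up to @size bytes of @src (length @len), starting at *@pos,
 * into @dst. Returns the number of bytes copied, or 0 once *@pos has
 * reached the end of the data. Advances *@pos by the amount copied.
 */
static size_t read_from_buffer(char *dst, size_t size, size_t *pos,
			       const char *src, size_t len)
{
	size_t n;

	assert(dst && pos && src);
	if (*pos >= len)
		return 0;
	n = len - *pos;
	if (n > size)
		n = size;
	memcpy(dst, src + *pos, n);
	*pos += n;
	return n;
}
```

A reader with a small buffer simply calls this repeatedly until it
returns 0, which is how tools such as cat consume these files.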

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH 39/40] drm/amdgpu: Fix koops when accessing RAS EEPROM
  2021-06-08 21:39 [PATCH 00/40] I2C fixes Luben Tuikov
                   ` (37 preceding siblings ...)
  2021-06-08 21:39 ` [PATCH 38/40] drm/amdgpu: RAS EEPROM table is now in debugfs Luben Tuikov
@ 2021-06-08 21:39 ` Luben Tuikov
  2021-06-10 21:23   ` Alex Deucher
  2021-06-08 21:39 ` [PATCH 40/40] drm/amdgpu: Use a single loop Luben Tuikov
  39 siblings, 1 reply; 74+ messages in thread
From: Luben Tuikov @ 2021-06-08 21:39 UTC (permalink / raw)
  To: amd-gfx; +Cc: Alexander Deucher, Luben Tuikov, John Clements, Hawking Zhang

Debugfs RAS EEPROM files are available when
the ASIC supports RAS and debugfs is enabled,
even when the "ras_enable" module parameter
is set to 0. However, in this case,
we get a kernel oops when accessing some of
the "ras_..." controls in debugfs. The reason
for this is that struct amdgpu_ras::adev is
unset. This commit sets it, thus enabling access
to those facilities. Note that this facilitates
EEPROM access and not necessarily RAS features or
functionality.

Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: John Clements <john.clements@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
index d791a360a92366..772d87701ad4a8 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
@@ -1947,11 +1947,20 @@ int amdgpu_ras_recovery_init(struct amdgpu_device *adev)
 	bool exc_err_limit = false;
 	int ret;
 
-	if (adev->ras_enabled && con)
-		data = &con->eh_data;
-	else
+	if (!con)
+		return 0;
+
+	/* Allow access to RAS EEPROM via debugfs, when the ASIC
+	 * supports RAS and debugfs is enabled, but when
+	 * adev->ras_enabled is unset, i.e. when "ras_enable"
+	 * module parameter is set to 0.
+	 */
+	con->adev = adev;
+
+	if (!adev->ras_enabled)
 		return 0;
 
+	data = &con->eh_data;
 	*data = kmalloc(sizeof(**data), GFP_KERNEL | __GFP_ZERO);
 	if (!*data) {
 		ret = -ENOMEM;
@@ -1961,7 +1970,6 @@ int amdgpu_ras_recovery_init(struct amdgpu_device *adev)
 	mutex_init(&con->recovery_lock);
 	INIT_WORK(&con->recovery_work, amdgpu_ras_do_recovery);
 	atomic_set(&con->in_recovery, 0);
-	con->adev = adev;
 
 	max_eeprom_records_count = amdgpu_ras_eeprom_max_record_count();
 	amdgpu_ras_validate_threshold(adev, max_eeprom_records_count);
-- 
2.32.0

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH 40/40] drm/amdgpu: Use a single loop
  2021-06-08 21:39 [PATCH 00/40] I2C fixes Luben Tuikov
                   ` (38 preceding siblings ...)
  2021-06-08 21:39 ` [PATCH 39/40] drm/amdgpu: Fix koops when accessing RAS EEPROM Luben Tuikov
@ 2021-06-08 21:39 ` Luben Tuikov
  2021-06-10 21:25   ` Alex Deucher
  39 siblings, 1 reply; 74+ messages in thread
From: Luben Tuikov @ 2021-06-08 21:39 UTC (permalink / raw)
  To: amd-gfx; +Cc: Alexander Deucher, Andrey Grodzovsky, Luben Tuikov

In smu_v11_0_i2c_transmit() use a single loop to
transmit bytes, instead of two nested loops.
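
The shape of the single loop can be sketched in userspace as follows.
This is only an illustration under stated assumptions: struct
fake_fifo and fifo_not_full() are hypothetical stand-ins for the
status register poll, and a poll budget replaces the jiffies-based
timeout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the hardware TX FIFO. */
struct fake_fifo {
	uint8_t  sent[16];
	size_t   count;     /* bytes accepted so far */
	unsigned polls;     /* status polls so far */
	unsigned busy_mask; /* "full" when (polls & busy_mask) != 0 */
};

/* Poll the fake status; each call counts as one poll. */
static bool fifo_not_full(struct fake_fifo *f)
{
	return (++f->polls & f->busy_mask) == 0;
}

/* Returns 0 on success, -1 when the poll budget runs out while the
 * FIFO stays full.
 */
static int transmit(struct fake_fifo *f, const uint8_t *data,
		    size_t numbytes, unsigned max_polls)
{
	size_t bytes_sent = 0;

	assert(f->count + numbytes <= sizeof(f->sent));
	while (numbytes > 0) {
		if (!fifo_not_full(f)) {
			/* FIFO full: keep polling until the budget ends. */
			if (f->polls > max_polls)
				return -1;
		} else {
			/* FIFO has room: queue exactly one byte. */
			f->sent[f->count++] = data[bytes_sent];
			bytes_sent++;
			numbytes--;
		}
	}
	return 0;
}
```

Each iteration either queues one byte or checks the timeout, so the
status is re-read before every byte and the timeout is only
consulted while the FIFO is full.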

Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/smu_v11_0_i2c.c | 72 ++++++++++------------
 1 file changed, 34 insertions(+), 38 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/smu_v11_0_i2c.c b/drivers/gpu/drm/amd/amdgpu/smu_v11_0_i2c.c
index 7f48ee020bc03e..751ea2517c4380 100644
--- a/drivers/gpu/drm/amd/amdgpu/smu_v11_0_i2c.c
+++ b/drivers/gpu/drm/amd/amdgpu/smu_v11_0_i2c.c
@@ -243,49 +243,45 @@ static uint32_t smu_v11_0_i2c_transmit(struct i2c_adapter *control,
 	/* Clear status bits */
 	smu_v11_0_i2c_clear_status(control);
 
-
 	timeout_counter = jiffies + msecs_to_jiffies(20);
 
 	while (numbytes > 0) {
 		reg = RREG32_SOC15(SMUIO, 0, mmCKSVII2C_IC_STATUS);
-		if (REG_GET_FIELD(reg, CKSVII2C_IC_STATUS, TFNF)) {
-			do {
-				reg = REG_SET_FIELD(reg, CKSVII2C_IC_DATA_CMD, DAT, data[bytes_sent]);
-
-				/* Final message, final byte, must
-				 * generate a STOP, to release the
-				 * bus, i.e. don't hold SCL low.
-				 */
-				if (numbytes == 1 && i2c_flag & I2C_M_STOP)
-					reg = REG_SET_FIELD(reg,
-							    CKSVII2C_IC_DATA_CMD,
-							    STOP, 1);
-
-				if (bytes_sent == 0 && i2c_flag & I2C_X_RESTART)
-					reg = REG_SET_FIELD(reg,
-							    CKSVII2C_IC_DATA_CMD,
-							    RESTART, 1);
-
-				/* Write */
-				reg = REG_SET_FIELD(reg, CKSVII2C_IC_DATA_CMD, CMD, 0);
-				WREG32_SOC15(SMUIO, 0, mmCKSVII2C_IC_DATA_CMD, reg);
-
-				/* Record that the bytes were transmitted */
-				bytes_sent++;
-				numbytes--;
-
-				reg = RREG32_SOC15(SMUIO, 0, mmCKSVII2C_IC_STATUS);
-
-			} while (numbytes &&  REG_GET_FIELD(reg, CKSVII2C_IC_STATUS, TFNF));
-		}
+		if (!REG_GET_FIELD(reg, CKSVII2C_IC_STATUS, TFNF)) {
+			/*
+			 * We waited for too long for the transmission
+			 * FIFO to become not-full.  Exit the loop
+			 * with error.
+			 */
+			if (time_after(jiffies, timeout_counter)) {
+				ret |= I2C_SW_TIMEOUT;
+				goto Err;
+			}
+		} else {
+			reg = REG_SET_FIELD(reg, CKSVII2C_IC_DATA_CMD, DAT,
+					    data[bytes_sent]);
 
-		/*
-		 * We waited too long for the transmission FIFO to become not-full.
-		 * Exit the loop with error.
-		 */
-		if (time_after(jiffies, timeout_counter)) {
-			ret |= I2C_SW_TIMEOUT;
-			goto Err;
+			/* Final message, final byte, must generate a
+			 * STOP to release the bus, i.e. don't hold
+			 * SCL low.
+			 */
+			if (numbytes == 1 && i2c_flag & I2C_M_STOP)
+				reg = REG_SET_FIELD(reg,
+						    CKSVII2C_IC_DATA_CMD,
+						    STOP, 1);
+
+			if (bytes_sent == 0 && i2c_flag & I2C_X_RESTART)
+				reg = REG_SET_FIELD(reg,
+						    CKSVII2C_IC_DATA_CMD,
+						    RESTART, 1);
+
+			/* Write */
+			reg = REG_SET_FIELD(reg, CKSVII2C_IC_DATA_CMD, CMD, 0);
+			WREG32_SOC15(SMUIO, 0, mmCKSVII2C_IC_DATA_CMD, reg);
+
+			/* Record that the bytes were transmitted */
+			bytes_sent++;
+			numbytes--;
 		}
 	}
 
-- 
2.32.0

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* Re: [PATCH 13/40] dmr/amdgpu: Add RESTART handling also to smu_v11_0_i2c (VG20)
  2021-06-08 21:39 ` [PATCH 13/40] dmr/amdgpu: Add RESTART handling also to smu_v11_0_i2c (VG20) Luben Tuikov
@ 2021-06-10 20:18   ` Alex Deucher
  0 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2021-06-10 20:18 UTC (permalink / raw)
  To: Luben Tuikov
  Cc: Andrey Grodzovsky, Lijo Lazar, amd-gfx list, Stanley Yang,
	Alexander Deucher, Jean Delvare, Hawking Zhang

On Tue, Jun 8, 2021 at 5:40 PM Luben Tuikov <luben.tuikov@amd.com> wrote:
>
> From: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>
> Also generalize the code to accept and translate to
> HW bits any relevant I2C flags, both for read and write.
>
> Cc: Jean Delvare <jdelvare@suse.de>
> Cc: Alexander Deucher <Alexander.Deucher@amd.com>
> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
> Cc: Lijo Lazar <Lijo.Lazar@amd.com>
> Cc: Stanley Yang <Stanley.Yang@amd.com>
> Cc: Hawking Zhang <Hawking.Zhang@amd.com>
> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>

Acked-by: Alex Deucher <alexander.deucher@amd.com>

> ---
>  drivers/gpu/drm/amd/amdgpu/smu_v11_0_i2c.c | 22 ++++++++++++----------
>  1 file changed, 12 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/smu_v11_0_i2c.c b/drivers/gpu/drm/amd/amdgpu/smu_v11_0_i2c.c
> index 3193d566f4f87e..5a90d9351b22eb 100644
> --- a/drivers/gpu/drm/amd/amdgpu/smu_v11_0_i2c.c
> +++ b/drivers/gpu/drm/amd/amdgpu/smu_v11_0_i2c.c
> @@ -530,13 +530,11 @@ static bool smu_v11_0_i2c_bus_unlock(struct i2c_adapter *control)
>  /***************************** I2C GLUE ****************************/
>
>  static uint32_t smu_v11_0_i2c_read_data(struct i2c_adapter *control,
> -                                       struct i2c_msg *msg)
> +                                       struct i2c_msg *msg, uint32_t i2c_flag)
>  {
> -       uint32_t  ret = 0;
> +       uint32_t  ret;
>
> -       /* Now read data starting with that address */
> -       ret = smu_v11_0_i2c_receive(control, msg->addr, msg->buf, msg->len,
> -                                   I2C_RESTART);
> +       ret = smu_v11_0_i2c_receive(control, msg->addr, msg->buf, msg->len, i2c_flag);
>
>         if (ret != I2C_OK)
>                 DRM_ERROR("ReadData() - I2C error occurred :%x", ret);
> @@ -545,12 +543,11 @@ static uint32_t smu_v11_0_i2c_read_data(struct i2c_adapter *control,
>  }
>
>  static uint32_t smu_v11_0_i2c_write_data(struct i2c_adapter *control,
> -                                       struct i2c_msg *msg)
> +                                       struct i2c_msg *msg, uint32_t i2c_flag)
>  {
>         uint32_t  ret;
>
> -       /* Send I2C_NO_STOP unless requested to stop. */
> -       ret = smu_v11_0_i2c_transmit(control, msg->addr, msg->buf, msg->len, ((msg->flags & I2C_M_STOP) ? 0 : I2C_NO_STOP));
> +       ret = smu_v11_0_i2c_transmit(control, msg->addr, msg->buf, msg->len, i2c_flag);
>
>         if (ret != I2C_OK)
>                 DRM_ERROR("WriteI2CData() - I2C error occurred :%x", ret);
> @@ -601,12 +598,17 @@ static int smu_v11_0_i2c_xfer(struct i2c_adapter *i2c_adap,
>         smu_v11_0_i2c_init(i2c_adap);
>
>         for (i = 0; i < num; i++) {
> +               uint32_t i2c_flag = ((msgs[i].flags & I2C_M_NOSTART) ? 0 : I2C_RESTART) ||
> +                                   (((msgs[i].flags & I2C_M_STOP) ? 0 : I2C_NO_STOP));
> +
>                 if (msgs[i].flags & I2C_M_RD)
>                         ret = smu_v11_0_i2c_read_data(i2c_adap,
> -                                                     msgs + i);
> +                                                     msgs + i,
> +                                                     i2c_flag);
>                 else
>                         ret = smu_v11_0_i2c_write_data(i2c_adap,
> -                                                      msgs + i);
> +                                                      msgs + i,
> +                                                      i2c_flag);
>
>                 if (ret != I2C_OK) {
>                         num = -EIO;
> --
> 2.32.0
>
> _______________________________________________
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 14/40] drm/amdgpu: Drop i > 0 restriction for issuing RESTART
  2021-06-08 21:39 ` [PATCH 14/40] drm/amdgpu: Drop i > 0 restriction for issuing RESTART Luben Tuikov
@ 2021-06-10 20:21   ` Alex Deucher
  0 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2021-06-10 20:21 UTC (permalink / raw)
  To: Luben Tuikov
  Cc: Andrey Grodzovsky, Lijo Lazar, amd-gfx list, Stanley Yang,
	Alexander Deucher, Jean Delvare, Hawking Zhang

On Tue, Jun 8, 2021 at 5:40 PM Luben Tuikov <luben.tuikov@amd.com> wrote:
>
> From: Andrey Grodzovsky <andrey.grodzovsky@amd.com>

Needs a commit message.  With that fixed:
Acked-by: Alex Deucher <alexander.deucher@amd.com>

>
> Cc: Jean Delvare <jdelvare@suse.de>
> Cc: Alexander Deucher <Alexander.Deucher@amd.com>
> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
> Cc: Lijo Lazar <Lijo.Lazar@amd.com>
> Cc: Stanley Yang <Stanley.Yang@amd.com>
> Cc: Hawking Zhang <Hawking.Zhang@amd.com>
> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
> ---
>  drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c       | 2 +-
>  drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c         | 2 +-
>  drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c | 2 +-
>  3 files changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c
> index 73e261260b76e6..72b02025b07e06 100644
> --- a/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c
> +++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c
> @@ -1954,7 +1954,7 @@ static int arcturus_i2c_xfer(struct i2c_adapter *i2c_adap,
>                         if ((msg[i].flags & I2C_M_STOP) ||
>                             (!remaining_bytes))
>                                 cmd->CmdConfig |= CMDCONFIG_STOP_MASK;
> -                       if ((i > 0) && (j == 0) && !(msg[i].flags & I2C_M_NOSTART))
> +                       if ((j == 0) && !(msg[i].flags & I2C_M_NOSTART))
>                                 cmd->CmdConfig |= CMDCONFIG_RESTART_BIT;
>                 }
>         }
> diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
> index 5dc48e557c2bad..289d09a5d711b9 100644
> --- a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
> +++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
> @@ -2749,7 +2749,7 @@ static int navi10_i2c_xfer(struct i2c_adapter *i2c_adap,
>                         if ((msg[i].flags & I2C_M_STOP) ||
>                             (!remaining_bytes))
>                                 cmd->CmdConfig |= CMDCONFIG_STOP_MASK;
> -                       if ((i > 0) && (j == 0) && !(msg[i].flags & I2C_M_NOSTART))
> +                       if ((j == 0) && !(msg[i].flags & I2C_M_NOSTART))
>                                 cmd->CmdConfig |= CMDCONFIG_RESTART_BIT;
>                 }
>         }
> diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
> index fdbc54622dbfbf..e8e57462ce9d64 100644
> --- a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
> +++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
> @@ -3437,7 +3437,7 @@ static int sienna_cichlid_i2c_xfer(struct i2c_adapter *i2c_adap,
>                         if ((msg[i].flags & I2C_M_STOP) ||
>                             (!remaining_bytes))
>                                 cmd->CmdConfig |= CMDCONFIG_STOP_MASK;
> -                       if ((i > 0) && (j == 0) && !(msg[i].flags & I2C_M_NOSTART))
> +                       if ((j == 0) && !(msg[i].flags & I2C_M_NOSTART))
>                                 cmd->CmdConfig |= CMDCONFIG_RESTART_BIT;
>                 }
>         }
> --
> 2.32.0
>

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 15/40] drm/amdgpu: Send STOP for the last byte of msg only
  2021-06-08 21:39 ` [PATCH 15/40] drm/amdgpu: Send STOP for the last byte of msg only Luben Tuikov
@ 2021-06-10 20:22   ` Alex Deucher
  0 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2021-06-10 20:22 UTC (permalink / raw)
  To: Luben Tuikov
  Cc: Andrey Grodzovsky, Lijo Lazar, amd-gfx list, Stanley Yang,
	Alexander Deucher, Jean Delvare, Hawking Zhang

On Tue, Jun 8, 2021 at 5:40 PM Luben Tuikov <luben.tuikov@amd.com> wrote:
>
> From: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>
> Ignore the I2C_M_STOP hint from the upper layer in
> the SMU I2C code, as there is no clean mapping
> between the kernel I2C layer's single per-message
> STOP flag and the SMU's per-byte STOP flag.
> Instead, set STOP by default at the end of the
> SMU I2C message.
>
> Cc: Jean Delvare <jdelvare@suse.de>
> Cc: Alexander Deucher <Alexander.Deucher@amd.com>
> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
> Cc: Lijo Lazar <Lijo.Lazar@amd.com>
> Cc: Stanley Yang <Stanley.Yang@amd.com>
> Cc: Hawking Zhang <Hawking.Zhang@amd.com>
> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> Suggested-by: Lazar Lijo <Lijo.Lazar@amd.com>
> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>

Acked-by: Alex Deucher <alexander.deucher@amd.com>
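
A minimal userspace sketch of the behaviour described above, assuming
the per-byte command configuration can be modelled as a plain byte
array: STOP is set on the last command byte only, and any per-message
hint is ignored. SKETCH_STOP_MASK and the helper name are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_STOP_MASK 0x02u /* hypothetical bit value */

/* Clear cfg[0..total-1], then set STOP only on the final command byte. */
static void sketch_mark_stop(uint8_t *cfg, size_t total)
{
	size_t k;

	assert(cfg != NULL || total == 0);

	for (k = 0; k < total; k++)
		cfg[k] = 0;

	if (total > 0)
		cfg[total - 1] |= SKETCH_STOP_MASK;
}
```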

> ---
>  drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c       | 4 ++--
>  drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c         | 4 ++--
>  drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c | 4 ++--
>  3 files changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c
> index 72b02025b07e06..235e83e9f0feb7 100644
> --- a/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c
> +++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c
> @@ -1951,9 +1951,9 @@ static int arcturus_i2c_xfer(struct i2c_adapter *i2c_adap,
>                                 cmd->CmdConfig |= I2C_CMD_WRITE;
>                                 cmd->RegisterAddr = msg->buf[j];
>                         }
> -                       if ((msg[i].flags & I2C_M_STOP) ||
> -                           (!remaining_bytes))
> +                       if (!remaining_bytes)
>                                 cmd->CmdConfig |= CMDCONFIG_STOP_MASK;
> +
>                         if ((j == 0) && !(msg[i].flags & I2C_M_NOSTART))
>                                 cmd->CmdConfig |= CMDCONFIG_RESTART_BIT;
>                 }
> diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
> index 289d09a5d711b9..b94c5a1d3eb756 100644
> --- a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
> +++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
> @@ -2746,9 +2746,9 @@ static int navi10_i2c_xfer(struct i2c_adapter *i2c_adap,
>                                 cmd->CmdConfig |= I2C_CMD_WRITE;
>                                 cmd->RegisterAddr = msg->buf[j];
>                         }
> -                       if ((msg[i].flags & I2C_M_STOP) ||
> -                           (!remaining_bytes))
> +                       if (!remaining_bytes)
>                                 cmd->CmdConfig |= CMDCONFIG_STOP_MASK;
> +
>                         if ((j == 0) && !(msg[i].flags & I2C_M_NOSTART))
>                                 cmd->CmdConfig |= CMDCONFIG_RESTART_BIT;
>                 }
> diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
> index e8e57462ce9d64..2fa667a86c1a54 100644
> --- a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
> +++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
> @@ -3434,9 +3434,9 @@ static int sienna_cichlid_i2c_xfer(struct i2c_adapter *i2c_adap,
>                                 cmd->CmdConfig |= CMDCONFIG_READWRITE_MASK;
>                                 cmd->ReadWriteData = msg->buf[j];
>                         }
> -                       if ((msg[i].flags & I2C_M_STOP) ||
> -                           (!remaining_bytes))
> +                       if (!remaining_bytes)
>                                 cmd->CmdConfig |= CMDCONFIG_STOP_MASK;
> +
>                         if ((j == 0) && !(msg[i].flags & I2C_M_NOSTART))
>                                 cmd->CmdConfig |= CMDCONFIG_RESTART_BIT;
>                 }
> --
> 2.32.0
>

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 16/40] drm/amd/pm: SMU I2C: Return number of messages processed
  2021-06-08 21:39 ` [PATCH 16/40] drm/amd/pm: SMU I2C: Return number of messages processed Luben Tuikov
@ 2021-06-10 20:25   ` Alex Deucher
  0 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2021-06-10 20:25 UTC (permalink / raw)
  To: Luben Tuikov
  Cc: Andrey Grodzovsky, Lijo Lazar, amd-gfx list, Stanley Yang,
	Alexander Deucher, Jean Delvare, Hawking Zhang

On Tue, Jun 8, 2021 at 5:40 PM Luben Tuikov <luben.tuikov@amd.com> wrote:
>
> From: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>
> Return the number of processed I2C messages
> instead of the number of processed bytes.
>
> Cc: Jean Delvare <jdelvare@suse.de>
> Cc: Alexander Deucher <Alexander.Deucher@amd.com>
> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
> Cc: Lijo Lazar <Lijo.Lazar@amd.com>
> Cc: Stanley Yang <Stanley.Yang@amd.com>
> Cc: Hawking Zhang <Hawking.Zhang@amd.com>
> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
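
The counting step can be sketched in userspace roughly as follows,
assuming the message lengths are given as an array and the command
budget is passed in rather than taken from MAX_SW_I2C_COMMANDS. The
function and parameter names are hypothetical. Only whole messages are
counted; the first message that does not fit ends the scan.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Count how many leading messages fit entirely within `budget` bytes.
 * `lens` holds each message length; `*used` receives the bytes consumed.
 */
static int sketch_count_msgs(const unsigned short *lens, int num,
			     int budget, int *used)
{
	int left = budget;
	int done = 0;
	int i;

	assert(num >= 0 && budget >= 0 && used != NULL);

	for (i = 0; i < num; i++) {
		if (lens[i] > left)
			break;	/* this message and all following are skipped */
		left -= lens[i];
		done++;
	}

	*used = budget - left;
	return done;
}
```

In the patch the return value plays the role of num_done and the bytes
consumed become req->NumCmds. Testing before subtracting, as done here,
avoids the add-back step but is otherwise the same idea.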

> ---
>  .../gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c | 43 +++++++++++--------
>  .../gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c   | 43 +++++++++++--------
>  .../amd/pm/swsmu/smu11/sienna_cichlid_ppt.c   | 43 +++++++++++--------
>  3 files changed, 75 insertions(+), 54 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c
> index 235e83e9f0feb7..409299a608e1b3 100644
> --- a/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c
> +++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c
> @@ -1913,9 +1913,8 @@ static int arcturus_i2c_xfer(struct i2c_adapter *i2c_adap,
>         struct smu_table_context *smu_table = &adev->smu.smu_table;
>         struct smu_table *table = &smu_table->driver_table;
>         SwI2cRequest_t *req, *res = (SwI2cRequest_t *)table->cpu_addr;
> -       u16 bytes_to_transfer, remaining_bytes, msg_bytes;
> -       u16 available_bytes = MAX_SW_I2C_COMMANDS;
> -       int i, j, r, c;
> +       short available_bytes = MAX_SW_I2C_COMMANDS;
> +       int i, j, r, c, num_done = 0;
>         u8 slave;
>
>         /* only support a single slave addr per transaction */
> @@ -1923,8 +1922,15 @@ static int arcturus_i2c_xfer(struct i2c_adapter *i2c_adap,
>         for (i = 0; i < num; i++) {
>                 if (slave != msgs[i].addr)
>                         return -EINVAL;
> -               bytes_to_transfer += min(msgs[i].len, available_bytes);
> -               available_bytes -= bytes_to_transfer;
> +
> +               available_bytes -= msgs[i].len;
> +               if (available_bytes >= 0) {
> +                       num_done++;
> +               } else {
> +                       /* This message and all the following won't be processed */
> +                       available_bytes += msgs[i].len;
> +                       break;
> +               }
>         }
>
>         req = kzalloc(sizeof(*req), GFP_KERNEL);
> @@ -1934,24 +1940,28 @@ static int arcturus_i2c_xfer(struct i2c_adapter *i2c_adap,
>         req->I2CcontrollerPort = 1;
>         req->I2CSpeed = I2C_SPEED_FAST_400K;
>         req->SlaveAddress = slave << 1; /* 8 bit addresses */
> -       req->NumCmds = bytes_to_transfer;
> +       req->NumCmds = MAX_SW_I2C_COMMANDS - available_bytes;
>
> -       remaining_bytes = bytes_to_transfer;
>         c = 0;
> -       for (i = 0; i < num; i++) {
> +       for (i = 0; i < num_done; i++) {
>                 struct i2c_msg *msg = &msgs[i];
>
> -               msg_bytes = min(msg->len, remaining_bytes);
> -               for (j = 0; j < msg_bytes; j++) {
> +               for (j = 0; j < msg->len; j++) {
>                         SwI2cCmd_t *cmd = &req->SwI2cCmds[c++];
>
> -                       remaining_bytes--;
>                         if (!(msg[i].flags & I2C_M_RD)) {
>                                 /* write */
>                                 cmd->CmdConfig |= I2C_CMD_WRITE;
>                                 cmd->RegisterAddr = msg->buf[j];
>                         }
> -                       if (!remaining_bytes)
> +
> +                       /*
> +                        * Insert STOP if we are at the last byte of either last
> +                        * message for the transaction or the client explicitly
> +                        * requires a STOP at this particular message.
> +                        */
> +                       if ((j == msg->len - 1) &&
> +                           ((i == num_done - 1) || (msg[i].flags & I2C_M_STOP)))
>                                 cmd->CmdConfig |= CMDCONFIG_STOP_MASK;
>
>                         if ((j == 0) && !(msg[i].flags & I2C_M_NOSTART))
> @@ -1964,21 +1974,18 @@ static int arcturus_i2c_xfer(struct i2c_adapter *i2c_adap,
>         if (r)
>                 goto fail;
>
> -       remaining_bytes = bytes_to_transfer;
>         c = 0;
> -       for (i = 0; i < num; i++) {
> +       for (i = 0; i < num_done; i++) {
>                 struct i2c_msg *msg = &msgs[i];
>
> -               msg_bytes = min(msg->len, remaining_bytes);
> -               for (j = 0; j < msg_bytes; j++) {
> +               for (j = 0; j < msg->len; j++) {
>                         SwI2cCmd_t *cmd = &res->SwI2cCmds[c++];
>
> -                       remaining_bytes--;
>                         if (msg[i].flags & I2C_M_RD)
>                                 msg->buf[j] = cmd->Data;
>                 }
>         }
> -       r = bytes_to_transfer;
> +       r = num_done;
>
>  fail:
>         kfree(req);
> diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
> index b94c5a1d3eb756..4010b891f25678 100644
> --- a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
> +++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
> @@ -2708,9 +2708,8 @@ static int navi10_i2c_xfer(struct i2c_adapter *i2c_adap,
>         struct smu_table_context *smu_table = &adev->smu.smu_table;
>         struct smu_table *table = &smu_table->driver_table;
>         SwI2cRequest_t *req, *res = (SwI2cRequest_t *)table->cpu_addr;
> -       u16 bytes_to_transfer, remaining_bytes, msg_bytes;
> -       u16 available_bytes = MAX_SW_I2C_COMMANDS;
> -       int i, j, r, c;
> +       short available_bytes = MAX_SW_I2C_COMMANDS;
> +       int i, j, r, c, num_done = 0;
>         u8 slave;
>
>         /* only support a single slave addr per transaction */
> @@ -2718,8 +2717,15 @@ static int navi10_i2c_xfer(struct i2c_adapter *i2c_adap,
>         for (i = 0; i < num; i++) {
>                 if (slave != msgs[i].addr)
>                         return -EINVAL;
> -               bytes_to_transfer += min(msgs[i].len, available_bytes);
> -               available_bytes -= bytes_to_transfer;
> +
> +               available_bytes -= msgs[i].len;
> +               if (available_bytes >= 0) {
> +                       num_done++;
> +               } else {
> +                       /* This message and all the following won't be processed */
> +                       available_bytes += msgs[i].len;
> +                       break;
> +               }
>         }
>
>         req = kzalloc(sizeof(*req), GFP_KERNEL);
> @@ -2729,24 +2735,28 @@ static int navi10_i2c_xfer(struct i2c_adapter *i2c_adap,
>         req->I2CcontrollerPort = 1;
>         req->I2CSpeed = I2C_SPEED_FAST_400K;
>         req->SlaveAddress = slave << 1; /* 8 bit addresses */
> -       req->NumCmds = bytes_to_transfer;
> +       req->NumCmds = MAX_SW_I2C_COMMANDS - available_bytes;
>
> -       remaining_bytes = bytes_to_transfer;
>         c = 0;
> -       for (i = 0; i < num; i++) {
> +       for (i = 0; i < num_done; i++) {
>                 struct i2c_msg *msg = &msgs[i];
>
> -               msg_bytes = min(msg->len, remaining_bytes);
> -               for (j = 0; j < msg_bytes; j++) {
> +               for (j = 0; j < msg->len; j++) {
>                         SwI2cCmd_t *cmd = &req->SwI2cCmds[c++];
>
> -                       remaining_bytes--;
>                         if (!(msg[i].flags & I2C_M_RD)) {
>                                 /* write */
>                                 cmd->CmdConfig |= I2C_CMD_WRITE;
>                                 cmd->RegisterAddr = msg->buf[j];
>                         }
> -                       if (!remaining_bytes)
> +
> +                       /*
> +                        * Insert STOP if we are at the last byte of either last
> +                        * message for the transaction or the client explicitly
> +                        * requires a STOP at this particular message.
> +                        */
> +                       if ((j == msg->len - 1) &&
> +                           ((i == num_done - 1) || (msg[i].flags & I2C_M_STOP)))
>                                 cmd->CmdConfig |= CMDCONFIG_STOP_MASK;
>
>                         if ((j == 0) && !(msg[i].flags & I2C_M_NOSTART))
> @@ -2759,21 +2769,18 @@ static int navi10_i2c_xfer(struct i2c_adapter *i2c_adap,
>         if (r)
>                 goto fail;
>
> -       remaining_bytes = bytes_to_transfer;
>         c = 0;
> -       for (i = 0; i < num; i++) {
> +       for (i = 0; i < num_done; i++) {
>                 struct i2c_msg *msg = &msgs[i];
>
> -               msg_bytes = min(msg->len, remaining_bytes);
> -               for (j = 0; j < msg_bytes; j++) {
> +               for (j = 0; j < msg->len; j++) {
>                         SwI2cCmd_t *cmd = &res->SwI2cCmds[c++];
>
> -                       remaining_bytes--;
>                         if (msg[i].flags & I2C_M_RD)
>                                 msg->buf[j] = cmd->Data;
>                 }
>         }
> -       r = bytes_to_transfer;
> +       r = num_done;
>
>  fail:
>         kfree(req);
> diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
> index 2fa667a86c1a54..d5b750d84112fa 100644
> --- a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
> +++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
> @@ -3396,9 +3396,8 @@ static int sienna_cichlid_i2c_xfer(struct i2c_adapter *i2c_adap,
>         struct smu_table_context *smu_table = &adev->smu.smu_table;
>         struct smu_table *table = &smu_table->driver_table;
>         SwI2cRequest_t *req, *res = (SwI2cRequest_t *)table->cpu_addr;
> -       u16 bytes_to_transfer, remaining_bytes, msg_bytes;
> -       u16 available_bytes = MAX_SW_I2C_COMMANDS;
> -       int i, j, r, c;
> +       short available_bytes = MAX_SW_I2C_COMMANDS;
> +       int i, j, r, c, num_done = 0;
>         u8 slave;
>
>         /* only support a single slave addr per transaction */
> @@ -3406,8 +3405,15 @@ static int sienna_cichlid_i2c_xfer(struct i2c_adapter *i2c_adap,
>         for (i = 0; i < num; i++) {
>                 if (slave != msgs[i].addr)
>                         return -EINVAL;
> -               bytes_to_transfer += min(msgs[i].len, available_bytes);
> -               available_bytes -= bytes_to_transfer;
> +
> +               available_bytes -= msgs[i].len;
> +               if (available_bytes >= 0) {
> +                       num_done++;
> +               } else {
> +                       /* This message and all the following won't be processed */
> +                       available_bytes += msgs[i].len;
> +                       break;
> +               }
>         }
>
>         req = kzalloc(sizeof(*req), GFP_KERNEL);
> @@ -3417,24 +3423,28 @@ static int sienna_cichlid_i2c_xfer(struct i2c_adapter *i2c_adap,
>         req->I2CcontrollerPort = 1;
>         req->I2CSpeed = I2C_SPEED_FAST_400K;
>         req->SlaveAddress = slave << 1; /* 8 bit addresses */
> -       req->NumCmds = bytes_to_transfer;
> +       req->NumCmds = MAX_SW_I2C_COMMANDS - available_bytes;
>
> -       remaining_bytes = bytes_to_transfer;
>         c = 0;
> -       for (i = 0; i < num; i++) {
> +       for (i = 0; i < num_done; i++) {
>                 struct i2c_msg *msg = &msgs[i];
>
> -               msg_bytes = min(msg->len, remaining_bytes);
> -               for (j = 0; j < msg_bytes; j++) {
> +               for (j = 0; j < msg->len; j++) {
>                         SwI2cCmd_t *cmd = &req->SwI2cCmds[c++];
>
> -                       remaining_bytes--;
>                         if (!(msg[i].flags & I2C_M_RD)) {
>                                 /* write */
>                                 cmd->CmdConfig |= CMDCONFIG_READWRITE_MASK;
>                                 cmd->ReadWriteData = msg->buf[j];
>                         }
> -                       if (!remaining_bytes)
> +
> +                       /*
> +                        * Insert STOP if we are at the last byte of either last
> +                        * message for the transaction or the client explicitly
> +                        * requires a STOP at this particular message.
> +                        */
> +                       if ((j == msg->len - 1) &&
> +                           ((i == num_done - 1) || (msg[i].flags & I2C_M_STOP)))
>                                 cmd->CmdConfig |= CMDCONFIG_STOP_MASK;
>
>                         if ((j == 0) && !(msg[i].flags & I2C_M_NOSTART))
> @@ -3447,21 +3457,18 @@ static int sienna_cichlid_i2c_xfer(struct i2c_adapter *i2c_adap,
>         if (r)
>                 goto fail;
>
> -       remaining_bytes = bytes_to_transfer;
>         c = 0;
> -       for (i = 0; i < num; i++) {
> +       for (i = 0; i < num_done; i++) {
>                 struct i2c_msg *msg = &msgs[i];
>
> -               msg_bytes = min(msg->len, remaining_bytes);
> -               for (j = 0; j < msg_bytes; j++) {
> +               for (j = 0; j < msg->len; j++) {
>                         SwI2cCmd_t *cmd = &res->SwI2cCmds[c++];
>
> -                       remaining_bytes--;
>                         if (msg[i].flags & I2C_M_RD)
>                                 msg->buf[j] = cmd->ReadWriteData;
>                 }
>         }
> -       r = bytes_to_transfer;
> +       r = num_done;
>
>  fail:
>         kfree(req);
> --
> 2.32.0
>

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 17/40] drm/amdgpu/pm: ADD I2C quirk adapter table
  2021-06-08 21:39 ` [PATCH 17/40] drm/amdgpu/pm: ADD I2C quirk adapter table Luben Tuikov
@ 2021-06-10 20:26   ` Alex Deucher
  0 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2021-06-10 20:26 UTC (permalink / raw)
  To: Luben Tuikov
  Cc: Andrey Grodzovsky, Lijo Lazar, amd-gfx list, Stanley Yang,
	Alexander Deucher, Jean Delvare, Hawking Zhang

On Tue, Jun 8, 2021 at 5:40 PM Luben Tuikov <luben.tuikov@amd.com> wrote:
>
> From: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>
> To be used by kernel clients of the adapter.
>
> Cc: Jean Delvare <jdelvare@suse.de>
> Cc: Alexander Deucher <Alexander.Deucher@amd.com>
> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
> Cc: Lijo Lazar <Lijo.Lazar@amd.com>
> Cc: Stanley Yang <Stanley.Yang@amd.com>
> Cc: Hawking Zhang <Hawking.Zhang@amd.com>
> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> Suggested-by: Lazar Lijo <Lijo.Lazar@amd.com>
> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
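
As a rough illustration of what a per-message length quirk means for a
client, the sketch below models a length check in userspace. The struct
and function names are hypothetical, and the choice of -EOPNOTSUPP as
the rejection code is an assumption of this sketch rather than
something stated in the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define SKETCH_M_RD 0x0001u /* hypothetical stand-in for the read flag */

struct sketch_quirks {
	unsigned short max_read_len;
	unsigned short max_write_len;
};

struct sketch_msg {
	unsigned short flags;
	unsigned short len;
};

/* Return 0 if the message length is within the limit for its direction. */
static int sketch_check_len(const struct sketch_quirks *q,
			    const struct sketch_msg *m)
{
	unsigned short limit;

	assert(q != NULL && m != NULL);

	limit = (m->flags & SKETCH_M_RD) ? q->max_read_len : q->max_write_len;

	/* A zero limit means "no limit" in this sketch. */
	if (limit && m->len > limit)
		return -EOPNOTSUPP;

	return 0;
}
```

With both limits set to the command budget, an oversized message can be
rejected before it reaches the transfer function.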

> ---
>  drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c       | 7 +++++++
>  drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c         | 6 ++++++
>  drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c | 6 ++++++
>  3 files changed, 19 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c
> index 409299a608e1b3..c2d6d7c8129593 100644
> --- a/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c
> +++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c
> @@ -2004,6 +2004,12 @@ static const struct i2c_algorithm arcturus_i2c_algo = {
>         .functionality = arcturus_i2c_func,
>  };
>
> +
> +static const struct i2c_adapter_quirks arcturus_i2c_control_quirks = {
> +       .max_read_len = MAX_SW_I2C_COMMANDS,
> +       .max_write_len = MAX_SW_I2C_COMMANDS,
> +};
> +
>  static int arcturus_i2c_control_init(struct smu_context *smu, struct i2c_adapter *control)
>  {
>         struct amdgpu_device *adev = to_amdgpu_device(control);
> @@ -2013,6 +2019,7 @@ static int arcturus_i2c_control_init(struct smu_context *smu, struct i2c_adapter
>         control->class = I2C_CLASS_SPD | I2C_CLASS_HWMON;
>         control->dev.parent = &adev->pdev->dev;
>         control->algo = &arcturus_i2c_algo;
> +       control->quirks = &arcturus_i2c_control_quirks;
>         snprintf(control->name, sizeof(control->name), "AMDGPU SMU");
>
>         res = i2c_add_adapter(control);
> diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
> index 4010b891f25678..56000463f64e45 100644
> --- a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
> +++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
> @@ -2799,6 +2799,11 @@ static const struct i2c_algorithm navi10_i2c_algo = {
>         .functionality = navi10_i2c_func,
>  };
>
> +static const struct i2c_adapter_quirks navi10_i2c_control_quirks = {
> +       .max_read_len = MAX_SW_I2C_COMMANDS,
> +       .max_write_len = MAX_SW_I2C_COMMANDS,
> +};
> +
>  static int navi10_i2c_control_init(struct smu_context *smu, struct i2c_adapter *control)
>  {
>         struct amdgpu_device *adev = to_amdgpu_device(control);
> @@ -2809,6 +2814,7 @@ static int navi10_i2c_control_init(struct smu_context *smu, struct i2c_adapter *
>         control->dev.parent = &adev->pdev->dev;
>         control->algo = &navi10_i2c_algo;
>         snprintf(control->name, sizeof(control->name), "AMDGPU SMU");
> +       control->quirks = &navi10_i2c_control_quirks;
>
>         res = i2c_add_adapter(control);
>         if (res)
> diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
> index d5b750d84112fa..86804f3b0a951b 100644
> --- a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
> +++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
> @@ -3487,6 +3487,11 @@ static const struct i2c_algorithm sienna_cichlid_i2c_algo = {
>         .functionality = sienna_cichlid_i2c_func,
>  };
>
> +static const struct i2c_adapter_quirks sienna_cichlid_i2c_control_quirks = {
> +       .max_read_len = MAX_SW_I2C_COMMANDS,
> +       .max_write_len = MAX_SW_I2C_COMMANDS,
> +};
> +
>  static int sienna_cichlid_i2c_control_init(struct smu_context *smu, struct i2c_adapter *control)
>  {
>         struct amdgpu_device *adev = to_amdgpu_device(control);
> @@ -3497,6 +3502,7 @@ static int sienna_cichlid_i2c_control_init(struct smu_context *smu, struct i2c_a
>         control->dev.parent = &adev->pdev->dev;
>         control->algo = &sienna_cichlid_i2c_algo;
>         snprintf(control->name, sizeof(control->name), "AMDGPU SMU");
> +       control->quirks = &sienna_cichlid_i2c_control_quirks;
>
>         res = i2c_add_adapter(control);
>         if (res)
> --
> 2.32.0
>

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 18/40] drm/amdgpu: Fix Vega20 I2C to be agnostic (v2)
  2021-06-08 21:39 ` [PATCH 18/40] drm/amdgpu: Fix Vega20 I2C to be agnostic (v2) Luben Tuikov
@ 2021-06-10 20:43   ` Alex Deucher
  0 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2021-06-10 20:43 UTC (permalink / raw)
  To: Luben Tuikov
  Cc: Andrey Grodzovsky, Lijo Lazar, amd-gfx list, Stanley Yang,
	Alexander Deucher, Jean Delvare, Hawking Zhang

On Tue, Jun 8, 2021 at 5:40 PM Luben Tuikov <luben.tuikov@amd.com> wrote:
>
> Teach Vega20 I2C to be agnostic. Allow addressing
> different devices while the master holds the bus.
> Set STOP as per the controller's specification.
>
> v2: Qualify generating ReSTART before the 1st byte
>     of the message, when set by the caller, as
>     those functions are separated, as caught by
>     Andrey G.
>
> Cc: Jean Delvare <jdelvare@suse.de>
> Cc: Alexander Deucher <Alexander.Deucher@amd.com>
> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
> Cc: Lijo Lazar <Lijo.Lazar@amd.com>
> Cc: Stanley Yang <Stanley.Yang@amd.com>
> Cc: Hawking Zhang <Hawking.Zhang@amd.com>
> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>

Acked-by: Alex Deucher <alexander.deucher@amd.com>
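
A minimal userspace sketch of the per-byte decision the patch
introduces: STOP only on the final byte when the caller set the stop
flag, RESTART only on the first byte when the caller set the private
restart flag. The field positions below are assumptions chosen for
illustration, not the CKSVII2C_IC_DATA_CMD layout, and the SKETCH_*
names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_M_STOP    0x8000u
#define SKETCH_X_RESTART (1u << 31)	/* mirrors BIT(31) in the patch */

/* Hypothetical field positions within the data/command word. */
#define SKETCH_DAT_MASK  0xffu
#define SKETCH_CMD_READ  (1u << 8)
#define SKETCH_STOP      (1u << 9)
#define SKETCH_RESTART   (1u << 10)

/* Build the command word for one byte of a transfer. */
static uint32_t sketch_data_cmd(uint8_t dat, int is_read,
				uint32_t bytes_done, uint32_t bytes_left,
				uint32_t i2c_flag)
{
	uint32_t reg = 0;

	assert(bytes_left >= 1);

	if (is_read)
		reg |= SKETCH_CMD_READ;
	else
		reg |= dat & SKETCH_DAT_MASK;

	/* Final byte: release the bus only if the caller asked for STOP. */
	if (bytes_left == 1 && (i2c_flag & SKETCH_M_STOP))
		reg |= SKETCH_STOP;

	/* First byte: repeated START only if the caller asked for it. */
	if (bytes_done == 0 && (i2c_flag & SKETCH_X_RESTART))
		reg |= SKETCH_RESTART;

	return reg;
}
```

Keeping the two tests independent lets a single-byte message carry both
RESTART and STOP when both flags are set.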

> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.c |   4 +-
>  drivers/gpu/drm/amd/amdgpu/smu_v11_0_i2c.c | 105 +++++++++++++--------
>  2 files changed, 69 insertions(+), 40 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.c
> index fe0e9b0c4d5a38..d02ea083a6c69b 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.c
> @@ -41,10 +41,10 @@ int amdgpu_eeprom_xfer(struct i2c_adapter *i2c_adap,
>                 },
>                 {
>                         .addr = slave_addr,
> -                       .flags = read ? I2C_M_RD: 0,
> +                       .flags = read ? I2C_M_RD : 0,
>                         .len = bytes,
>                         .buf = eeprom_buf,
> -               }
> +               },
>         };
>         int r;
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/smu_v11_0_i2c.c b/drivers/gpu/drm/amd/amdgpu/smu_v11_0_i2c.c
> index 5a90d9351b22eb..b8d6d308fb06a0 100644
> --- a/drivers/gpu/drm/amd/amdgpu/smu_v11_0_i2c.c
> +++ b/drivers/gpu/drm/amd/amdgpu/smu_v11_0_i2c.c
> @@ -41,9 +41,7 @@
>  #define I2C_SW_TIMEOUT        8
>  #define I2C_ABORT             0x10
>
> -/* I2C transaction flags */
> -#define I2C_NO_STOP    1
> -#define I2C_RESTART    2
> +#define I2C_X_RESTART         BIT(31)
>
>  #define to_amdgpu_device(x) (container_of(x, struct amdgpu_device, pm.smu_i2c))
>
> @@ -205,9 +203,6 @@ static uint32_t smu_v11_0_i2c_poll_rx_status(struct i2c_adapter *control)
>         return ret;
>  }
>
> -
> -
> -
>  /**
>   * smu_v11_0_i2c_transmit - Send a block of data over the I2C bus to a slave device.
>   *
> @@ -252,21 +247,22 @@ static uint32_t smu_v11_0_i2c_transmit(struct i2c_adapter *control,
>                 reg = RREG32_SOC15(SMUIO, 0, mmCKSVII2C_IC_STATUS);
>                 if (REG_GET_FIELD(reg, CKSVII2C_IC_STATUS, TFNF)) {
>                         do {
> -                               reg = 0;
> -                               /*
> -                                * Prepare transaction, no need to set RESTART. I2C engine will send
> -                                * START as soon as it sees data in TXFIFO
> -                                */
> -                               if (bytes_sent == 0)
> -                                       reg = REG_SET_FIELD(reg, CKSVII2C_IC_DATA_CMD, RESTART,
> -                                                           (i2c_flag & I2C_RESTART) ? 1 : 0);
>                                 reg = REG_SET_FIELD(reg, CKSVII2C_IC_DATA_CMD, DAT, data[bytes_sent]);
>
> -                               /* determine if we need to send STOP bit or not */
> -                               if (numbytes == 1)
> -                                       /* Final transaction, so send stop unless I2C_NO_STOP */
> -                                       reg = REG_SET_FIELD(reg, CKSVII2C_IC_DATA_CMD, STOP,
> -                                                           (i2c_flag & I2C_NO_STOP) ? 0 : 1);
> +                               /* Final message, final byte, must
> +                                * generate a STOP, to release the
> +                                * bus, i.e. don't hold SCL low.
> +                                */
> +                               if (numbytes == 1 && i2c_flag & I2C_M_STOP)
> +                                       reg = REG_SET_FIELD(reg,
> +                                                           CKSVII2C_IC_DATA_CMD,
> +                                                           STOP, 1);
> +
> +                               if (bytes_sent == 0 && i2c_flag & I2C_X_RESTART)
> +                                       reg = REG_SET_FIELD(reg,
> +                                                           CKSVII2C_IC_DATA_CMD,
> +                                                           RESTART, 1);
> +
>                                 /* Write */
>                                 reg = REG_SET_FIELD(reg, CKSVII2C_IC_DATA_CMD, CMD, 0);
>                                 WREG32_SOC15(SMUIO, 0, mmCKSVII2C_IC_DATA_CMD, reg);
> @@ -341,23 +337,21 @@ static uint32_t smu_v11_0_i2c_receive(struct i2c_adapter *control,
>
>                 smu_v11_0_i2c_clear_status(control);
>
> -
>                 /* Prepare transaction */
> -
> -               /* Each time we disable I2C, so this is not a restart */
> -               if (bytes_received == 0)
> -                       reg = REG_SET_FIELD(reg, CKSVII2C_IC_DATA_CMD, RESTART,
> -                                           (i2c_flag & I2C_RESTART) ? 1 : 0);
> -
>                 reg = REG_SET_FIELD(reg, CKSVII2C_IC_DATA_CMD, DAT, 0);
>                 /* Read */
>                 reg = REG_SET_FIELD(reg, CKSVII2C_IC_DATA_CMD, CMD, 1);
>
> -               /* Transmitting last byte */
> -               if (numbytes == 1)
> -                       /* Final transaction, so send stop if requested */
> -                       reg = REG_SET_FIELD(reg, CKSVII2C_IC_DATA_CMD, STOP,
> -                                           (i2c_flag & I2C_NO_STOP) ? 0 : 1);
> +               /* Final message, final byte, must generate a STOP
> +                * to release the bus, i.e. don't hold SCL low.
> +                */
> +               if (numbytes == 1 && i2c_flag & I2C_M_STOP)
> +                       reg = REG_SET_FIELD(reg, CKSVII2C_IC_DATA_CMD,
> +                                           STOP, 1);
> +
> +               if (bytes_received == 0 && i2c_flag & I2C_X_RESTART)
> +                       reg = REG_SET_FIELD(reg, CKSVII2C_IC_DATA_CMD,
> +                                           RESTART, 1);
>
>                 WREG32_SOC15(SMUIO, 0, mmCKSVII2C_IC_DATA_CMD, reg);
>
> @@ -591,23 +585,59 @@ static const struct i2c_lock_operations smu_v11_0_i2c_i2c_lock_ops = {
>  };
>
>  static int smu_v11_0_i2c_xfer(struct i2c_adapter *i2c_adap,
> -                             struct i2c_msg *msgs, int num)
> +                             struct i2c_msg *msg, int num)
>  {
>         int i, ret;
> +       u16 addr, dir;
>
>         smu_v11_0_i2c_init(i2c_adap);
>
> +       /* From the client's point of view, this sequence of
> +        * messages-- the array i2c_msg *msg, is a single transaction
> +        * on the bus, starting with START and ending with STOP.
> +        *
> +        * The client is welcome to send any sequence of messages in
> +        * this array, as processing under this function here is
> +        * striving to be agnostic.
> +        *
> +        * Record the first address and direction we see. If either
> +        * changes for a subsequent message, generate ReSTART. The
> +        * DW_apb_i2c databook, v1.21a, specifies that ReSTART is
> +        * generated when the direction changes, with the default IP
> +        * block parameter settings, but it doesn't specify if ReSTART
> +        * is generated when the address changes (possibly...). We
> +        * don't rely on the default IP block parameter settings as
> +        * the block is shared and they may change.
> +        */
> +       if (num > 0) {
> +               addr = msg[0].addr;
> +               dir  = msg[0].flags & I2C_M_RD;
> +       }
> +
>         for (i = 0; i < num; i++) {
> -               uint32_t i2c_flag = ((msgs[i].flags & I2C_M_NOSTART) ? 0 : I2C_RESTART) ||
> -                                   (((msgs[i].flags & I2C_M_STOP) ? 0 : I2C_NO_STOP));
> +               u32 i2c_flag = 0;
>
> -               if (msgs[i].flags & I2C_M_RD)
> +               if (msg[i].addr != addr || (msg[i].flags ^ dir) & I2C_M_RD) {
> +                       addr = msg[i].addr;
> +                       dir  = msg[i].flags & I2C_M_RD;
> +                       i2c_flag |= I2C_X_RESTART;
> +               }
> +
> +               if (i == num - 1) {
> +                       /* Set the STOP bit on the last message, so
> +                        * that the IP block generates a STOP after
> +                        * the last byte of the message.
> +                        */
> +                       i2c_flag |= I2C_M_STOP;
> +               }
> +
> +               if (msg[i].flags & I2C_M_RD)
>                         ret = smu_v11_0_i2c_read_data(i2c_adap,
> -                                                     msgs + i,
> +                                                     msg + i,
>                                                       i2c_flag);
>                 else
>                         ret = smu_v11_0_i2c_write_data(i2c_adap,
> -                                                      msgs + i,
> +                                                      msg + i,
>                                                        i2c_flag);
>
>                 if (ret != I2C_OK) {
> @@ -625,7 +655,6 @@ static u32 smu_v11_0_i2c_func(struct i2c_adapter *adap)
>         return I2C_FUNC_I2C | I2C_FUNC_SMBUS_EMUL;
>  }
>
> -
>  static const struct i2c_algorithm smu_v11_0_i2c_algo = {
>         .master_xfer = smu_v11_0_i2c_xfer,
>         .functionality = smu_v11_0_i2c_func,
> --
> 2.32.0
>
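
For anyone skimming the xfer hunk above, here is a minimal user-space sketch of the per-message flag selection it describes: RESTART when the address or direction differs from the previous message, STOP only on the last message. The x_ names and flag values are hypothetical stand-ins for this illustration, not the driver's own definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins so the sketch builds outside the kernel. */
#define X_I2C_M_RD     0x0001u /* read direction bit of a message */
#define X_FLAG_RESTART 0x1u    /* illustrative value only */
#define X_FLAG_STOP    0x2u    /* illustrative value only */

struct x_msg {
	uint16_t addr;  /* 7-bit I2C address */
	uint16_t flags; /* X_I2C_M_RD set for a read */
};

/* Fill out[i] with the bus flags for message i of one transaction. */
static void x_compute_flags(const struct x_msg *msg, size_t num,
			    uint32_t *out)
{
	uint16_t addr, dir;
	size_t i;

	assert(num == 0 || (msg != NULL && out != NULL));
	if (num == 0)
		return;

	/* Remember the first address and direction we see. */
	addr = msg[0].addr;
	dir  = msg[0].flags & X_I2C_M_RD;

	for (i = 0; i < num; i++) {
		out[i] = 0;

		/* Address or direction changed: ask for a ReSTART. */
		if (msg[i].addr != addr ||
		    ((msg[i].flags ^ dir) & X_I2C_M_RD)) {
			addr = msg[i].addr;
			dir  = msg[i].flags & X_I2C_M_RD;
			out[i] |= X_FLAG_RESTART;
		}

		/* Only the last message ends with a STOP. */
		if (i == num - 1)
			out[i] |= X_FLAG_STOP;
	}
}
```

With the usual EEPROM read pattern (write the offset, then read from the same address), the first message gets no flags and the second gets both RESTART and STOP.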

* Re: [PATCH 19/40] drm/amdgpu: Fixes to the AMDGPU EEPROM driver
  2021-06-08 21:39 ` [PATCH 19/40] drm/amdgpu: Fixes to the AMDGPU EEPROM driver Luben Tuikov
@ 2021-06-10 20:53   ` Alex Deucher
  0 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2021-06-10 20:53 UTC (permalink / raw)
  To: Luben Tuikov
  Cc: Andrey Grodzovsky, Lijo Lazar, amd-gfx list, Stanley Yang,
	Alexander Deucher, Jean Delvare, Hawking Zhang

On Tue, Jun 8, 2021 at 5:40 PM Luben Tuikov <luben.tuikov@amd.com> wrote:
>
> * When reading from the EEPROM device, there is no
>   device limitation on the number of bytes
>   read--they're simply sequenced out. Thus, read
>   all of the requested data in one go.
>
> * When writing to the EEPROM device, at most one
>   256-byte page can be written before a STOP has
>   to be generated on the bus, and the write must
>   not cross a page boundary (the address rolls
>   over within the page instead). Maximize the
>   data written per bus acquisition.
>
> * Return the number of bytes read/written, or -errno.
>
> * Add kernel doc.
>
> Cc: Jean Delvare <jdelvare@suse.de>
> Cc: Alexander Deucher <Alexander.Deucher@amd.com>
> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
> Cc: Lijo Lazar <Lijo.Lazar@amd.com>
> Cc: Stanley Yang <Stanley.Yang@amd.com>
> Cc: Hawking Zhang <Hawking.Zhang@amd.com>
> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>

Acked-by: Alex Deucher <alexander.deucher@amd.com>
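
For reference, a minimal user-space sketch of the write chunking described in the second bullet of the commit message. Only the 256-byte page size and the "bytes left in the page, capped by bytes remaining" rule come from the patch; the x_ names are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* 256-byte write page, as stated for the AT24CM02 in the patch. */
#define X_PAGE_BITS 8
#define X_PAGE_SIZE (1u << X_PAGE_BITS)
#define X_PAGE_MASK (X_PAGE_SIZE - 1)

/* Length of the next write chunk starting at eeprom_addr: up to the
 * end of the current page, but never more than what is left to write.
 */
static uint32_t x_write_chunk_len(uint32_t eeprom_addr, uint32_t remaining)
{
	uint32_t to_page_end = X_PAGE_SIZE - (eeprom_addr & X_PAGE_MASK);

	return remaining < to_page_end ? remaining : to_page_end;
}

/* Number of START-STOP write transactions needed for size bytes. */
static unsigned int x_count_write_chunks(uint32_t eeprom_addr, uint32_t size)
{
	unsigned int n = 0;

	while (size > 0) {
		uint32_t len = x_write_chunk_len(eeprom_addr, size);

		/* A chunk is never empty and never exceeds one page. */
		assert(len > 0 && len <= X_PAGE_SIZE);
		eeprom_addr += len;
		size -= len;
		n++;
	}
	return n;
}
```

For example, writing 0x120 bytes starting at offset 0xF0 splits into chunks of 0x10, 0x100 and 0x10 bytes, i.e. three bus transactions, each followed by the write-cycle delay.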

> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.c | 96 +++++++++++++++-------
>  1 file changed, 68 insertions(+), 28 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.c
> index d02ea083a6c69b..7fdb5bd2fc8bc8 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.c
> @@ -24,59 +24,99 @@
>  #include "amdgpu_eeprom.h"
>  #include "amdgpu.h"
>
> -#define EEPROM_OFFSET_LENGTH 2
> +/* AT24CM02 has a 256-byte write page size.
> + */
> +#define EEPROM_PAGE_BITS   8
> +#define EEPROM_PAGE_SIZE   (1U << EEPROM_PAGE_BITS)
> +#define EEPROM_PAGE_MASK   (EEPROM_PAGE_SIZE - 1)
> +
> +#define EEPROM_OFFSET_SIZE 2
>
> +/**
> + * amdgpu_eeprom_xfer -- Read/write from/to an I2C EEPROM device
> + * @i2c_adap: pointer to the I2C adapter to use
> + * @slave_addr: I2C address of the slave device
> + * @eeprom_addr: EEPROM address from which to read/write
> + * @eeprom_buf: pointer to data buffer to read into/write from
> + * @buf_size: the size of @eeprom_buf
> + * @read: True if reading from the EEPROM, false if writing
> + *
> + * Returns the number of bytes read/written; -errno on error.
> + */
>  int amdgpu_eeprom_xfer(struct i2c_adapter *i2c_adap,
>                        u16 slave_addr, u16 eeprom_addr,
> -                      u8 *eeprom_buf, u16 bytes, bool read)
> +                      u8 *eeprom_buf, u16 buf_size, bool read)
>  {
> -       u8 eeprom_offset_buf[2];
> -       u16 bytes_transferred;
> +       u8 eeprom_offset_buf[EEPROM_OFFSET_SIZE];
>         struct i2c_msg msgs[] = {
>                 {
>                         .addr = slave_addr,
>                         .flags = 0,
> -                       .len = EEPROM_OFFSET_LENGTH,
> +                       .len = EEPROM_OFFSET_SIZE,
>                         .buf = eeprom_offset_buf,
>                 },
>                 {
>                         .addr = slave_addr,
>                         .flags = read ? I2C_M_RD : 0,
> -                       .len = bytes,
> -                       .buf = eeprom_buf,
>                 },
>         };
> +       const u8 *p = eeprom_buf;
>         int r;
> +       u16 len;
> +
> +       r = 0;
> +       for (len = 0; buf_size > 0;
> +            buf_size -= len, eeprom_addr += len, eeprom_buf += len) {
> +               /* Set the EEPROM address we want to write to/read from.
> +                */
> +               msgs[0].buf[0] = (eeprom_addr >> 8) & 0xff;
> +               msgs[0].buf[1] = eeprom_addr & 0xff;
>
> -       msgs[0].buf[0] = ((eeprom_addr >> 8) & 0xff);
> -       msgs[0].buf[1] = (eeprom_addr & 0xff);
> +               if (!read) {
> +                       /* Write the maximum amount of data, without
> +                        * crossing the device's page boundary, as per
> +                        * its spec. Partial page writes are allowed,
> +                        * starting at any location within the page,
> +                        * so long as the page boundary isn't crossed
> +                        * over (actually the page pointer rolls
> +                        * over).
> +                        *
> +                        * As per the AT24CM02 EEPROM spec, after
> +                        * writing into a page, the I2C driver MUST
> +                        * terminate the transfer, i.e. in
> +                        * "i2c_transfer()" below, with a STOP
> +                        * condition, so that the self-timed write
> +                        * cycle begins. This is implied for the
> +                        * "i2c_transfer()" abstraction.
> +                        */
> +                       len = min(EEPROM_PAGE_SIZE - (eeprom_addr &
> +                                                     EEPROM_PAGE_MASK),
> +                                 (u32)buf_size);
> +               } else {
> +                       /* Reading from the EEPROM has no limitation
> +                        * on the number of bytes read from the EEPROM
> +                        * device--they are simply sequenced out.
> +                        */
> +                       len = buf_size;
> +               }
> +               msgs[1].len = len;
> +               msgs[1].buf = eeprom_buf;
>
> -       while (msgs[1].len) {
>                 r = i2c_transfer(i2c_adap, msgs, ARRAY_SIZE(msgs));
> -               if (r <= 0)
> -                       return r;
> +               if (r < ARRAY_SIZE(msgs))
> +                       break;
>
> -               /* Only for write data */
> -               if (!msgs[1].flags)
> -                       /*
> -                        * According to EEPROM spec there is a MAX of 10 ms required for
> -                        * EEPROM to flush internal RX buffer after STOP was issued at the
> -                        * end of write transaction. During this time the EEPROM will not be
> -                        * responsive to any more commands - so wait a bit more.
> +               if (!read) {
> +                       /* According to the AT24CM02 EEPROM spec the
> +                        * length of the self-writing cycle, tWR, is
> +                        * 10 ms.
>                          *
>                          * TODO Improve to wait for first ACK for slave address after
>                          * internal write cycle done.
>                          */
>                         msleep(10);
> -
> -
> -               bytes_transferred = r - EEPROM_OFFSET_LENGTH;
> -               eeprom_addr += bytes_transferred;
> -               msgs[0].buf[0] = ((eeprom_addr >> 8) & 0xff);
> -               msgs[0].buf[1] = (eeprom_addr & 0xff);
> -               msgs[1].buf += bytes_transferred;
> -               msgs[1].len -= bytes_transferred;
> +               }
>         }
>
> -       return 0;
> +       return r < 0 ? r : eeprom_buf - p;
>  }
> --
> 2.32.0
>

* Re: [PATCH 21/40] drm/amdgpu: I2C EEPROM full memory addressing
  2021-06-08 21:39 ` [PATCH 21/40] drm/amdgpu: I2C EEPROM full memory addressing Luben Tuikov
@ 2021-06-10 20:57   ` Alex Deucher
  0 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2021-06-10 20:57 UTC (permalink / raw)
  To: Luben Tuikov
  Cc: Andrey Grodzovsky, Lijo Lazar, amd-gfx list, Stanley Yang,
	Alexander Deucher, Jean Delvare, Hawking Zhang

On Tue, Jun 8, 2021 at 5:40 PM Luben Tuikov <luben.tuikov@amd.com> wrote:
>
> * "eeprom_addr" is now 32 bits wide.
> * Remove "slave_addr" from the I2C EEPROM driver
>   interface. The I2C EEPROM Device Type Identifier
>   is fixed at 1010b, and the remaining bits of the
>   Device Address Byte/Device Select Code are
>   memory address bits, the first three of which
>   are the hardware selection bits. All this is now
>   a 19-bit address, passed as "eeprom_addr". This
>   abstracts the I2C bus away from clients of this
>   I2C EEPROM driver: they now pass only the 19-bit
>   EEPROM memory address they want to read from or
>   write to, as the 32-bit "eeprom_addr".
>
> Cc: Jean Delvare <jdelvare@suse.de>
> Cc: Alexander Deucher <Alexander.Deucher@amd.com>
> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
> Cc: Lijo Lazar <Lijo.Lazar@amd.com>
> Cc: Stanley Yang <Stanley.Yang@amd.com>
> Cc: Hawking Zhang <Hawking.Zhang@amd.com>
> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>

Acked-by: Alex Deucher <alexander.deucher@amd.com>
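
As a quick illustration of the addressing scheme in the commit message, here is a user-space sketch that splits a 19-bit EEPROM address into the 7-bit I2C address and the two offset bytes. The macro body mirrors MAKE_I2C_ADDR() from the patch; the struct and function names are hypothetical, made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Same expression as MAKE_I2C_ADDR() in the patch: 1010b followed by
 * bits 18:16 of the EEPROM memory address.
 */
#define X_MAKE_I2C_ADDR(_aa) ((0xA << 3) | (((_aa) >> 16) & 7))

/* Hypothetical container for what ends up on the wire. */
struct x_wire_addr {
	uint16_t i2c_addr;  /* 7-bit address, direction bit excluded */
	uint8_t  offset[2]; /* A15:A8, then A7:A0 */
};

static struct x_wire_addr x_split_eeprom_addr(uint32_t eeprom_addr)
{
	struct x_wire_addr w;

	/* Only 19 address bits are meaningful in this scheme. */
	assert(eeprom_addr < (1u << 19));

	w.i2c_addr  = (uint16_t)X_MAKE_I2C_ADDR(eeprom_addr);
	w.offset[0] = (eeprom_addr >> 8) & 0xff;
	w.offset[1] = eeprom_addr & 0xff;
	return w;
}
```

Taking the example from the code comment in the patch, eeprom_addr = 0x6DA01 gives XYZ = 110b, so the 7-bit I2C address is 0x56 and the offset bytes are 0xDA and 0x01.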

> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.c | 88 +++++++++++++++++-----
>  drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.h |  4 +-
>  2 files changed, 72 insertions(+), 20 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.c
> index 94aeda1c7f8ca0..a5a87affedabf1 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.c
> @@ -24,7 +24,7 @@
>  #include "amdgpu_eeprom.h"
>  #include "amdgpu.h"
>
> -/* AT24CM02 has a 256-byte write page size.
> +/* AT24CM02 and M24M02-R have a 256-byte write page size.
>   */
>  #define EEPROM_PAGE_BITS   8
>  #define EEPROM_PAGE_SIZE   (1U << EEPROM_PAGE_BITS)
> @@ -32,20 +32,72 @@
>
>  #define EEPROM_OFFSET_SIZE 2
>
> -static int __amdgpu_eeprom_xfer(struct i2c_adapter *i2c_adap,
> -                               u16 slave_addr, u16 eeprom_addr,
> +/* EEPROM memory addresses are 19-bits long, which can
> + * be partitioned into 3, 8, 8 bits, for a total of 19.
> + * The upper 3 bits are sent as part of the 7-bit
> + * "Device Type Identifier"--an I2C concept, which for EEPROM devices
> + * is hard-coded as 1010b, indicating that it is an EEPROM
> + * device--this is the wire format, followed by the upper
> + * 3 bits of the 19-bit address, followed by the direction,
> + * followed by two bytes holding the rest of the 16-bits of
> + * the EEPROM memory address. The format on the wire for EEPROM
> + * devices is: 1010XYZD, A15:A8, A7:A0,
> + * Where D is the direction and sequenced out by the hardware.
> + * Bits XYZ are memory address bits 18, 17 and 16.
> + * These bits are compared to how pins 1-3 of the part are connected,
> + * depending on the size of the part, more on that later.
> + *
> + * Note that of this wire format, a client is in control
> + * of, and needs to specify only XYZ, A15:A8, A7:0, bits,
> + * which is exactly the EEPROM memory address, or offset,
> + * in order to address up to 8 EEPROM devices on the I2C bus.
> + *
> + * For instance, a 2-Mbit I2C EEPROM part, addresses all its bytes,
> + * using an 18-bit address, bit 17 to 0 and thus would use all but one bit of
> + * the 19 bits previously mentioned. The designer would then not connect
> + * pins 1 and 2, and pin 3 usually named "A_2" or "E2", would be connected to
> + * either Vcc or GND. This would allow for up to two 2-Mbit parts on
> + * the same bus, where one would be addressable with bit 18 as 1, and
> + * the other with bit 18 of the address as 0.
> + *
> + * For a 2-Mbit part, bit 18 is usually known as the "Chip Enable" or
> + * "Hardware Address Bit". This bit is compared to the load on pin 3
> + * of the device, described above, and if there is a match, then this
> + * device responds to the command. This way, you can connect two
> + * 2-Mbit EEPROM devices on the same bus, but see one contiguous
> + * memory from 0 to 7FFFFh, where address 0 to 3FFFF is in the device
> + * whose pin 3 is connected to GND, and address 40000 to 7FFFFh is in
> + * the 2nd device, whose pin 3 is connected to Vcc.
> + *
> + * This addressing you encode in the 32-bit "eeprom_addr" below,
> + * namely the 19-bits "XYZ,A15:A0", as a single 19-bit address. For
> + * instance, eeprom_addr = 0x6DA01, is 110_1101_1010_0000_0001, where
> + * XYZ=110b, and A15:A0=DA01h. The XYZ bits become part of the device
> + * address, and the rest of the address bits are sent as the memory
> + * address bytes.
> + *
> + * That is, for an I2C EEPROM driver everything is controlled by
> + * the "eeprom_addr".
> + *
> + * P.S. If you need to write, lock and read the Identification Page,
> + * (M24M02-DR device only, which we do not use), change the "7" to
> + * "0xF" in the macro below, and let the client set bit 20 to 1 in
> + * "eeprom_addr", and set A10 to 0 to write into it, and A10 and A1 to
> + * 1 to lock it permanently.
> + */
> +#define MAKE_I2C_ADDR(_aa) ((0xA << 3) | (((_aa) >> 16) & 7))
> +
> +static int __amdgpu_eeprom_xfer(struct i2c_adapter *i2c_adap, u32 eeprom_addr,
>                                 u8 *eeprom_buf, u16 buf_size, bool read)
>  {
>         u8 eeprom_offset_buf[EEPROM_OFFSET_SIZE];
>         struct i2c_msg msgs[] = {
>                 {
> -                       .addr = slave_addr,
>                         .flags = 0,
>                         .len = EEPROM_OFFSET_SIZE,
>                         .buf = eeprom_offset_buf,
>                 },
>                 {
> -                       .addr = slave_addr,
>                         .flags = read ? I2C_M_RD : 0,
>                 },
>         };
> @@ -58,6 +110,8 @@ static int __amdgpu_eeprom_xfer(struct i2c_adapter *i2c_adap,
>               buf_size -= len, eeprom_addr += len, eeprom_buf += len) {
>                 /* Set the EEPROM address we want to write to/read from.
>                  */
> +               msgs[0].addr = MAKE_I2C_ADDR(eeprom_addr);
> +               msgs[1].addr = msgs[0].addr;
>                 msgs[0].buf[0] = (eeprom_addr >> 8) & 0xff;
>                 msgs[0].buf[1] = eeprom_addr & 0xff;
>
> @@ -71,7 +125,7 @@ static int __amdgpu_eeprom_xfer(struct i2c_adapter *i2c_adap,
>                          * over).
>                          *
>                          * As per the AT24CM02 EEPROM spec, after
> -                        * writing into a page, the I2C driver MUST
> +                        * writing into a page, the I2C driver should
>                          * terminate the transfer, i.e. in
>                          * "i2c_transfer()" below, with a STOP
>                          * condition, so that the self-timed write
> @@ -91,17 +145,20 @@ static int __amdgpu_eeprom_xfer(struct i2c_adapter *i2c_adap,
>                 msgs[1].len = len;
>                 msgs[1].buf = eeprom_buf;
>
> +               /* This constitutes a START-STOP transaction.
> +                */
>                 r = i2c_transfer(i2c_adap, msgs, ARRAY_SIZE(msgs));
>                 if (r < ARRAY_SIZE(msgs))
>                         break;
>
>                 if (!read) {
> -                       /* According to the AT24CM02 EEPROM spec the
> -                        * length of the self-writing cycle, tWR, is
> -                        * 10 ms.
> +                       /* According to EEPROM specs the length of the
> +                        * self-writing cycle, tWR (tW), is 10 ms.
>                          *
> -                        * TODO Improve to wait for first ACK for slave address after
> -                        * internal write cycle done.
> +                        * TODO: Use polling on ACK, aka Acknowledge
> +                        * Polling, to minimize waiting for the
> +                        * internal write cycle to complete, as it is
> +                        * usually smaller than tWR (tW).
>                          */
>                         msleep(10);
>                 }
> @@ -113,7 +170,6 @@ static int __amdgpu_eeprom_xfer(struct i2c_adapter *i2c_adap,
>  /**
>   * amdgpu_eeprom_xfer -- Read/write from/to an I2C EEPROM device
>   * @i2c_adap: pointer to the I2C adapter to use
> - * @slave_addr: I2C address of the slave device
>   * @eeprom_addr: EEPROM address from which to read/write
>   * @eeprom_buf: pointer to data buffer to read into/write from
>   * @buf_size: the size of @eeprom_buf
> @@ -121,8 +177,7 @@ static int __amdgpu_eeprom_xfer(struct i2c_adapter *i2c_adap,
>   *
>   * Returns the number of bytes read/written; -errno on error.
>   */
> -int amdgpu_eeprom_xfer(struct i2c_adapter *i2c_adap,
> -                      u16 slave_addr, u16 eeprom_addr,
> +int amdgpu_eeprom_xfer(struct i2c_adapter *i2c_adap, u32 eeprom_addr,
>                        u8 *eeprom_buf, u16 buf_size, bool read)
>  {
>         const struct i2c_adapter_quirks *quirks = i2c_adap->quirks;
> @@ -136,7 +191,7 @@ int amdgpu_eeprom_xfer(struct i2c_adapter *i2c_adap,
>                 limit = quirks->max_write_len;
>
>         if (limit == 0) {
> -               return __amdgpu_eeprom_xfer(i2c_adap, slave_addr, eeprom_addr,
> +               return __amdgpu_eeprom_xfer(i2c_adap, eeprom_addr,
>                                             eeprom_buf, buf_size, read);
>         } else if (limit <= EEPROM_OFFSET_SIZE) {
>                 dev_err_ratelimited(&i2c_adap->dev,
> @@ -157,8 +212,7 @@ int amdgpu_eeprom_xfer(struct i2c_adapter *i2c_adap,
>                       buf_size -= ps, eeprom_addr += ps, eeprom_buf += ps) {
>                         ps = min(limit, buf_size);
>
> -                       r = __amdgpu_eeprom_xfer(i2c_adap,
> -                                                slave_addr, eeprom_addr,
> +                       r = __amdgpu_eeprom_xfer(i2c_adap, eeprom_addr,
>                                                  eeprom_buf, ps, read);
>                         if (r < 0)
>                                 return r;
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.h
> index 9301e5678910ad..417472be2712e6 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.h
> @@ -26,9 +26,7 @@
>
>  #include <linux/i2c.h>
>
> -int amdgpu_eeprom_xfer(struct i2c_adapter *i2c_adap,
> -                      u16 slave_addr, u16 eeprom_addr,
> +int amdgpu_eeprom_xfer(struct i2c_adapter *i2c_adap, u32 eeprom_addr,
>                        u8 *eeprom_buf, u16 bytes, bool read);
>
> -
>  #endif
> --
> 2.32.0
>

* Re: [PATCH 22/40] drm/amdgpu: RAS and FRU now use 19-bit I2C address
  2021-06-08 21:39 ` [PATCH 22/40] drm/amdgpu: RAS and FRU now use 19-bit I2C address Luben Tuikov
@ 2021-06-10 20:59   ` Alex Deucher
  0 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2021-06-10 20:59 UTC (permalink / raw)
  To: Luben Tuikov
  Cc: Andrey Grodzovsky, Lijo Lazar, amd-gfx list, Stanley Yang,
	Alexander Deucher, John Clements, Jean Delvare, Hawking Zhang

On Tue, Jun 8, 2021 at 5:40 PM Luben Tuikov <luben.tuikov@amd.com> wrote:
>
> Convert RAS and FRU code to use the 19-bit I2C
> memory address and remove all "slave_addr", as
> this is now absorbed into the 19-bit address.
>
> Cc: Jean Delvare <jdelvare@suse.de>
> Cc: John Clements <john.clements@amd.com>
> Cc: Alexander Deucher <Alexander.Deucher@amd.com>
> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
> Cc: Lijo Lazar <Lijo.Lazar@amd.com>
> Cc: Stanley Yang <Stanley.Yang@amd.com>
> Cc: Hawking Zhang <Hawking.Zhang@amd.com>
> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>

Acked-by: Alex Deucher <alexander.deucher@amd.com>
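
To make the constant changes in this patch easier to check, here is a small sketch of how an old 7-bit slave address corresponds to the new memory-address base: the low three bits move to bits 18:16. The helper name is hypothetical and is not part of the patch, which simply replaces the constants by hand.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: derive the 19-bit memory address base from an
 * old-style 7-bit EEPROM slave address (1010XYZb).
 */
static uint32_t x_maddr_from_slave_addr(uint16_t slave_addr)
{
	/* The upper four bits must be the EEPROM type identifier. */
	assert((slave_addr >> 3) == 0xA);

	return (uint32_t)(slave_addr & 7) << 16;
}
```

This matches the replacements in the diff: 0x50 becomes 0x0, 0x54 (Arcturus) becomes 0x40000, and 0x56 (FRU) becomes 0x60000.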

> ---
>  .../gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c    | 19 ++---
>  .../gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c    | 82 +++++++------------
>  .../gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h    |  2 +-
>  3 files changed, 39 insertions(+), 64 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c
> index 2b854bc6ae34bb..69b9559f840ac3 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c
> @@ -29,8 +29,8 @@
>  #include "amdgpu_fru_eeprom.h"
>  #include "amdgpu_eeprom.h"
>
> -#define I2C_PRODUCT_INFO_ADDR          0x56
> -#define I2C_PRODUCT_INFO_OFFSET                0xC0
> +#define FRU_EEPROM_MADDR        0x60000
> +#define I2C_PRODUCT_INFO_OFFSET 0xC0
>
>  static bool is_fru_eeprom_supported(struct amdgpu_device *adev)
>  {
> @@ -62,12 +62,11 @@ static bool is_fru_eeprom_supported(struct amdgpu_device *adev)
>  }
>
>  static int amdgpu_fru_read_eeprom(struct amdgpu_device *adev, uint32_t addrptr,
> -                          unsigned char *buff)
> +                                 unsigned char *buff)
>  {
>         int ret, size;
>
> -       ret = amdgpu_eeprom_xfer(&adev->pm.smu_i2c, I2C_PRODUCT_INFO_ADDR,
> -                                addrptr, buff, 1, true);
> +       ret = amdgpu_eeprom_xfer(&adev->pm.smu_i2c, addrptr, buff, 1, true);
>         if (ret < 1) {
>                 DRM_WARN("FRU: Failed to get size field");
>                 return ret;
> @@ -78,8 +77,8 @@ static int amdgpu_fru_read_eeprom(struct amdgpu_device *adev, uint32_t addrptr,
>          */
>         size = buff[0] - I2C_PRODUCT_INFO_OFFSET;
>
> -       ret = amdgpu_eeprom_xfer(&adev->pm.smu_i2c, I2C_PRODUCT_INFO_ADDR,
> -                                addrptr + 1, buff, size, true);
> +       ret = amdgpu_eeprom_xfer(&adev->pm.smu_i2c, addrptr + 1, buff, size,
> +                                true);
>         if (ret < 1) {
>                 DRM_WARN("FRU: Failed to get data field");
>                 return ret;
> @@ -91,8 +90,8 @@ static int amdgpu_fru_read_eeprom(struct amdgpu_device *adev, uint32_t addrptr,
>  int amdgpu_fru_get_product_info(struct amdgpu_device *adev)
>  {
>         unsigned char buff[34];
> -       int addrptr, size;
> -       int len;
> +       u32 addrptr;
> +       int size, len;
>
>         if (!is_fru_eeprom_supported(adev))
>                 return 0;
> @@ -115,7 +114,7 @@ int amdgpu_fru_get_product_info(struct amdgpu_device *adev)
>          * Bytes 8-a are all 1-byte and refer to the size of the entire struct,
>          * and the language field, so just start from 0xb, manufacturer size
>          */
> -       addrptr = 0xb;
> +       addrptr = FRU_EEPROM_MADDR + 0xb;
>         size = amdgpu_fru_read_eeprom(adev, addrptr, buff);
>         if (size < 1) {
>                 DRM_ERROR("Failed to read FRU Manufacturer, ret:%d", size);
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
> index 2b981e96ce5b9e..f316fb11b16d9e 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
> @@ -28,11 +28,11 @@
>  #include "atom.h"
>  #include "amdgpu_eeprom.h"
>
> -#define EEPROM_I2C_TARGET_ADDR_VEGA20          0x50
> -#define EEPROM_I2C_TARGET_ADDR_ARCTURUS                0x54
> -#define EEPROM_I2C_TARGET_ADDR_ARCTURUS_D342   0x50
> -#define EEPROM_I2C_TARGET_ADDR_SIENNA_CICHLID   0x50
> -#define EEPROM_I2C_TARGET_ADDR_ALDEBARAN        0x50
> +#define EEPROM_I2C_MADDR_VEGA20         0x0
> +#define EEPROM_I2C_MADDR_ARCTURUS       0x40000
> +#define EEPROM_I2C_MADDR_ARCTURUS_D342  0x0
> +#define EEPROM_I2C_MADDR_SIENNA_CICHLID 0x0
> +#define EEPROM_I2C_MADDR_ALDEBARAN      0x0
>
>  /*
>   * The 2 macros bellow represent the actual size in bytes that
> @@ -58,7 +58,6 @@
>  #define EEPROM_HDR_START 0
>  #define EEPROM_RECORD_START (EEPROM_HDR_START + EEPROM_TABLE_HEADER_SIZE)
>  #define EEPROM_MAX_RECORD_NUM ((EEPROM_SIZE_BYTES - EEPROM_TABLE_HEADER_SIZE) / EEPROM_TABLE_RECORD_SIZE)
> -#define EEPROM_ADDR_MSB_MASK GENMASK(17, 8)
>
>  #define to_amdgpu_device(x) (container_of(x, struct amdgpu_ras, eeprom_control))->adev
>
> @@ -74,43 +73,43 @@ static bool __is_ras_eeprom_supported(struct amdgpu_device *adev)
>  }
>
>  static bool __get_eeprom_i2c_addr_arct(struct amdgpu_device *adev,
> -                                      uint16_t *i2c_addr)
> +                                      struct amdgpu_ras_eeprom_control *control)
>  {
>         struct atom_context *atom_ctx = adev->mode_info.atom_context;
>
> -       if (!i2c_addr || !atom_ctx)
> +       if (!control || !atom_ctx)
>                 return false;
>
>         if (strnstr(atom_ctx->vbios_version,
>                     "D342",
>                     sizeof(atom_ctx->vbios_version)))
> -               *i2c_addr = EEPROM_I2C_TARGET_ADDR_ARCTURUS_D342;
> +               control->i2c_address = EEPROM_I2C_MADDR_ARCTURUS_D342;
>         else
> -               *i2c_addr = EEPROM_I2C_TARGET_ADDR_ARCTURUS;
> +               control->i2c_address = EEPROM_I2C_MADDR_ARCTURUS;
>
>         return true;
>  }
>
>  static bool __get_eeprom_i2c_addr(struct amdgpu_device *adev,
> -                                 uint16_t *i2c_addr)
> +                                 struct amdgpu_ras_eeprom_control *control)
>  {
> -       if (!i2c_addr)
> +       if (!control)
>                 return false;
>
>         switch (adev->asic_type) {
>         case CHIP_VEGA20:
> -               *i2c_addr = EEPROM_I2C_TARGET_ADDR_VEGA20;
> +               control->i2c_address = EEPROM_I2C_MADDR_VEGA20;
>                 break;
>
>         case CHIP_ARCTURUS:
> -               return __get_eeprom_i2c_addr_arct(adev, i2c_addr);
> +               return __get_eeprom_i2c_addr_arct(adev, control);
>
>         case CHIP_SIENNA_CICHLID:
> -               *i2c_addr = EEPROM_I2C_TARGET_ADDR_SIENNA_CICHLID;
> +               control->i2c_address = EEPROM_I2C_MADDR_SIENNA_CICHLID;
>                 break;
>
>         case CHIP_ALDEBARAN:
> -               *i2c_addr = EEPROM_I2C_TARGET_ADDR_ALDEBARAN;
> +               control->i2c_address = EEPROM_I2C_MADDR_ALDEBARAN;
>                 break;
>
>         default:
> @@ -154,8 +153,9 @@ static int __update_table_header(struct amdgpu_ras_eeprom_control *control,
>
>         /* i2c may be unstable in gpu reset */
>         down_read(&adev->reset_sem);
> -       ret = amdgpu_eeprom_xfer(&adev->pm.smu_i2c, control->i2c_address,
> -                                EEPROM_HDR_START, buff, EEPROM_TABLE_HEADER_SIZE, false);
> +       ret = amdgpu_eeprom_xfer(&adev->pm.smu_i2c,
> +                                control->i2c_address + EEPROM_HDR_START,
> +                                buff, EEPROM_TABLE_HEADER_SIZE, false);
>         up_read(&adev->reset_sem);
>
>         if (ret < 1)
> @@ -277,7 +277,7 @@ int amdgpu_ras_eeprom_reset_table(struct amdgpu_ras_eeprom_control *control)
>  }
>
>  int amdgpu_ras_eeprom_init(struct amdgpu_ras_eeprom_control *control,
> -                       bool *exceed_err_limit)
> +                          bool *exceed_err_limit)
>  {
>         int ret = 0;
>         struct amdgpu_device *adev = to_amdgpu_device(control);
> @@ -294,14 +294,15 @@ int amdgpu_ras_eeprom_init(struct amdgpu_ras_eeprom_control *control,
>         if (!adev->pm.smu_i2c.algo)
>                 return -ENOENT;
>
> -       if (!__get_eeprom_i2c_addr(adev, &control->i2c_address))
> +       if (!__get_eeprom_i2c_addr(adev, control))
>                 return -EINVAL;
>
>         mutex_init(&control->tbl_mutex);
>
>         /* Read/Create table header from EEPROM address 0 */
> -       ret = amdgpu_eeprom_xfer(&adev->pm.smu_i2c, control->i2c_address,
> -                                EEPROM_HDR_START, buff, EEPROM_TABLE_HEADER_SIZE, true);
> +       ret = amdgpu_eeprom_xfer(&adev->pm.smu_i2c,
> +                                control->i2c_address + EEPROM_HDR_START,
> +                                buff, EEPROM_TABLE_HEADER_SIZE, true);
>         if (ret < 1) {
>                 DRM_ERROR("Failed to read EEPROM table header, ret:%d", ret);
>                 return ret;
> @@ -395,8 +396,6 @@ static void __decode_table_record_from_buff(struct amdgpu_ras_eeprom_control *co
>
>  /*
>   * When reaching end of EEPROM memory jump back to 0 record address
> - * When next record access will go beyond EEPROM page boundary modify bits A17/A8
> - * in I2C selector to go to next page
>   */
>  static uint32_t __correct_eeprom_dest_address(uint32_t curr_address)
>  {
> @@ -409,20 +408,6 @@ static uint32_t __correct_eeprom_dest_address(uint32_t curr_address)
>                 return EEPROM_RECORD_START;
>         }
>
> -       /*
> -        * To check if we overflow page boundary  compare next address with
> -        * current and see if bits 17/8 of the EEPROM address will change
> -        * If they do start from the next 256b page
> -        *
> -        * https://www.st.com/resource/en/datasheet/m24m02-dr.pdf sec. 5.1.2
> -        */
> -       if ((curr_address & EEPROM_ADDR_MSB_MASK) != (next_address & EEPROM_ADDR_MSB_MASK)) {
> -               DRM_DEBUG_DRIVER("Reached end of EEPROM memory page, jumping to next: %lx",
> -                               (next_address & EEPROM_ADDR_MSB_MASK));
> -
> -               return  (next_address & EEPROM_ADDR_MSB_MASK);
> -       }
> -
>         return curr_address;
>  }
>
> @@ -452,22 +437,20 @@ bool amdgpu_ras_eeprom_check_err_threshold(struct amdgpu_device *adev)
>  }
>
>  int amdgpu_ras_eeprom_process_recods(struct amdgpu_ras_eeprom_control *control,
> -                                           struct eeprom_table_record *records,
> -                                           bool write,
> -                                           int num)
> +                                    struct eeprom_table_record *records,
> +                                    bool write, int num)
>  {
>         int i, ret = 0;
>         unsigned char *buffs, *buff;
>         struct eeprom_table_record *record;
>         struct amdgpu_device *adev = to_amdgpu_device(control);
>         struct amdgpu_ras *ras = amdgpu_ras_get_context(adev);
> -       u16 slave_addr;
>
>         if (!__is_ras_eeprom_supported(adev))
>                 return 0;
>
>         buffs = kcalloc(num, EEPROM_ADDRESS_SIZE + EEPROM_TABLE_RECORD_SIZE,
> -                        GFP_KERNEL);
> +                       GFP_KERNEL);
>         if (!buffs)
>                 return -ENOMEM;
>
> @@ -507,22 +490,15 @@ int amdgpu_ras_eeprom_process_recods(struct amdgpu_ras_eeprom_control *control,
>
>                 control->next_addr = __correct_eeprom_dest_address(control->next_addr);
>
> -               /*
> -                * Update bits 16,17 of EEPROM address in I2C address by setting them
> -                * to bits 1,2 of Device address byte
> -                */
> -               slave_addr = control->i2c_address |
> -                       ((control->next_addr & EEPROM_ADDR_MSB_MASK) >> 15);
> -
>                 /* EEPROM table content is stored in LE format */
>                 if (write)
>                         __encode_table_record_to_buff(control, record, buff);
>
>                 /* i2c may be unstable in gpu reset */
>                 down_read(&adev->reset_sem);
> -               ret = amdgpu_eeprom_xfer(&adev->pm.smu_i2c, slave_addr,
> -                                        control->next_addr, buff,
> -                                        EEPROM_TABLE_RECORD_SIZE, write ? false : true);
> +               ret = amdgpu_eeprom_xfer(&adev->pm.smu_i2c,
> +                                        control->i2c_address + control->next_addr,
> +                                        buff, EEPROM_TABLE_RECORD_SIZE, !write);
>                 up_read(&adev->reset_sem);
>
>                 if (ret < 1) {
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h
> index 17872117097455..4c4c3d840a35c5 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h
> @@ -44,11 +44,11 @@ struct amdgpu_ras_eeprom_table_header {
>
>  struct amdgpu_ras_eeprom_control {
>         struct amdgpu_ras_eeprom_table_header tbl_hdr;
> +       u32 i2c_address; /* Base I2C 19-bit memory address */
>         uint32_t next_addr;
>         unsigned int num_recs;
>         struct mutex tbl_mutex;
>         uint32_t tbl_byte_sum;
> -       uint16_t i2c_address; // 8-bit represented address
>  };
>
>  /*
> --
> 2.32.0
>
> _______________________________________________
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
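
The hunks quoted above replace the separate (device address, offset) pair with a single value, control->i2c_address + control->next_addr, where the header comment calls i2c_address a "Base I2C 19-bit memory address". A minimal standalone sketch of that idea follows. It is not taken from the patch: every sketch_* name is hypothetical, and the split into a 3-bit upper part and a 16-bit lower part is an assumption about how a lower layer might consume such an address.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical: a 19-bit memory address occupies bits 18..0. */
#define SKETCH_MADDR_MASK 0x7FFFFu

/* Combine a base memory address and a byte offset into one flat address. */
static uint32_t sketch_flat_addr(uint32_t base, uint32_t offset)
{
	uint32_t addr = base + offset;

	/* The sum must still fit in 19 bits. */
	assert(addr <= SKETCH_MADDR_MASK);
	return addr;
}

/* Assumption: bits 18..16 travel with the device-select part of the bus
 * address, so they are extracted separately. */
static uint8_t sketch_addr_hi(uint32_t addr)
{
	return (uint8_t)((addr >> 16) & 0x7u);
}

/* Assumption: bits 15..0 are sent as a two-byte in-device offset. */
static uint16_t sketch_addr_lo(uint32_t addr)
{
	return (uint16_t)(addr & 0xFFFFu);
}
```

With a made-up base of 0x40000 and an offset of 20, the flat address is 0x40014, the upper part is 4 and the lower part is 0x0014. The caller only ever adds an offset to the base; any splitting is left to the transfer helper.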

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 23/40] drm/amdgpu: Fix wrap-around bugs in RAS
  2021-06-08 21:39 ` [PATCH 23/40] drm/amdgpu: Fix wrap-around bugs in RAS Luben Tuikov
@ 2021-06-10 21:00   ` Alex Deucher
  0 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2021-06-10 21:00 UTC (permalink / raw)
  To: Luben Tuikov
  Cc: Andrey Grodzovsky, Lijo Lazar, amd-gfx list, Stanley Yang,
	Alexander Deucher, Jean Delvare, Hawking Zhang

On Tue, Jun 8, 2021 at 5:40 PM Luben Tuikov <luben.tuikov@amd.com> wrote:
>
> Fix the size of the EEPROM from 256000 bytes
> to 262144 bytes (256 KiB).
>
> Fix a couple of wrap-around bugs. If a valid
> value/address is 0 <= addr < size, the inverse of
> this inequality (barring negative values which
> make no sense here) is addr >= size. Fix this in
> the RAS code.
>
> Cc: Jean Delvare <jdelvare@suse.de>
> Cc: Alexander Deucher <Alexander.Deucher@amd.com>
> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
> Cc: Lijo Lazar <Lijo.Lazar@amd.com>
> Cc: Stanley Yang <Stanley.Yang@amd.com>
> Cc: Hawking Zhang <Hawking.Zhang@amd.com>
> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>

Acked-by: Alex Deucher <alexander.deucher@amd.com>
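
The inequality in the commit message can be illustrated with a small standalone sketch. It is not the driver code: the sketch_* names are hypothetical stand-ins, and only the numeric values (256 * 1024, 20, 24) are copied from the quoted diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins; values mirror the macros in the quoted diff. */
#define SKETCH_SIZE_BYTES   (256u * 1024u) /* 262144 bytes, i.e. 2 Mbit */
#define SKETCH_HEADER_SIZE  20u
#define SKETCH_RECORD_SIZE  24u
#define SKETCH_RECORD_START SKETCH_HEADER_SIZE

/* A byte address is valid when 0 <= addr < size. */
static bool sketch_addr_is_valid(uint32_t addr)
{
	return addr < SKETCH_SIZE_BYTES;
}

/* Same shape as the corrected check: wrap to the first record slot when
 * the next address would be >= size, which is the exact inverse of the
 * validity condition above. */
static uint32_t sketch_correct_dest_address(uint32_t curr)
{
	uint32_t next;

	assert(sketch_addr_is_valid(curr));
	next = curr + SKETCH_RECORD_SIZE;

	if (next >= SKETCH_SIZE_BYTES)
		return SKETCH_RECORD_START;

	return curr;
}
```

With the old ">" comparison, a next address of exactly 262144 would not have triggered the wrap, even though 262144 is the first invalid address.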

> ---
>  .../gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c    | 20 +++++++++----------
>  1 file changed, 10 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
> index f316fb11b16d9e..3ef38b90fc3a83 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
> @@ -52,12 +52,11 @@
>  /* Bad GPU tag ‘BADG’ */
>  #define EEPROM_TABLE_HDR_BAD 0x42414447
>
> -/* Assume 2 Mbit size */
> -#define EEPROM_SIZE_BYTES 256000
> -#define EEPROM_PAGE__SIZE_BYTES 256
> -#define EEPROM_HDR_START 0
> -#define EEPROM_RECORD_START (EEPROM_HDR_START + EEPROM_TABLE_HEADER_SIZE)
> -#define EEPROM_MAX_RECORD_NUM ((EEPROM_SIZE_BYTES - EEPROM_TABLE_HEADER_SIZE) / EEPROM_TABLE_RECORD_SIZE)
> +/* Assume 2-Mbit size */
> +#define EEPROM_SIZE_BYTES       (256 * 1024)
> +#define EEPROM_HDR_START        0
> +#define EEPROM_RECORD_START     (EEPROM_HDR_START + EEPROM_TABLE_HEADER_SIZE)
> +#define EEPROM_MAX_RECORD_NUM   ((EEPROM_SIZE_BYTES - EEPROM_TABLE_HEADER_SIZE) / EEPROM_TABLE_RECORD_SIZE)
>
>  #define to_amdgpu_device(x) (container_of(x, struct amdgpu_ras, eeprom_control))->adev
>
> @@ -402,9 +401,8 @@ static uint32_t __correct_eeprom_dest_address(uint32_t curr_address)
>         uint32_t next_address = curr_address + EEPROM_TABLE_RECORD_SIZE;
>
>         /* When all EEPROM memory used jump back to 0 address */
> -       if (next_address > EEPROM_SIZE_BYTES) {
> -               DRM_INFO("Reached end of EEPROM memory, jumping to 0 "
> -                        "and overriding old record");
> +       if (next_address >= EEPROM_SIZE_BYTES) {
> +               DRM_INFO("Reached end of EEPROM memory, wrap around to 0.");
>                 return EEPROM_RECORD_START;
>         }
>
> @@ -476,7 +474,9 @@ int amdgpu_ras_eeprom_process_recods(struct amdgpu_ras_eeprom_control *control,
>         }
>
>         /* In case of overflow just start from beginning to not lose newest records */
> -       if (write && (control->next_addr + EEPROM_TABLE_RECORD_SIZE * num > EEPROM_SIZE_BYTES))
> +       if (write &&
> +           (control->next_addr +
> +            EEPROM_TABLE_RECORD_SIZE * num >= EEPROM_SIZE_BYTES))
>                 control->next_addr = EEPROM_RECORD_START;
>
>         /*
> --
> 2.32.0
>

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 24/40] drm/amdgpu: I2C class is HWMON
  2021-06-08 21:39 ` [PATCH 24/40] drm/amdgpu: I2C class is HWMON Luben Tuikov
@ 2021-06-10 21:02   ` Alex Deucher
  0 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2021-06-10 21:02 UTC (permalink / raw)
  To: Luben Tuikov
  Cc: Andrey Grodzovsky, Lijo Lazar, amd-gfx list, Stanley Yang,
	Alexander Deucher, Jean Delvare, Hawking Zhang

On Tue, Jun 8, 2021 at 5:40 PM Luben Tuikov <luben.tuikov@amd.com> wrote:
>
> Set the auto-discoverable class of I2C bus to
> HWMON. Remove SPD.
>
> Cc: Jean Delvare <jdelvare@suse.de>
> Cc: Alexander Deucher <Alexander.Deucher@amd.com>
> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
> Cc: Lijo Lazar <Lijo.Lazar@amd.com>
> Cc: Stanley Yang <Stanley.Yang@amd.com>
> Cc: Hawking Zhang <Hawking.Zhang@amd.com>
> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
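
As a rough illustration of what dropping a class bit means, here is a minimal standalone sketch. It assumes the class field is a bitmask that is compared against the classes a client driver wants to probe; the sketch_* names and the bit values are hypothetical and are not the kernel's definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical class bits, for illustration only. */
#define SKETCH_CLASS_HWMON (1u << 0)
#define SKETCH_CLASS_SPD   (1u << 1)

/* Stand-in for the adapter; class_mask plays the role of control->class. */
struct sketch_adapter {
	unsigned int class_mask;
};

/* A driver is allowed to auto-probe the bus only when it shares at least
 * one class bit with the adapter (assumed behaviour). */
static bool sketch_driver_may_probe(const struct sketch_adapter *adap,
				    unsigned int driver_class)
{
	assert(adap);
	return (adap->class_mask & driver_class) != 0;
}
```

Under that assumption, an adapter that advertises only the HWMON bit is still visible to hardware-monitoring drivers, while drivers looking for SPD devices no longer probe it.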

> ---
>  drivers/gpu/drm/amd/amdgpu/smu_v11_0_i2c.c              | 2 +-
>  drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c       | 2 +-
>  drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c         | 2 +-
>  drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c | 2 +-
>  4 files changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/smu_v11_0_i2c.c b/drivers/gpu/drm/amd/amdgpu/smu_v11_0_i2c.c
> index b8d6d308fb06a0..e403ba556e5590 100644
> --- a/drivers/gpu/drm/amd/amdgpu/smu_v11_0_i2c.c
> +++ b/drivers/gpu/drm/amd/amdgpu/smu_v11_0_i2c.c
> @@ -667,7 +667,7 @@ int smu_v11_0_i2c_control_init(struct i2c_adapter *control)
>
>         mutex_init(&adev->pm.smu_i2c_mutex);
>         control->owner = THIS_MODULE;
> -       control->class = I2C_CLASS_SPD | I2C_CLASS_HWMON;
> +       control->class = I2C_CLASS_HWMON;
>         control->dev.parent = &adev->pdev->dev;
>         control->algo = &smu_v11_0_i2c_algo;
>         snprintf(control->name, sizeof(control->name), "AMDGPU SMU");
> diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c
> index c2d6d7c8129593..974740ac72fded 100644
> --- a/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c
> +++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c
> @@ -2016,7 +2016,7 @@ static int arcturus_i2c_control_init(struct smu_context *smu, struct i2c_adapter
>         int res;
>
>         control->owner = THIS_MODULE;
> -       control->class = I2C_CLASS_SPD | I2C_CLASS_HWMON;
> +       control->class = I2C_CLASS_HWMON;
>         control->dev.parent = &adev->pdev->dev;
>         control->algo = &arcturus_i2c_algo;
>         control->quirks = &arcturus_i2c_control_quirks;
> diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
> index 56000463f64e45..8ab06fa87edb04 100644
> --- a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
> +++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
> @@ -2810,7 +2810,7 @@ static int navi10_i2c_control_init(struct smu_context *smu, struct i2c_adapter *
>         int res;
>
>         control->owner = THIS_MODULE;
> -       control->class = I2C_CLASS_SPD | I2C_CLASS_HWMON;
> +       control->class = I2C_CLASS_HWMON;
>         control->dev.parent = &adev->pdev->dev;
>         control->algo = &navi10_i2c_algo;
>         snprintf(control->name, sizeof(control->name), "AMDGPU SMU");
> diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
> index 86804f3b0a951b..91614ae186f7f5 100644
> --- a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
> +++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
> @@ -3498,7 +3498,7 @@ static int sienna_cichlid_i2c_control_init(struct smu_context *smu, struct i2c_a
>         int res;
>
>         control->owner = THIS_MODULE;
> -       control->class = I2C_CLASS_SPD | I2C_CLASS_HWMON;
> +       control->class = I2C_CLASS_HWMON;
>         control->dev.parent = &adev->pdev->dev;
>         control->algo = &sienna_cichlid_i2c_algo;
>         snprintf(control->name, sizeof(control->name), "AMDGPU SMU");
> --
> 2.32.0
>

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 25/40] drm/amdgpu: RAS: EEPROM --> RAS
  2021-06-08 21:39 ` [PATCH 25/40] drm/amdgpu: RAS: EEPROM --> RAS Luben Tuikov
@ 2021-06-10 21:03   ` Alex Deucher
  0 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2021-06-10 21:03 UTC (permalink / raw)
  To: Luben Tuikov
  Cc: Andrey Grodzovsky, Lijo Lazar, amd-gfx list, Stanley Yang,
	Alexander Deucher, Jean Delvare, Hawking Zhang

On Tue, Jun 8, 2021 at 5:40 PM Luben Tuikov <luben.tuikov@amd.com> wrote:
>
> In amdgpu_ras_eeprom.c, the interface from RAS to
> EEPROM, rename macros from EEPROM to RAS to
> indicate that the quantities and objects are
> RAS-specific, not EEPROM-specific. We can shrink
> the RAS table, or place it at a different offset
> in the EEPROM, as needed in the future.
>
> Remove EEPROM_ADDRESS_SIZE macro definition, equal
> to 2, from the file and calculations, as that
> quantity is computed and added on the stack,
> in the lower layer, amdgpu_eeprom_xfer().
>
> Cc: Jean Delvare <jdelvare@suse.de>
> Cc: Alexander Deucher <Alexander.Deucher@amd.com>
> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
> Cc: Lijo Lazar <Lijo.Lazar@amd.com>
> Cc: Stanley Yang <Stanley.Yang@amd.com>
> Cc: Hawking Zhang <Hawking.Zhang@amd.com>
> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>


Acked-by: Alex Deucher <alexander.deucher@amd.com>
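
To make the renamed quantities concrete, the following standalone sketch works through the table arithmetic. The macro values are copied from the quoted diff; the SKETCH_ prefix and both helper functions are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins named after the macros in the diff; values copied from it. */
#define SKETCH_RAS_TBL_SIZE_BYTES    (256u * 1024u)
#define SKETCH_RAS_HDR_START         0u
#define SKETCH_RAS_TABLE_HEADER_SIZE 20u
#define SKETCH_RAS_TABLE_RECORD_SIZE 24u
#define SKETCH_RAS_RECORD_START \
	(SKETCH_RAS_HDR_START + SKETCH_RAS_TABLE_HEADER_SIZE)
#define SKETCH_RAS_MAX_RECORD_NUM \
	((SKETCH_RAS_TBL_SIZE_BYTES - SKETCH_RAS_TABLE_HEADER_SIZE) \
	 / SKETCH_RAS_TABLE_RECORD_SIZE)

/* Hypothetical helper: byte offset of record idx within the table. */
static uint32_t sketch_record_offset(uint32_t idx)
{
	assert(idx < SKETCH_RAS_MAX_RECORD_NUM);
	return SKETCH_RAS_RECORD_START + idx * SKETCH_RAS_TABLE_RECORD_SIZE;
}

/* Record count implied by a table-size field, using the same formula as
 * the init path in the diff. */
static uint32_t sketch_num_recs(uint32_t tbl_size)
{
	assert(tbl_size >= SKETCH_RAS_TABLE_HEADER_SIZE);
	return (tbl_size - SKETCH_RAS_TABLE_HEADER_SIZE) /
	       SKETCH_RAS_TABLE_RECORD_SIZE;
}
```

With these values the table holds (262144 - 20) / 24 = 10921 records, and the last record starts at byte 262100. Because the buffers are now sized in record bytes only, the two address bytes never appear in this arithmetic.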

> ---
>  .../gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c    | 103 +++++++++---------
>  1 file changed, 50 insertions(+), 53 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
> index 3ef38b90fc3a83..d3678706bb736d 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
> @@ -37,26 +37,25 @@
>  /*
>   * The 2 macros bellow represent the actual size in bytes that
>   * those entities occupy in the EEPROM memory.
> - * EEPROM_TABLE_RECORD_SIZE is different than sizeof(eeprom_table_record) which
> + * RAS_TABLE_RECORD_SIZE is different than sizeof(eeprom_table_record) which
>   * uses uint64 to store 6b fields such as retired_page.
>   */
> -#define EEPROM_TABLE_HEADER_SIZE 20
> -#define EEPROM_TABLE_RECORD_SIZE 24
> -
> -#define EEPROM_ADDRESS_SIZE 0x2
> +#define RAS_TABLE_HEADER_SIZE   20
> +#define RAS_TABLE_RECORD_SIZE   24
>
>  /* Table hdr is 'AMDR' */
> -#define EEPROM_TABLE_HDR_VAL 0x414d4452
> -#define EEPROM_TABLE_VER 0x00010000
> +#define RAS_TABLE_HDR_VAL       0x414d4452
> +#define RAS_TABLE_VER           0x00010000
>
>  /* Bad GPU tag ‘BADG’ */
> -#define EEPROM_TABLE_HDR_BAD 0x42414447
> +#define RAS_TABLE_HDR_BAD       0x42414447
>
> -/* Assume 2-Mbit size */
> -#define EEPROM_SIZE_BYTES       (256 * 1024)
> -#define EEPROM_HDR_START        0
> -#define EEPROM_RECORD_START     (EEPROM_HDR_START + EEPROM_TABLE_HEADER_SIZE)
> -#define EEPROM_MAX_RECORD_NUM   ((EEPROM_SIZE_BYTES - EEPROM_TABLE_HEADER_SIZE) / EEPROM_TABLE_RECORD_SIZE)
> +/* Assume 2-Mbit size EEPROM and take up the whole space. */
> +#define RAS_TBL_SIZE_BYTES      (256 * 1024)
> +#define RAS_HDR_START           0
> +#define RAS_RECORD_START        (RAS_HDR_START + RAS_TABLE_HEADER_SIZE)
> +#define RAS_MAX_RECORD_NUM      ((RAS_TBL_SIZE_BYTES - RAS_TABLE_HEADER_SIZE) \
> +                                / RAS_TABLE_RECORD_SIZE)
>
>  #define to_amdgpu_device(x) (container_of(x, struct amdgpu_ras, eeprom_control))->adev
>
> @@ -153,8 +152,8 @@ static int __update_table_header(struct amdgpu_ras_eeprom_control *control,
>         /* i2c may be unstable in gpu reset */
>         down_read(&adev->reset_sem);
>         ret = amdgpu_eeprom_xfer(&adev->pm.smu_i2c,
> -                                control->i2c_address + EEPROM_HDR_START,
> -                                buff, EEPROM_TABLE_HEADER_SIZE, false);
> +                                control->i2c_address + RAS_HDR_START,
> +                                buff, RAS_TABLE_HEADER_SIZE, false);
>         up_read(&adev->reset_sem);
>
>         if (ret < 1)
> @@ -236,11 +235,11 @@ static int amdgpu_ras_eeprom_correct_header_tag(
>                                 struct amdgpu_ras_eeprom_control *control,
>                                 uint32_t header)
>  {
> -       unsigned char buff[EEPROM_ADDRESS_SIZE + EEPROM_TABLE_HEADER_SIZE];
> +       unsigned char buff[RAS_TABLE_HEADER_SIZE];
>         struct amdgpu_ras_eeprom_table_header *hdr = &control->tbl_hdr;
>         int ret = 0;
>
> -       memset(buff, 0, EEPROM_ADDRESS_SIZE + EEPROM_TABLE_HEADER_SIZE);
> +       memset(buff, 0, RAS_TABLE_HEADER_SIZE);
>
>         mutex_lock(&control->tbl_mutex);
>         hdr->header = header;
> @@ -252,20 +251,20 @@ static int amdgpu_ras_eeprom_correct_header_tag(
>
>  int amdgpu_ras_eeprom_reset_table(struct amdgpu_ras_eeprom_control *control)
>  {
> -       unsigned char buff[EEPROM_ADDRESS_SIZE + EEPROM_TABLE_HEADER_SIZE] = { 0 };
> +       unsigned char buff[RAS_TABLE_HEADER_SIZE] = { 0 };
>         struct amdgpu_ras_eeprom_table_header *hdr = &control->tbl_hdr;
>         int ret = 0;
>
>         mutex_lock(&control->tbl_mutex);
>
> -       hdr->header = EEPROM_TABLE_HDR_VAL;
> -       hdr->version = EEPROM_TABLE_VER;
> -       hdr->first_rec_offset = EEPROM_RECORD_START;
> -       hdr->tbl_size = EEPROM_TABLE_HEADER_SIZE;
> +       hdr->header = RAS_TABLE_HDR_VAL;
> +       hdr->version = RAS_TABLE_VER;
> +       hdr->first_rec_offset = RAS_RECORD_START;
> +       hdr->tbl_size = RAS_TABLE_HEADER_SIZE;
>
>         control->tbl_byte_sum = 0;
>         __update_tbl_checksum(control, NULL, 0, 0);
> -       control->next_addr = EEPROM_RECORD_START;
> +       control->next_addr = RAS_RECORD_START;
>
>         ret = __update_table_header(control, buff);
>
> @@ -280,7 +279,7 @@ int amdgpu_ras_eeprom_init(struct amdgpu_ras_eeprom_control *control,
>  {
>         int ret = 0;
>         struct amdgpu_device *adev = to_amdgpu_device(control);
> -       unsigned char buff[EEPROM_TABLE_HEADER_SIZE] = { 0 };
> +       unsigned char buff[RAS_TABLE_HEADER_SIZE] = { 0 };
>         struct amdgpu_ras_eeprom_table_header *hdr = &control->tbl_hdr;
>         struct amdgpu_ras *ras = amdgpu_ras_get_context(adev);
>
> @@ -300,8 +299,8 @@ int amdgpu_ras_eeprom_init(struct amdgpu_ras_eeprom_control *control,
>
>         /* Read/Create table header from EEPROM address 0 */
>         ret = amdgpu_eeprom_xfer(&adev->pm.smu_i2c,
> -                                control->i2c_address + EEPROM_HDR_START,
> -                                buff, EEPROM_TABLE_HEADER_SIZE, true);
> +                                control->i2c_address + RAS_HDR_START,
> +                                buff, RAS_TABLE_HEADER_SIZE, true);
>         if (ret < 1) {
>                 DRM_ERROR("Failed to read EEPROM table header, ret:%d", ret);
>                 return ret;
> @@ -309,22 +308,22 @@ int amdgpu_ras_eeprom_init(struct amdgpu_ras_eeprom_control *control,
>
>         __decode_table_header_from_buff(hdr, &buff[2]);
>
> -       if (hdr->header == EEPROM_TABLE_HDR_VAL) {
> -               control->num_recs = (hdr->tbl_size - EEPROM_TABLE_HEADER_SIZE) /
> -                                   EEPROM_TABLE_RECORD_SIZE;
> +       if (hdr->header == RAS_TABLE_HDR_VAL) {
> +               control->num_recs = (hdr->tbl_size - RAS_TABLE_HEADER_SIZE) /
> +                                   RAS_TABLE_RECORD_SIZE;
>                 control->tbl_byte_sum = __calc_hdr_byte_sum(control);
> -               control->next_addr = EEPROM_RECORD_START;
> +               control->next_addr = RAS_RECORD_START;
>
>                 DRM_DEBUG_DRIVER("Found existing EEPROM table with %d records",
>                                  control->num_recs);
>
> -       } else if ((hdr->header == EEPROM_TABLE_HDR_BAD) &&
> +       } else if ((hdr->header == RAS_TABLE_HDR_BAD) &&
>                         (amdgpu_bad_page_threshold != 0)) {
>                 if (ras->bad_page_cnt_threshold > control->num_recs) {
>                         dev_info(adev->dev, "Using one valid bigger bad page "
>                                 "threshold and correcting eeprom header tag.\n");
>                         ret = amdgpu_ras_eeprom_correct_header_tag(control,
> -                                                       EEPROM_TABLE_HDR_VAL);
> +                                                       RAS_TABLE_HDR_VAL);
>                 } else {
>                         *exceed_err_limit = true;
>                         dev_err(adev->dev, "Exceeding the bad_page_threshold parameter, "
> @@ -398,12 +397,12 @@ static void __decode_table_record_from_buff(struct amdgpu_ras_eeprom_control *co
>   */
>  static uint32_t __correct_eeprom_dest_address(uint32_t curr_address)
>  {
> -       uint32_t next_address = curr_address + EEPROM_TABLE_RECORD_SIZE;
> +       u32 next_address = curr_address + RAS_TABLE_RECORD_SIZE;
>
>         /* When all EEPROM memory used jump back to 0 address */
> -       if (next_address >= EEPROM_SIZE_BYTES) {
> +       if (next_address >= RAS_TBL_SIZE_BYTES) {
>                 DRM_INFO("Reached end of EEPROM memory, wrap around to 0.");
> -               return EEPROM_RECORD_START;
> +               return RAS_RECORD_START;
>         }
>
>         return curr_address;
> @@ -411,7 +410,6 @@ static uint32_t __correct_eeprom_dest_address(uint32_t curr_address)
>
>  bool amdgpu_ras_eeprom_check_err_threshold(struct amdgpu_device *adev)
>  {
> -
>         struct amdgpu_ras *con = amdgpu_ras_get_context(adev);
>
>         if (!__is_ras_eeprom_supported(adev))
> @@ -424,7 +422,7 @@ bool amdgpu_ras_eeprom_check_err_threshold(struct amdgpu_device *adev)
>                 if (!(con->features & BIT(AMDGPU_RAS_BLOCK__UMC)))
>                         return false;
>
> -       if (con->eeprom_control.tbl_hdr.header == EEPROM_TABLE_HDR_BAD) {
> +       if (con->eeprom_control.tbl_hdr.header == RAS_TABLE_HDR_BAD) {
>                 dev_warn(adev->dev, "This GPU is in BAD status.");
>                 dev_warn(adev->dev, "Please retire it or setting one bigger "
>                                 "threshold value when reloading driver.\n");
> @@ -447,8 +445,7 @@ int amdgpu_ras_eeprom_process_recods(struct amdgpu_ras_eeprom_control *control,
>         if (!__is_ras_eeprom_supported(adev))
>                 return 0;
>
> -       buffs = kcalloc(num, EEPROM_ADDRESS_SIZE + EEPROM_TABLE_RECORD_SIZE,
> -                       GFP_KERNEL);
> +       buffs = kcalloc(num, RAS_TABLE_RECORD_SIZE, GFP_KERNEL);
>         if (!buffs)
>                 return -ENOMEM;
>
> @@ -470,14 +467,14 @@ int amdgpu_ras_eeprom_process_recods(struct amdgpu_ras_eeprom_control *control,
>                 dev_warn(adev->dev,
>                         "Saved bad pages(%d) reaches threshold value(%d).\n",
>                         control->num_recs + num, ras->bad_page_cnt_threshold);
> -               control->tbl_hdr.header = EEPROM_TABLE_HDR_BAD;
> +               control->tbl_hdr.header = RAS_TABLE_HDR_BAD;
>         }
>
>         /* In case of overflow just start from beginning to not lose newest records */
>         if (write &&
>             (control->next_addr +
> -            EEPROM_TABLE_RECORD_SIZE * num >= EEPROM_SIZE_BYTES))
> -               control->next_addr = EEPROM_RECORD_START;
> +            RAS_TABLE_RECORD_SIZE * num >= RAS_TBL_SIZE_BYTES))
> +               control->next_addr = RAS_RECORD_START;
>
>         /*
>          * TODO Currently makes EEPROM writes for each record, this creates
> @@ -485,7 +482,7 @@ int amdgpu_ras_eeprom_process_recods(struct amdgpu_ras_eeprom_control *control,
>          * 256b
>          */
>         for (i = 0; i < num; i++) {
> -               buff = &buffs[i * EEPROM_TABLE_RECORD_SIZE];
> +               buff = &buffs[i * RAS_TABLE_RECORD_SIZE];
>                 record = &records[i];
>
>                 control->next_addr = __correct_eeprom_dest_address(control->next_addr);
> @@ -498,7 +495,7 @@ int amdgpu_ras_eeprom_process_recods(struct amdgpu_ras_eeprom_control *control,
>                 down_read(&adev->reset_sem);
>                 ret = amdgpu_eeprom_xfer(&adev->pm.smu_i2c,
>                                          control->i2c_address + control->next_addr,
> -                                        buff, EEPROM_TABLE_RECORD_SIZE, !write);
> +                                        buff, RAS_TABLE_RECORD_SIZE, !write);
>                 up_read(&adev->reset_sem);
>
>                 if (ret < 1) {
> @@ -511,12 +508,12 @@ int amdgpu_ras_eeprom_process_recods(struct amdgpu_ras_eeprom_control *control,
>                  * The destination EEPROM address might need to be corrected to account
>                  * for page or entire memory wrapping
>                  */
> -               control->next_addr += EEPROM_TABLE_RECORD_SIZE;
> +               control->next_addr += RAS_TABLE_RECORD_SIZE;
>         }
>
>         if (!write) {
>                 for (i = 0; i < num; i++) {
> -                       buff = &buffs[i*EEPROM_TABLE_RECORD_SIZE];
> +                       buff = &buffs[i * RAS_TABLE_RECORD_SIZE];
>                         record = &records[i];
>
>                         __decode_table_record_from_buff(control, record, buff);
> @@ -534,11 +531,11 @@ int amdgpu_ras_eeprom_process_recods(struct amdgpu_ras_eeprom_control *control,
>                  * TODO - Check the assumption is correct
>                  */
>                 control->num_recs += num;
> -               control->num_recs %= EEPROM_MAX_RECORD_NUM;
> -               control->tbl_hdr.tbl_size += EEPROM_TABLE_RECORD_SIZE * num;
> -               if (control->tbl_hdr.tbl_size > EEPROM_SIZE_BYTES)
> -                       control->tbl_hdr.tbl_size = EEPROM_TABLE_HEADER_SIZE +
> -                       control->num_recs * EEPROM_TABLE_RECORD_SIZE;
> +               control->num_recs %= RAS_MAX_RECORD_NUM;
> +               control->tbl_hdr.tbl_size += RAS_TABLE_RECORD_SIZE * num;
> +               if (control->tbl_hdr.tbl_size > RAS_TBL_SIZE_BYTES)
> +                       control->tbl_hdr.tbl_size = RAS_TABLE_HEADER_SIZE +
> +                       control->num_recs * RAS_TABLE_RECORD_SIZE;
>
>                 __update_tbl_checksum(control, records, num, old_hdr_byte_sum);
>
> @@ -559,7 +556,7 @@ int amdgpu_ras_eeprom_process_recods(struct amdgpu_ras_eeprom_control *control,
>
>  inline uint32_t amdgpu_ras_eeprom_get_record_max_length(void)
>  {
> -       return EEPROM_MAX_RECORD_NUM;
> +       return RAS_MAX_RECORD_NUM;
>  }
>
>  /* Used for testing if bugs encountered */
> @@ -581,7 +578,7 @@ void amdgpu_ras_eeprom_test(struct amdgpu_ras_eeprom_control *control)
>
>                 memset(recs, 0, sizeof(*recs) * 1);
>
> -               control->next_addr = EEPROM_RECORD_START;
> +               control->next_addr = RAS_RECORD_START;
>
>                 if (!amdgpu_ras_eeprom_process_recods(control, recs, false, 1)) {
>                         for (i = 0; i < 1; i++)
> --
> 2.32.0
>

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 26/40] drm/amdgpu: Rename misspelled function
  2021-06-08 21:39 ` [PATCH 26/40] drm/amdgpu: Rename misspelled function Luben Tuikov
@ 2021-06-10 21:04   ` Alex Deucher
  0 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2021-06-10 21:04 UTC (permalink / raw)
  To: Luben Tuikov
  Cc: Andrey Grodzovsky, Lijo Lazar, amd-gfx list, Stanley Yang,
	Alexander Deucher, Jean Delvare, Hawking Zhang

On Tue, Jun 8, 2021 at 5:40 PM Luben Tuikov <luben.tuikov@amd.com> wrote:
>
> Instead of fixing the spelling in
>   amdgpu_ras_eeprom_process_recods(),
> rename it to
>   amdgpu_ras_eeprom_xfer(),
> so that it looks similar to other I2C and protocol
> transfer (read/write) functions.
>
> The shorter name also keeps the column span
> within reason.
>
> Change the "num" function parameter from "int" to
> "const u32" since it is the number of items
> (records) to xfer, i.e. their count, which cannot
> be a negative number.
>
> Also swap the order of parameters, keeping the
> pointer to records and their number next to each
> other, while the direction now becomes the last
> parameter.
>
> Cc: Jean Delvare <jdelvare@suse.de>
> Cc: Alexander Deucher <Alexander.Deucher@amd.com>
> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
> Cc: Lijo Lazar <Lijo.Lazar@amd.com>
> Cc: Stanley Yang <Stanley.Yang@amd.com>
> Cc: Hawking Zhang <Hawking.Zhang@amd.com>
> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>

Acked-by: Alex Deucher <alexander.deucher@amd.com>

> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c        | 11 +++++------
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c | 10 +++++-----
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h |  7 +++----
>  3 files changed, 13 insertions(+), 15 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> index ec936cde272602..beaa1fee7f71f3 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> @@ -1817,10 +1817,10 @@ int amdgpu_ras_save_bad_pages(struct amdgpu_device *adev)
>         save_count = data->count - control->num_recs;
>         /* only new entries are saved */
>         if (save_count > 0) {
> -               if (amdgpu_ras_eeprom_process_recods(control,
> -                                                       &data->bps[control->num_recs],
> -                                                       true,
> -                                                       save_count)) {
> +               if (amdgpu_ras_eeprom_xfer(control,
> +                                          &data->bps[control->num_recs],
> +                                          save_count,
> +                                          true)) {
>                         dev_err(adev->dev, "Failed to save EEPROM table data!");
>                         return -EIO;
>                 }
> @@ -1850,8 +1850,7 @@ static int amdgpu_ras_load_bad_pages(struct amdgpu_device *adev)
>         if (!bps)
>                 return -ENOMEM;
>
> -       if (amdgpu_ras_eeprom_process_recods(control, bps, false,
> -               control->num_recs)) {
> +       if (amdgpu_ras_eeprom_xfer(control, bps, control->num_recs, false)) {
>                 dev_err(adev->dev, "Failed to load EEPROM table records!");
>                 ret = -EIO;
>                 goto out;
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
> index d3678706bb736d..9e3fbc44b4bc4a 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
> @@ -432,9 +432,9 @@ bool amdgpu_ras_eeprom_check_err_threshold(struct amdgpu_device *adev)
>         return false;
>  }
>
> -int amdgpu_ras_eeprom_process_recods(struct amdgpu_ras_eeprom_control *control,
> -                                    struct eeprom_table_record *records,
> -                                    bool write, int num)
> +int amdgpu_ras_eeprom_xfer(struct amdgpu_ras_eeprom_control *control,
> +                          struct eeprom_table_record *records,
> +                          const u32 num, bool write)
>  {
>         int i, ret = 0;
>         unsigned char *buffs, *buff;
> @@ -574,13 +574,13 @@ void amdgpu_ras_eeprom_test(struct amdgpu_ras_eeprom_control *control)
>                 recs[i].retired_page = i;
>         }
>
> -       if (!amdgpu_ras_eeprom_process_recods(control, recs, true, 1)) {
> +       if (!amdgpu_ras_eeprom_xfer(control, recs, 1, true)) {
>
>                 memset(recs, 0, sizeof(*recs) * 1);
>
>                 control->next_addr = RAS_RECORD_START;
>
> -               if (!amdgpu_ras_eeprom_process_recods(control, recs, false, 1)) {
> +               if (!amdgpu_ras_eeprom_xfer(control, recs, 1, false)) {
>                         for (i = 0; i < 1; i++)
>                                 DRM_INFO("rec.address :0x%llx, rec.retired_page :%llu",
>                                          recs[i].address, recs[i].retired_page);
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h
> index 4c4c3d840a35c5..6a1bd527bce57a 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h
> @@ -82,10 +82,9 @@ int amdgpu_ras_eeprom_reset_table(struct amdgpu_ras_eeprom_control *control);
>
>  bool amdgpu_ras_eeprom_check_err_threshold(struct amdgpu_device *adev);
>
> -int amdgpu_ras_eeprom_process_recods(struct amdgpu_ras_eeprom_control *control,
> -                                           struct eeprom_table_record *records,
> -                                           bool write,
> -                                           int num);
> +int amdgpu_ras_eeprom_xfer(struct amdgpu_ras_eeprom_control *control,
> +                          struct eeprom_table_record *records,
> +                          const u32 num, bool write);
>
>  inline uint32_t amdgpu_ras_eeprom_get_record_max_length(void);
>
> --
> 2.32.0
>
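A small sketch of why the caller still has to check the sign before
handing a count to a "const u32" parameter: C converts a negative int
to a large unsigned value at the call instead of rejecting it. The
names below (struct rec, xfer_records(), save_new_records()) are
hypothetical stand-ins, not the driver's functions; only the
"save_count > 0" guard mirrors the one quoted above in
amdgpu_ras_save_bad_pages().

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical stand-in for a record; not the driver's eeprom_table_record. */
struct rec {
	uint64_t address;
};

/*
 * Hypothetical transfer routine: pointer and count sit next to each
 * other and the direction comes last, as in the renamed function.
 * It only reports how many records it was asked to move.
 */
uint32_t xfer_records(struct rec *recs, const uint32_t num, int write)
{
	(void)recs;
	(void)write;
	return num;
}

/*
 * Caller-side guard: the difference is computed as a signed int, so it
 * is checked before being converted to the unsigned parameter.
 * Returns 0 when there is nothing new to save.
 */
uint32_t save_new_records(struct rec *recs, int total, int already_saved)
{
	int save_count = total - already_saved;

	if (save_count <= 0)
		return 0;

	return xfer_records(recs + already_saved, (uint32_t)save_count, 1);
}
```

Without the guard, a negative difference would reach the callee as a
count close to UINT32_MAX, so the unsigned type documents intent but
does not by itself validate the value.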


* Re: [PATCH 27/40] drm/amdgpu: RAS xfer to read/write
  2021-06-08 21:39 ` [PATCH 27/40] drm/amdgpu: RAS xfer to read/write Luben Tuikov
@ 2021-06-10 21:05   ` Alex Deucher
  0 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2021-06-10 21:05 UTC (permalink / raw)
  To: Luben Tuikov
  Cc: Andrey Grodzovsky, Lijo Lazar, amd-gfx list, Stanley Yang,
	Alexander Deucher, Jean Delvare, Hawking Zhang

On Tue, Jun 8, 2021 at 5:40 PM Luben Tuikov <luben.tuikov@amd.com> wrote:
>
> Wrap amdgpu_ras_eeprom_xfer(..., bool write)
> in amdgpu_ras_eeprom_read() and
> amdgpu_ras_eeprom_write(), as that makes the
> code easier to read and understand.
>
> Cc: Jean Delvare <jdelvare@suse.de>
> Cc: Alexander Deucher <Alexander.Deucher@amd.com>
> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
> Cc: Lijo Lazar <Lijo.Lazar@amd.com>
> Cc: Stanley Yang <Stanley.Yang@amd.com>
> Cc: Hawking Zhang <Hawking.Zhang@amd.com>
> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>

Acked-by: Alex Deucher <alexander.deucher@amd.com>

> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c       |  9 ++++---
>  .../gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c    | 24 +++++++++++++++----
>  .../gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h    |  8 ++++---
>  3 files changed, 28 insertions(+), 13 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> index beaa1fee7f71f3..e3ad081eddd40b 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> @@ -1817,10 +1817,9 @@ int amdgpu_ras_save_bad_pages(struct amdgpu_device *adev)
>         save_count = data->count - control->num_recs;
>         /* only new entries are saved */
>         if (save_count > 0) {
> -               if (amdgpu_ras_eeprom_xfer(control,
> -                                          &data->bps[control->num_recs],
> -                                          save_count,
> -                                          true)) {
> +               if (amdgpu_ras_eeprom_write(control,
> +                                           &data->bps[control->num_recs],
> +                                           save_count)) {
>                         dev_err(adev->dev, "Failed to save EEPROM table data!");
>                         return -EIO;
>                 }
> @@ -1850,7 +1849,7 @@ static int amdgpu_ras_load_bad_pages(struct amdgpu_device *adev)
>         if (!bps)
>                 return -ENOMEM;
>
> -       if (amdgpu_ras_eeprom_xfer(control, bps, control->num_recs, false)) {
> +       if (amdgpu_ras_eeprom_read(control, bps, control->num_recs)) {
>                 dev_err(adev->dev, "Failed to load EEPROM table records!");
>                 ret = -EIO;
>                 goto out;
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
> index 9e3fbc44b4bc4a..550a31953d2da1 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
> @@ -432,9 +432,9 @@ bool amdgpu_ras_eeprom_check_err_threshold(struct amdgpu_device *adev)
>         return false;
>  }
>
> -int amdgpu_ras_eeprom_xfer(struct amdgpu_ras_eeprom_control *control,
> -                          struct eeprom_table_record *records,
> -                          const u32 num, bool write)
> +static int amdgpu_ras_eeprom_xfer(struct amdgpu_ras_eeprom_control *control,
> +                                 struct eeprom_table_record *records,
> +                                 const u32 num, bool write)
>  {
>         int i, ret = 0;
>         unsigned char *buffs, *buff;
> @@ -554,6 +554,20 @@ int amdgpu_ras_eeprom_xfer(struct amdgpu_ras_eeprom_control *control,
>         return ret == num ? 0 : -EIO;
>  }
>
> +int amdgpu_ras_eeprom_read(struct amdgpu_ras_eeprom_control *control,
> +                          struct eeprom_table_record *records,
> +                          const u32 num)
> +{
> +       return amdgpu_ras_eeprom_xfer(control, records, num, false);
> +}
> +
> +int amdgpu_ras_eeprom_write(struct amdgpu_ras_eeprom_control *control,
> +                           struct eeprom_table_record *records,
> +                           const u32 num)
> +{
> +       return amdgpu_ras_eeprom_xfer(control, records, num, true);
> +}
> +
>  inline uint32_t amdgpu_ras_eeprom_get_record_max_length(void)
>  {
>         return RAS_MAX_RECORD_NUM;
> @@ -574,13 +588,13 @@ void amdgpu_ras_eeprom_test(struct amdgpu_ras_eeprom_control *control)
>                 recs[i].retired_page = i;
>         }
>
> -       if (!amdgpu_ras_eeprom_xfer(control, recs, 1, true)) {
> +       if (!amdgpu_ras_eeprom_write(control, recs, 1)) {
>
>                 memset(recs, 0, sizeof(*recs) * 1);
>
>                 control->next_addr = RAS_RECORD_START;
>
> -               if (!amdgpu_ras_eeprom_xfer(control, recs, 1, false)) {
> +               if (!amdgpu_ras_eeprom_read(control, recs, 1)) {
>                         for (i = 0; i < 1; i++)
>                                 DRM_INFO("rec.address :0x%llx, rec.retired_page :%llu",
>                                          recs[i].address, recs[i].retired_page);
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h
> index 6a1bd527bce57a..fa9c509a8e2f2b 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h
> @@ -82,9 +82,11 @@ int amdgpu_ras_eeprom_reset_table(struct amdgpu_ras_eeprom_control *control);
>
>  bool amdgpu_ras_eeprom_check_err_threshold(struct amdgpu_device *adev);
>
> -int amdgpu_ras_eeprom_xfer(struct amdgpu_ras_eeprom_control *control,
> -                          struct eeprom_table_record *records,
> -                          const u32 num, bool write);
> +int amdgpu_ras_eeprom_read(struct amdgpu_ras_eeprom_control *control,
> +                          struct eeprom_table_record *records, const u32 num);
> +
> +int amdgpu_ras_eeprom_write(struct amdgpu_ras_eeprom_control *control,
> +                           struct eeprom_table_record *records, const u32 num);
>
>  inline uint32_t amdgpu_ras_eeprom_get_record_max_length(void);
>
> --
> 2.32.0
>
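The pattern in this patch, one file-local worker that takes a direction
flag plus two public wrappers that name the direction, can be sketched
in isolation as follows. This is a minimal model against an in-memory
buffer; struct fake_eeprom, FAKE_EEPROM_SIZE and the fake_eeprom_*()
functions are made up for the illustration and are not part of amdgpu.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define FAKE_EEPROM_SIZE 64	/* arbitrary size for the illustration */

/* Hypothetical in-memory stand-in for the device. */
struct fake_eeprom {
	uint8_t mem[FAKE_EEPROM_SIZE];
};

/* Single worker that takes the direction as a flag; kept file-local. */
static int fake_eeprom_xfer(struct fake_eeprom *dev, uint32_t addr,
			    uint8_t *buf, uint32_t len, bool write)
{
	/* Reject ranges that would run past the end of the buffer. */
	if (addr > FAKE_EEPROM_SIZE || len > FAKE_EEPROM_SIZE - addr)
		return -1;

	if (write)
		memcpy(&dev->mem[addr], buf, len);
	else
		memcpy(buf, &dev->mem[addr], len);

	return 0;
}

/* Call sites name the direction instead of passing true/false. */
int fake_eeprom_read(struct fake_eeprom *dev, uint32_t addr,
		     uint8_t *buf, uint32_t len)
{
	return fake_eeprom_xfer(dev, addr, buf, len, false);
}

int fake_eeprom_write(struct fake_eeprom *dev, uint32_t addr,
		      uint8_t *buf, uint32_t len)
{
	return fake_eeprom_xfer(dev, addr, buf, len, true);
}
```

Making the worker static, as the patch does, means a stray boolean can
only be passed wrongly inside the one file that defines the wrappers.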


* Re: [PATCH 28/40] drm/amdgpu: EEPROM: add explicit read and write
  2021-06-08 21:39 ` [PATCH 28/40] drm/amdgpu: EEPROM: add explicit read and write Luben Tuikov
@ 2021-06-10 21:06   ` Alex Deucher
  0 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2021-06-10 21:06 UTC (permalink / raw)
  To: Luben Tuikov
  Cc: Andrey Grodzovsky, Lijo Lazar, amd-gfx list, Stanley Yang,
	Alexander Deucher, Jean Delvare, Hawking Zhang

On Tue, Jun 8, 2021 at 5:40 PM Luben Tuikov <luben.tuikov@amd.com> wrote:
>
> Add explicit amdgpu_eeprom_read() and
> amdgpu_eeprom_write() for clarity.
>
> Cc: Jean Delvare <jdelvare@suse.de>
> Cc: Alexander Deucher <Alexander.Deucher@amd.com>
> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
> Cc: Lijo Lazar <Lijo.Lazar@amd.com>
> Cc: Stanley Yang <Stanley.Yang@amd.com>
> Cc: Hawking Zhang <Hawking.Zhang@amd.com>
> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>

Acked-by: Alex Deucher <alexander.deucher@amd.com>

> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.h     | 16 ++++++++++++++++
>  drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c |  5 ++---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c | 10 +++++-----
>  3 files changed, 23 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.h
> index 417472be2712e6..966b434f0de2b7 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.h
> @@ -29,4 +29,20 @@
>  int amdgpu_eeprom_xfer(struct i2c_adapter *i2c_adap, u32 eeprom_addr,
>                        u8 *eeprom_buf, u16 bytes, bool read);
>
> +static inline int amdgpu_eeprom_read(struct i2c_adapter *i2c_adap,
> +                                    u32 eeprom_addr, u8 *eeprom_buf,
> +                                    u16 bytes)
> +{
> +       return amdgpu_eeprom_xfer(i2c_adap, eeprom_addr, eeprom_buf, bytes,
> +                                 true);
> +}
> +
> +static inline int amdgpu_eeprom_write(struct i2c_adapter *i2c_adap,
> +                                     u32 eeprom_addr, u8 *eeprom_buf,
> +                                     u16 bytes)
> +{
> +       return amdgpu_eeprom_xfer(i2c_adap, eeprom_addr, eeprom_buf, bytes,
> +                                 false);
> +}
> +
>  #endif
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c
> index 69b9559f840ac3..7709caeb233d67 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c
> @@ -66,7 +66,7 @@ static int amdgpu_fru_read_eeprom(struct amdgpu_device *adev, uint32_t addrptr,
>  {
>         int ret, size;
>
> -       ret = amdgpu_eeprom_xfer(&adev->pm.smu_i2c, addrptr, buff, 1, true);
> +       ret = amdgpu_eeprom_read(&adev->pm.smu_i2c, addrptr, buff, 1);
>         if (ret < 1) {
>                 DRM_WARN("FRU: Failed to get size field");
>                 return ret;
> @@ -77,8 +77,7 @@ static int amdgpu_fru_read_eeprom(struct amdgpu_device *adev, uint32_t addrptr,
>          */
>         size = buff[0] - I2C_PRODUCT_INFO_OFFSET;
>
> -       ret = amdgpu_eeprom_xfer(&adev->pm.smu_i2c, addrptr + 1, buff, size,
> -                                true);
> +       ret = amdgpu_eeprom_read(&adev->pm.smu_i2c, addrptr + 1, buff, size);
>         if (ret < 1) {
>                 DRM_WARN("FRU: Failed to get data field");
>                 return ret;
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
> index 550a31953d2da1..17cea35275e46c 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
> @@ -151,9 +151,9 @@ static int __update_table_header(struct amdgpu_ras_eeprom_control *control,
>
>         /* i2c may be unstable in gpu reset */
>         down_read(&adev->reset_sem);
> -       ret = amdgpu_eeprom_xfer(&adev->pm.smu_i2c,
> -                                control->i2c_address + RAS_HDR_START,
> -                                buff, RAS_TABLE_HEADER_SIZE, false);
> +       ret = amdgpu_eeprom_write(&adev->pm.smu_i2c,
> +                                 control->i2c_address + RAS_HDR_START,
> +                                 buff, RAS_TABLE_HEADER_SIZE);
>         up_read(&adev->reset_sem);
>
>         if (ret < 1)
> @@ -298,9 +298,9 @@ int amdgpu_ras_eeprom_init(struct amdgpu_ras_eeprom_control *control,
>         mutex_init(&control->tbl_mutex);
>
>         /* Read/Create table header from EEPROM address 0 */
> -       ret = amdgpu_eeprom_xfer(&adev->pm.smu_i2c,
> +       ret = amdgpu_eeprom_read(&adev->pm.smu_i2c,
>                                  control->i2c_address + RAS_HDR_START,
> -                                buff, RAS_TABLE_HEADER_SIZE, true);
> +                                buff, RAS_TABLE_HEADER_SIZE);
>         if (ret < 1) {
>                 DRM_ERROR("Failed to read EEPROM table header, ret:%d", ret);
>                 return ret;
> --
> 2.32.0
>
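One detail visible in the quoted diffs is that the two layers use
opposite flag polarity: amdgpu_eeprom_xfer() takes "bool read", while
amdgpu_ras_eeprom_xfer() takes "bool write". The toy model below, with
hypothetical low_*/high_* names that do not exist in the driver, shows
how the same literal means different things at each layer and how named
wrappers pin the meaning down.

```c
#include <stdbool.h>

enum dir { DIR_READ, DIR_WRITE };

/* Hypothetical lower-level worker: the flag means "read". */
enum dir low_xfer(bool read)
{
	return read ? DIR_READ : DIR_WRITE;
}

/* Hypothetical upper-level worker: the flag means "write". */
enum dir high_xfer(bool write)
{
	return write ? DIR_WRITE : DIR_READ;
}

/* Named wrappers fix the polarity in one place per layer. */
enum dir low_read(void)   { return low_xfer(true); }
enum dir low_write(void)  { return low_xfer(false); }
enum dir high_read(void)  { return high_xfer(false); }
enum dir high_write(void) { return high_xfer(true); }
```

With the wrappers in place, a reader of a call site no longer needs to
remember which layer treats "true" as a read.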


* Re: [PATCH 29/40] drm/amd/pm: Extend the I2C quirk table
  2021-06-08 21:39 ` [PATCH 29/40] drm/amd/pm: Extend the I2C quirk table Luben Tuikov
@ 2021-06-10 21:07   ` Alex Deucher
  0 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2021-06-10 21:07 UTC (permalink / raw)
  To: Luben Tuikov
  Cc: Andrey Grodzovsky, Lijo Lazar, amd-gfx list, Stanley Yang,
	Alexander Deucher, Jean Delvare, Hawking Zhang

On Tue, Jun 8, 2021 at 5:40 PM Luben Tuikov <luben.tuikov@amd.com> wrote:
>
> Extend the I2C quirk table for SMU access
> controlled I2C adapters. Let the kernel I2C layer
> check that the messages all have the same address,
> and that their combined size doesn't exceed the
> maximum size of a SMU software I2C request.
>
> Suggested-by: Jean Delvare <jdelvare@suse.de>
> Cc: Jean Delvare <jdelvare@suse.de>
> Cc: Alexander Deucher <Alexander.Deucher@amd.com>
> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
> Cc: Lijo Lazar <Lijo.Lazar@amd.com>
> Cc: Stanley Yang <Stanley.Yang@amd.com>
> Cc: Hawking Zhang <Hawking.Zhang@amd.com>
> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>

Acked-by: Alex Deucher <alexander.deucher@amd.com>

> ---
>  drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c       | 5 ++++-
>  drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c         | 5 ++++-
>  drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c | 5 ++++-
>  3 files changed, 12 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c
> index 974740ac72fded..de8d7513042966 100644
> --- a/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c
> +++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c
> @@ -2006,8 +2006,11 @@ static const struct i2c_algorithm arcturus_i2c_algo = {
>
>
>  static const struct i2c_adapter_quirks arcturus_i2c_control_quirks = {
> -       .max_read_len = MAX_SW_I2C_COMMANDS,
> +       .flags = I2C_AQ_COMB | I2C_AQ_COMB_SAME_ADDR,
> +       .max_read_len  = MAX_SW_I2C_COMMANDS,
>         .max_write_len = MAX_SW_I2C_COMMANDS,
> +       .max_comb_1st_msg_len = 2,
> +       .max_comb_2nd_msg_len = MAX_SW_I2C_COMMANDS - 2,
>  };
>
>  static int arcturus_i2c_control_init(struct smu_context *smu, struct i2c_adapter *control)
> diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
> index 8ab06fa87edb04..1b8cd3746d0ebc 100644
> --- a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
> +++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
> @@ -2800,8 +2800,11 @@ static const struct i2c_algorithm navi10_i2c_algo = {
>  };
>
>  static const struct i2c_adapter_quirks navi10_i2c_control_quirks = {
> -       .max_read_len = MAX_SW_I2C_COMMANDS,
> +       .flags = I2C_AQ_COMB | I2C_AQ_COMB_SAME_ADDR,
> +       .max_read_len  = MAX_SW_I2C_COMMANDS,
>         .max_write_len = MAX_SW_I2C_COMMANDS,
> +       .max_comb_1st_msg_len = 2,
> +       .max_comb_2nd_msg_len = MAX_SW_I2C_COMMANDS - 2,
>  };
>
>  static int navi10_i2c_control_init(struct smu_context *smu, struct i2c_adapter *control)
> diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
> index 91614ae186f7f5..b38127f8009d3d 100644
> --- a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
> +++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
> @@ -3488,8 +3488,11 @@ static const struct i2c_algorithm sienna_cichlid_i2c_algo = {
>  };
>
>  static const struct i2c_adapter_quirks sienna_cichlid_i2c_control_quirks = {
> -       .max_read_len = MAX_SW_I2C_COMMANDS,
> +       .flags = I2C_AQ_COMB | I2C_AQ_COMB_SAME_ADDR,
> +       .max_read_len  = MAX_SW_I2C_COMMANDS,
>         .max_write_len = MAX_SW_I2C_COMMANDS,
> +       .max_comb_1st_msg_len = 2,
> +       .max_comb_2nd_msg_len = MAX_SW_I2C_COMMANDS - 2,
>  };
>
>  static int sienna_cichlid_i2c_control_init(struct smu_context *smu, struct i2c_adapter *control)
> --
> 2.32.0
>
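To make the effect of the new quirk fields concrete, here is a
simplified user-space model of the kind of check they enable. It is a
sketch only and not the I2C core's code: struct demo_msg, struct
demo_quirks and demo_check() are invented for the example, and
DEMO_MAX_CMDS is a placeholder for MAX_SW_I2C_COMMANDS, whose real
value is not shown in this thread.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_MAX_CMDS 8		/* placeholder for MAX_SW_I2C_COMMANDS */
#define DEMO_M_RD     0x0001	/* same value as the kernel's I2C_M_RD */

/* Reduced message: only the fields the checks look at. */
struct demo_msg {
	uint16_t addr;
	uint16_t flags;
	uint16_t len;
};

/* Reduced quirk set, mirroring the fields the patch fills in. */
struct demo_quirks {
	bool     comb_same_addr;
	uint16_t max_read_len;
	uint16_t max_write_len;
	uint16_t max_comb_1st_msg_len;
	uint16_t max_comb_2nd_msg_len;
};

const struct demo_quirks demo_smu_quirks = {
	.comb_same_addr       = true,
	.max_read_len         = DEMO_MAX_CMDS,
	.max_write_len        = DEMO_MAX_CMDS,
	.max_comb_1st_msg_len = 2,
	.max_comb_2nd_msg_len = DEMO_MAX_CMDS - 2,
};

/* Returns true when the transfer would pass this simplified check. */
bool demo_check(const struct demo_quirks *q, const struct demo_msg *m, int num)
{
	if (num < 1 || num > 2)	/* combined transfers: at most two messages */
		return false;

	if (num == 2) {
		if (q->comb_same_addr && m[0].addr != m[1].addr)
			return false;
		return m[0].len <= q->max_comb_1st_msg_len &&
		       m[1].len <= q->max_comb_2nd_msg_len;
	}

	if (m[0].flags & DEMO_M_RD)
		return m[0].len <= q->max_read_len;
	return m[0].len <= q->max_write_len;
}
```

The two combined limits add up to DEMO_MAX_CMDS, which is how the table
expresses "the combined size fits in one SMU request" for a 2-byte
address write followed by a data message.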


* Re: [PATCH 30/40] drm/amd/pm: Simplify managed I2C transfer functions
  2021-06-08 21:39 ` [PATCH 30/40] drm/amd/pm: Simplify managed I2C transfer functions Luben Tuikov
@ 2021-06-10 21:08   ` Alex Deucher
  0 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2021-06-10 21:08 UTC (permalink / raw)
  To: Luben Tuikov
  Cc: Andrey Grodzovsky, Lijo Lazar, amd-gfx list, Stanley Yang,
	Alexander Deucher, Jean Delvare, Hawking Zhang

On Tue, Jun 8, 2021 at 5:41 PM Luben Tuikov <luben.tuikov@amd.com> wrote:
>
> Now that we have an I2C quirk table for
> SMU-managed I2C controllers, the I2C core does the
> checks for us, so we don't need to do them, and so
> simplify the managed I2C transfer functions.
>
> Also, for Arcturus and Navi10, fix setting the
> command type from "cmd->CmdConfig" to "cmd->Cmd".
> The latter appears to be the field that takes
> the I2C_CMD_... enumeration as an integer,
> not as a bit-flag.
>
> For Sienna, the "Cmd" field seems to have been
> eliminated, and command type and flags all live in
> the "CmdConfig" field--this is left untouched.
>
> Fix: Detect a change of direction and set a
> bit-flag for it, as this is necessary for the SMU
> to detect the direction change in the 1-d array
> of data it gets.
>
> Cc: Jean Delvare <jdelvare@suse.de>
> Cc: Alexander Deucher <Alexander.Deucher@amd.com>
> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
> Cc: Lijo Lazar <Lijo.Lazar@amd.com>
> Cc: Stanley Yang <Stanley.Yang@amd.com>
> Cc: Hawking Zhang <Hawking.Zhang@amd.com>
> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>

Acked-by: Alex Deucher <alexander.deucher@amd.com>

> ---
>  .../gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c | 78 ++++++++-----------
>  .../gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c   | 78 ++++++++-----------
>  .../amd/pm/swsmu/smu11/sienna_cichlid_ppt.c   | 76 ++++++++----------
>  3 files changed, 95 insertions(+), 137 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c
> index de8d7513042966..0db79a5236e1f1 100644
> --- a/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c
> +++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c
> @@ -1907,31 +1907,14 @@ static int arcturus_dpm_set_vcn_enable(struct smu_context *smu, bool enable)
>  }
>
>  static int arcturus_i2c_xfer(struct i2c_adapter *i2c_adap,
> -                            struct i2c_msg *msgs, int num)
> +                            struct i2c_msg *msg, int num_msgs)
>  {
>         struct amdgpu_device *adev = to_amdgpu_device(i2c_adap);
>         struct smu_table_context *smu_table = &adev->smu.smu_table;
>         struct smu_table *table = &smu_table->driver_table;
>         SwI2cRequest_t *req, *res = (SwI2cRequest_t *)table->cpu_addr;
> -       short available_bytes = MAX_SW_I2C_COMMANDS;
> -       int i, j, r, c, num_done = 0;
> -       u8 slave;
> -
> -       /* only support a single slave addr per transaction */
> -       slave = msgs[0].addr;
> -       for (i = 0; i < num; i++) {
> -               if (slave != msgs[i].addr)
> -                       return -EINVAL;
> -
> -               available_bytes -= msgs[i].len;
> -               if (available_bytes >= 0) {
> -                       num_done++;
> -               } else {
> -                       /* This message and all the follwing won't be processed */
> -                       available_bytes += msgs[i].len;
> -                       break;
> -               }
> -       }
> +       int i, j, r, c;
> +       u16 dir;
>
>         req = kzalloc(sizeof(*req), GFP_KERNEL);
>         if (!req)
> @@ -1939,33 +1922,38 @@ static int arcturus_i2c_xfer(struct i2c_adapter *i2c_adap,
>
>         req->I2CcontrollerPort = 1;
>         req->I2CSpeed = I2C_SPEED_FAST_400K;
> -       req->SlaveAddress = slave << 1; /* 8 bit addresses */
> -       req->NumCmds = MAX_SW_I2C_COMMANDS - available_bytes;;
> -
> -       c = 0;
> -       for (i = 0; i < num_done; i++) {
> -               struct i2c_msg *msg = &msgs[i];
> +       req->SlaveAddress = msg[0].addr << 1; /* wants an 8-bit address */
> +       dir = msg[0].flags & I2C_M_RD;
>
> -               for (j = 0; j < msg->len; j++) {
> -                       SwI2cCmd_t *cmd = &req->SwI2cCmds[c++];
> +       for (c = i = 0; i < num_msgs; i++) {
> +               for (j = 0; j < msg[i].len; j++, c++) {
> +                       SwI2cCmd_t *cmd = &req->SwI2cCmds[c];
>
>                         if (!(msg[i].flags & I2C_M_RD)) {
>                                 /* write */
> -                               cmd->CmdConfig |= I2C_CMD_WRITE;
> -                               cmd->RegisterAddr = msg->buf[j];
> +                               cmd->Cmd = I2C_CMD_WRITE;
> +                               cmd->RegisterAddr = msg[i].buf[j];
> +                       }
> +
> +                       if ((dir ^ msg[i].flags) & I2C_M_RD) {
> +                               /* The direction changes.
> +                                */
> +                               dir = msg[i].flags & I2C_M_RD;
> +                               cmd->CmdConfig |= CMDCONFIG_RESTART_MASK;
>                         }
>
> +                       req->NumCmds++;
> +
>                         /*
>                          * Insert STOP if we are at the last byte of either last
>                          * message for the transaction or the client explicitly
>                          * requires a STOP at this particular message.
>                          */
> -                       if ((j == msg->len -1 ) &&
> -                           ((i == num_done - 1) || (msg[i].flags & I2C_M_STOP)))
> +                       if ((j == msg[i].len - 1) &&
> +                           ((i == num_msgs - 1) || (msg[i].flags & I2C_M_STOP))) {
> +                               cmd->CmdConfig &= ~CMDCONFIG_RESTART_MASK;
>                                 cmd->CmdConfig |= CMDCONFIG_STOP_MASK;
> -
> -                       if ((j == 0) && !(msg[i].flags & I2C_M_NOSTART))
> -                               cmd->CmdConfig |= CMDCONFIG_RESTART_BIT;
> +                       }
>                 }
>         }
>         mutex_lock(&adev->smu.mutex);
> @@ -1974,22 +1962,20 @@ static int arcturus_i2c_xfer(struct i2c_adapter *i2c_adap,
>         if (r)
>                 goto fail;
>
> -       c = 0;
> -       for (i = 0; i < num_done; i++) {
> -               struct i2c_msg *msg = &msgs[i];
> -
> -               for (j = 0; j < msg->len; j++) {
> -                       SwI2cCmd_t *cmd = &res->SwI2cCmds[c++];
> +       for (c = i = 0; i < num_msgs; i++) {
> +               if (!(msg[i].flags & I2C_M_RD)) {
> +                       c += msg[i].len;
> +                       continue;
> +               }
> +               for (j = 0; j < msg[i].len; j++, c++) {
> +                       SwI2cCmd_t *cmd = &res->SwI2cCmds[c];
>
> -                       if (msg[i].flags & I2C_M_RD)
> -                               msg->buf[j] = cmd->Data;
> +                       msg[i].buf[j] = cmd->Data;
>                 }
>         }
> -       r = num_done;
> -
> +       r = num_msgs;
>  fail:
>         kfree(req);
> -
>         return r;
>  }
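The request-building loop above can be modelled outside the kernel to
see which commands get which flags. This is a sketch under assumptions:
struct demo_msg, struct demo_cmd and demo_build() are hypothetical, the
DEMO_RESTART/DEMO_STOP bit values and DEMO_MAX_CMDS are placeholders
rather than the SMU's definitions, and the I2C_M_STOP case is left out
for brevity.

```c
#include <stdint.h>
#include <string.h>

#define DEMO_MAX_CMDS  8	/* placeholder for MAX_SW_I2C_COMMANDS */
#define DEMO_M_RD      0x0001	/* same value as the kernel's I2C_M_RD */
#define DEMO_RESTART   0x1	/* placeholder bit values, not the SMU's */
#define DEMO_STOP      0x2

struct demo_msg {
	uint16_t flags;
	uint16_t len;
	uint8_t *buf;
};

struct demo_cmd {
	uint8_t is_write;
	uint8_t data;
	uint8_t config;
};

/*
 * Flatten messages into one command per byte. Returns the number of
 * commands, or -1 if they do not fit (the quirk table is meant to
 * rule that case out before the driver is called).
 */
int demo_build(const struct demo_msg *msg, int num_msgs,
	       struct demo_cmd cmds[DEMO_MAX_CMDS])
{
	uint16_t dir = msg[0].flags & DEMO_M_RD;
	int i, j, c = 0;

	memset(cmds, 0, sizeof(cmds[0]) * DEMO_MAX_CMDS);

	for (i = 0; i < num_msgs; i++) {
		for (j = 0; j < msg[i].len; j++, c++) {
			if (c >= DEMO_MAX_CMDS)
				return -1;

			if (!(msg[i].flags & DEMO_M_RD)) {
				cmds[c].is_write = 1;
				cmds[c].data = msg[i].buf[j];
			}

			/* Mark the first byte after a direction change. */
			if ((dir ^ msg[i].flags) & DEMO_M_RD) {
				dir = msg[i].flags & DEMO_M_RD;
				cmds[c].config |= DEMO_RESTART;
			}

			/* Last byte of the last message ends the transfer. */
			if (j == msg[i].len - 1 && i == num_msgs - 1) {
				cmds[c].config &= (uint8_t)~DEMO_RESTART;
				cmds[c].config |= DEMO_STOP;
			}
		}
	}
	return c;
}
```

For a 2-byte write followed by a 3-byte read, this yields five commands
with the restart flag on the third and the stop flag on the fifth. Note
that in this model, as in the quoted loop, a one-byte final message
after a direction change ends up with only the stop flag set, because
the stop branch clears the restart flag.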
>
> diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
> index 1b8cd3746d0ebc..2acf54967c6ab1 100644
> --- a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
> +++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
> @@ -2702,31 +2702,14 @@ static ssize_t navi10_get_legacy_gpu_metrics(struct smu_context *smu,
>  }
>
>  static int navi10_i2c_xfer(struct i2c_adapter *i2c_adap,
> -                          struct i2c_msg *msgs, int num)
> +                          struct i2c_msg *msg, int num_msgs)
>  {
>         struct amdgpu_device *adev = to_amdgpu_device(i2c_adap);
>         struct smu_table_context *smu_table = &adev->smu.smu_table;
>         struct smu_table *table = &smu_table->driver_table;
>         SwI2cRequest_t *req, *res = (SwI2cRequest_t *)table->cpu_addr;
> -       short available_bytes = MAX_SW_I2C_COMMANDS;
> -       int i, j, r, c, num_done = 0;
> -       u8 slave;
> -
> -       /* only support a single slave addr per transaction */
> -       slave = msgs[0].addr;
> -       for (i = 0; i < num; i++) {
> -               if (slave != msgs[i].addr)
> -                       return -EINVAL;
> -
> -               available_bytes -= msgs[i].len;
> -               if (available_bytes >= 0) {
> -                       num_done++;
> -               } else {
> -                       /* This message and all the follwing won't be processed */
> -                       available_bytes += msgs[i].len;
> -                       break;
> -               }
> -       }
> +       int i, j, r, c;
> +       u16 dir;
>
>         req = kzalloc(sizeof(*req), GFP_KERNEL);
>         if (!req)
> @@ -2734,33 +2717,38 @@ static int navi10_i2c_xfer(struct i2c_adapter *i2c_adap,
>
>         req->I2CcontrollerPort = 1;
>         req->I2CSpeed = I2C_SPEED_FAST_400K;
> -       req->SlaveAddress = slave << 1; /* 8 bit addresses */
> -       req->NumCmds = MAX_SW_I2C_COMMANDS - available_bytes;;
> +       req->SlaveAddress = msg[0].addr << 1; /* wants an 8-bit address */
> +       dir = msg[0].flags & I2C_M_RD;
>
> -       c = 0;
> -       for (i = 0; i < num_done; i++) {
> -               struct i2c_msg *msg = &msgs[i];
> -
> -               for (j = 0; j < msg->len; j++) {
> -                       SwI2cCmd_t *cmd = &req->SwI2cCmds[c++];
> +       for (c = i = 0; i < num_msgs; i++) {
> +               for (j = 0; j < msg[i].len; j++, c++) {
> +                       SwI2cCmd_t *cmd = &req->SwI2cCmds[c];
>
>                         if (!(msg[i].flags & I2C_M_RD)) {
>                                 /* write */
> -                               cmd->CmdConfig |= I2C_CMD_WRITE;
> -                               cmd->RegisterAddr = msg->buf[j];
> +                               cmd->Cmd = I2C_CMD_WRITE;
> +                               cmd->RegisterAddr = msg[i].buf[j];
> +                       }
> +
> +                       if ((dir ^ msg[i].flags) & I2C_M_RD) {
> +                               /* The direction changes.
> +                                */
> +                               dir = msg[i].flags & I2C_M_RD;
> +                               cmd->CmdConfig |= CMDCONFIG_RESTART_MASK;
>                         }
>
> +                       req->NumCmds++;
> +
>                         /*
>                          * Insert STOP if we are at the last byte of either last
>                          * message for the transaction or the client explicitly
>                          * requires a STOP at this particular message.
>                          */
> -                       if ((j == msg->len -1 ) &&
> -                           ((i == num_done - 1) || (msg[i].flags & I2C_M_STOP)))
> +                       if ((j == msg[i].len - 1) &&
> +                           ((i == num_msgs - 1) || (msg[i].flags & I2C_M_STOP))) {
> +                               cmd->CmdConfig &= ~CMDCONFIG_RESTART_MASK;
>                                 cmd->CmdConfig |= CMDCONFIG_STOP_MASK;
> -
> -                       if ((j == 0) && !(msg[i].flags & I2C_M_NOSTART))
> -                               cmd->CmdConfig |= CMDCONFIG_RESTART_BIT;
> +                       }
>                 }
>         }
>         mutex_lock(&adev->smu.mutex);
> @@ -2769,22 +2757,20 @@ static int navi10_i2c_xfer(struct i2c_adapter *i2c_adap,
>         if (r)
>                 goto fail;
>
> -       c = 0;
> -       for (i = 0; i < num_done; i++) {
> -               struct i2c_msg *msg = &msgs[i];
> -
> -               for (j = 0; j < msg->len; j++) {
> -                       SwI2cCmd_t *cmd = &res->SwI2cCmds[c++];
> +       for (c = i = 0; i < num_msgs; i++) {
> +               if (!(msg[i].flags & I2C_M_RD)) {
> +                       c += msg[i].len;
> +                       continue;
> +               }
> +               for (j = 0; j < msg[i].len; j++, c++) {
> +                       SwI2cCmd_t *cmd = &res->SwI2cCmds[c];
>
> -                       if (msg[i].flags & I2C_M_RD)
> -                               msg->buf[j] = cmd->Data;
> +                       msg[i].buf[j] = cmd->Data;
>                 }
>         }
> -       r = num_done;
> -
> +       r = num_msgs;
>  fail:
>         kfree(req);
> -
>         return r;
>  }
>
> diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
> index b38127f8009d3d..44ca3b3f83f4d9 100644
> --- a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
> +++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
> @@ -3390,31 +3390,14 @@ static void sienna_cichlid_dump_pptable(struct smu_context *smu)
>  }
>
>  static int sienna_cichlid_i2c_xfer(struct i2c_adapter *i2c_adap,
> -                                  struct i2c_msg *msgs, int num)
> +                                  struct i2c_msg *msg, int num_msgs)
>  {
>         struct amdgpu_device *adev = to_amdgpu_device(i2c_adap);
>         struct smu_table_context *smu_table = &adev->smu.smu_table;
>         struct smu_table *table = &smu_table->driver_table;
>         SwI2cRequest_t *req, *res = (SwI2cRequest_t *)table->cpu_addr;
> -       short available_bytes = MAX_SW_I2C_COMMANDS;
> -       int i, j, r, c, num_done = 0;
> -       u8 slave;
> -
> -       /* only support a single slave addr per transaction */
> -       slave = msgs[0].addr;
> -       for (i = 0; i < num; i++) {
> -               if (slave != msgs[i].addr)
> -                       return -EINVAL;
> -
> -               available_bytes -= msgs[i].len;
> -               if (available_bytes >= 0) {
> -                       num_done++;
> -               } else {
> -                       /* This message and all the follwing won't be processed */
> -                       available_bytes += msgs[i].len;
> -                       break;
> -               }
> -       }
> +       int i, j, r, c;
> +       u16 dir;
>
>         req = kzalloc(sizeof(*req), GFP_KERNEL);
>         if (!req)
> @@ -3422,33 +3405,38 @@ static int sienna_cichlid_i2c_xfer(struct i2c_adapter *i2c_adap,
>
>         req->I2CcontrollerPort = 1;
>         req->I2CSpeed = I2C_SPEED_FAST_400K;
> -       req->SlaveAddress = slave << 1; /* 8 bit addresses */
> -       req->NumCmds = MAX_SW_I2C_COMMANDS - available_bytes;;
> +       req->SlaveAddress = msg[0].addr << 1; /* wants an 8-bit address */
> +       dir = msg[0].flags & I2C_M_RD;
>
> -       c = 0;
> -       for (i = 0; i < num_done; i++) {
> -               struct i2c_msg *msg = &msgs[i];
> -
> -               for (j = 0; j < msg->len; j++) {
> -                       SwI2cCmd_t *cmd = &req->SwI2cCmds[c++];
> +       for (c = i = 0; i < num_msgs; i++) {
> +               for (j = 0; j < msg[i].len; j++, c++) {
> +                       SwI2cCmd_t *cmd = &req->SwI2cCmds[c];
>
>                         if (!(msg[i].flags & I2C_M_RD)) {
>                                 /* write */
>                                 cmd->CmdConfig |= CMDCONFIG_READWRITE_MASK;
> -                               cmd->ReadWriteData = msg->buf[j];
> +                               cmd->ReadWriteData = msg[i].buf[j];
> +                       }
> +
> +                       if ((dir ^ msg[i].flags) & I2C_M_RD) {
> +                               /* The direction changes.
> +                                */
> +                               dir = msg[i].flags & I2C_M_RD;
> +                               cmd->CmdConfig |= CMDCONFIG_RESTART_MASK;
>                         }
>
> +                       req->NumCmds++;
> +
>                         /*
>                          * Insert STOP if we are at the last byte of either last
>                          * message for the transaction or the client explicitly
>                          * requires a STOP at this particular message.
>                          */
> -                       if ((j == msg->len -1 ) &&
> -                           ((i == num_done - 1) || (msg[i].flags & I2C_M_STOP)))
> +                       if ((j == msg[i].len - 1) &&
> +                           ((i == num_msgs - 1) || (msg[i].flags & I2C_M_STOP))) {
> +                               cmd->CmdConfig &= ~CMDCONFIG_RESTART_MASK;
>                                 cmd->CmdConfig |= CMDCONFIG_STOP_MASK;
> -
> -                       if ((j == 0) && !(msg[i].flags & I2C_M_NOSTART))
> -                               cmd->CmdConfig |= CMDCONFIG_RESTART_BIT;
> +                       }
>                 }
>         }
>         mutex_lock(&adev->smu.mutex);
> @@ -3457,22 +3445,20 @@ static int sienna_cichlid_i2c_xfer(struct i2c_adapter *i2c_adap,
>         if (r)
>                 goto fail;
>
> -       c = 0;
> -       for (i = 0; i < num_done; i++) {
> -               struct i2c_msg *msg = &msgs[i];
> -
> -               for (j = 0; j < msg->len; j++) {
> -                       SwI2cCmd_t *cmd = &res->SwI2cCmds[c++];
> +       for (c = i = 0; i < num_msgs; i++) {
> +               if (!(msg[i].flags & I2C_M_RD)) {
> +                       c += msg[i].len;
> +                       continue;
> +               }
> +               for (j = 0; j < msg[i].len; j++, c++) {
> +                       SwI2cCmd_t *cmd = &res->SwI2cCmds[c];
>
> -                       if (msg[i].flags & I2C_M_RD)
> -                               msg->buf[j] = cmd->ReadWriteData;
> +                       msg[i].buf[j] = cmd->ReadWriteData;
>                 }
>         }
> -       r = num_done;
> -
> +       r = num_msgs;
>  fail:
>         kfree(req);
> -
>         return r;
>  }
>
> --
> 2.32.0
>
> _______________________________________________
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
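
For readers following the hunks above, here is a minimal standalone sketch of
the per-byte command flagging as I read it: RESTART when the direction flips
between messages, STOP on the last byte of the last message (or when the
client asks for one). All sk_* names, flag values and the 24-entry limit are
assumptions made up for illustration; they are not the SMU or kernel
definitions.

```c
#include <stdint.h>

/* Hypothetical stand-ins, not the kernel/SMU definitions. */
#define SK_M_RD        0x0001u  /* message is a read */
#define SK_M_STOP      0x8000u  /* client wants a STOP after this message */
#define SK_CFG_STOP    0x01u
#define SK_CFG_RESTART 0x02u
#define SK_CFG_WRITE   0x04u
#define SK_MAX_CMDS    24       /* assumed size of the command table */

struct sk_msg { uint16_t flags; uint16_t len; uint8_t *buf; };
struct sk_cmd { uint8_t data; uint8_t cfg; };

/*
 * Build one command per byte. Assumes num_msgs >= 1.
 * Returns the number of commands built, or -1 if they do not fit.
 */
static int sk_build_cmds(const struct sk_msg *msg, int num_msgs,
			 struct sk_cmd *cmds)
{
	uint16_t dir = msg[0].flags & SK_M_RD;
	int i, j, c = 0;

	for (i = 0; i < num_msgs; i++) {
		for (j = 0; j < msg[i].len; j++, c++) {
			struct sk_cmd *cmd;

			if (c >= SK_MAX_CMDS)
				return -1;
			cmd = &cmds[c];
			cmd->data = 0;
			cmd->cfg = 0;

			if (!(msg[i].flags & SK_M_RD)) {
				/* write: carry the payload byte */
				cmd->cfg |= SK_CFG_WRITE;
				cmd->data = msg[i].buf[j];
			}

			if ((dir ^ msg[i].flags) & SK_M_RD) {
				/* the direction changes: request a RESTART */
				dir = msg[i].flags & SK_M_RD;
				cmd->cfg |= SK_CFG_RESTART;
			}

			/* last byte of the last message, or explicit STOP */
			if (j == msg[i].len - 1 &&
			    (i == num_msgs - 1 || (msg[i].flags & SK_M_STOP))) {
				cmd->cfg &= (uint8_t)~SK_CFG_RESTART;
				cmd->cfg |= SK_CFG_STOP;
			}
		}
	}
	return c;
}
```

With a 2-byte write followed by a 2-byte read, this yields WRITE, WRITE,
RESTART, STOP. Note that in this sketch, as in the quoted hunk, a one-byte
message that both changes direction and ends the transfer ends up with STOP
only, since RESTART is cleared when STOP is set.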

* Re: [PATCH 31/40] drm/amdgpu: Fix width of I2C address
  2021-06-08 21:39 ` [PATCH 31/40] drm/amdgpu: Fix width of I2C address Luben Tuikov
@ 2021-06-10 21:09   ` Alex Deucher
  0 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2021-06-10 21:09 UTC (permalink / raw)
  To: Luben Tuikov
  Cc: Andrey Grodzovsky, Lijo Lazar, amd-gfx list, Stanley Yang,
	Alexander Deucher, Jean Delvare, Hawking Zhang

On Tue, Jun 8, 2021 at 5:41 PM Luben Tuikov <luben.tuikov@amd.com> wrote:
>
> The I2C address is kept as a 16-bit quantity in
> the kernel. The IC_TAR::IC_TAR field is 10 bits
> wide.
>
> Fix the width of the I2C address for Vega20 from 8
> bits to 16 bits to accommodate the full spectrum
> of I2C address space.
>
> Cc: Jean Delvare <jdelvare@suse.de>
> Cc: Alexander Deucher <Alexander.Deucher@amd.com>
> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
> Cc: Lijo Lazar <Lijo.Lazar@amd.com>
> Cc: Stanley Yang <Stanley.Yang@amd.com>
> Cc: Hawking Zhang <Hawking.Zhang@amd.com>
> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>

Acked-by: Alex Deucher <alexander.deucher@amd.com>
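
To make the effect of the wider mask concrete, a tiny standalone sketch
follows. The helper names are hypothetical and there is no register access
here; it only shows what each mask keeps of a 10-bit address.

```c
#include <stdint.h>

/* Hypothetical helpers: what would be written to the target address
 * field under the old 8-bit mask and under the new 10-bit mask.
 */
static uint32_t sk_tar_old(uint16_t address)
{
	return address & 0xFF;   /* drops bits 8 and 9 */
}

static uint32_t sk_tar_new(uint16_t address)
{
	return address & 0x3FF;  /* keeps the full 10-bit field */
}
```

A 7-bit address such as 0x50 is the same under both masks, while a 10-bit
address such as 0x250 is truncated to 0x50 by the old one.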

> ---
>  drivers/gpu/drm/amd/amdgpu/smu_v11_0_i2c.c | 19 +++++++++++--------
>  1 file changed, 11 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/smu_v11_0_i2c.c b/drivers/gpu/drm/amd/amdgpu/smu_v11_0_i2c.c
> index e403ba556e5590..65035256756679 100644
> --- a/drivers/gpu/drm/amd/amdgpu/smu_v11_0_i2c.c
> +++ b/drivers/gpu/drm/amd/amdgpu/smu_v11_0_i2c.c
> @@ -111,12 +111,15 @@ static void smu_v11_0_i2c_set_clock(struct i2c_adapter *control)
>         WREG32_SOC15(SMUIO, 0, mmCKSVII2C_IC_SDA_HOLD, 20);
>  }
>
> -static void smu_v11_0_i2c_set_address(struct i2c_adapter *control, uint8_t address)
> +static void smu_v11_0_i2c_set_address(struct i2c_adapter *control, u16 address)
>  {
>         struct amdgpu_device *adev = to_amdgpu_device(control);
>
> -       /* We take 7-bit addresses raw */
> -       WREG32_SOC15(SMUIO, 0, mmCKSVII2C_IC_TAR, (address & 0xFF));
> +       /* The IC_TAR::IC_TAR field is 10 bits wide.
> +        * It takes a 7-bit or 10-bit address as is,
> +        * i.e. no read/write bit--no wire format, just the address.
> +        */
> +       WREG32_SOC15(SMUIO, 0, mmCKSVII2C_IC_TAR, address & 0x3FF);
>  }
>
>  static uint32_t smu_v11_0_i2c_poll_tx_status(struct i2c_adapter *control)
> @@ -215,8 +218,8 @@ static uint32_t smu_v11_0_i2c_poll_rx_status(struct i2c_adapter *control)
>   * Returns 0 on success or error.
>   */
>  static uint32_t smu_v11_0_i2c_transmit(struct i2c_adapter *control,
> -                                 uint8_t address, uint8_t *data,
> -                                 uint32_t numbytes, uint32_t i2c_flag)
> +                                      u16 address, u8 *data,
> +                                      u32 numbytes, u32 i2c_flag)
>  {
>         struct amdgpu_device *adev = to_amdgpu_device(control);
>         uint32_t bytes_sent, reg, ret = 0;
> @@ -225,7 +228,7 @@ static uint32_t smu_v11_0_i2c_transmit(struct i2c_adapter *control,
>         bytes_sent = 0;
>
>         DRM_DEBUG_DRIVER("I2C_Transmit(), address = %x, bytes = %d , data: ",
> -                (uint16_t)address, numbytes);
> +                        address, numbytes);
>
>         if (drm_debug_enabled(DRM_UT_DRIVER)) {
>                 print_hex_dump(KERN_INFO, "data: ", DUMP_PREFIX_NONE,
> @@ -318,8 +321,8 @@ static uint32_t smu_v11_0_i2c_transmit(struct i2c_adapter *control,
>   * Returns 0 on success or error.
>   */
>  static uint32_t smu_v11_0_i2c_receive(struct i2c_adapter *control,
> -                                uint8_t address, uint8_t *data,
> -                                uint32_t numbytes, uint8_t i2c_flag)
> +                                     u16 address, u8 *data,
> +                                     u32 numbytes, u32 i2c_flag)
>  {
>         struct amdgpu_device *adev = to_amdgpu_device(control);
>         uint32_t bytes_received, ret = I2C_OK;
> --
> 2.32.0
>

* Re: [PATCH 32/40] drm/amdgpu: Return result fix in RAS
  2021-06-08 21:39 ` [PATCH 32/40] drm/amdgpu: Return result fix in RAS Luben Tuikov
@ 2021-06-10 21:11   ` Alex Deucher
  0 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2021-06-10 21:11 UTC (permalink / raw)
  To: Luben Tuikov
  Cc: Andrey Grodzovsky, Lijo Lazar, amd-gfx list, Stanley Yang,
	Alexander Deucher, Jean Delvare, Hawking Zhang

On Tue, Jun 8, 2021 at 5:41 PM Luben Tuikov <luben.tuikov@amd.com> wrote:
>
> The low-level EEPROM write method doesn't return
> 1, but the number of bytes written. Thus do not
> compare to 1; instead, compare to greater than 0
> for success.
>
> Other cleanup: if the lower layers returned
> -errno, then return that, as opposed to
> overwriting the error code with a one-size-fits-all
> -EINVAL. For instance, some return -EAGAIN.
>
> Cc: Jean Delvare <jdelvare@suse.de>
> Cc: Alexander Deucher <Alexander.Deucher@amd.com>
> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
> Cc: Lijo Lazar <Lijo.Lazar@amd.com>
> Cc: Stanley Yang <Stanley.Yang@amd.com>
> Cc: Hawking Zhang <Hawking.Zhang@amd.com>
> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
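
A small standalone sketch of the two checks, for illustration only. The
sk_* functions are made up; sk_eeprom_write() merely mimics a lower layer
that returns the number of bytes written or a negative errno. The "new"
check below passes a negative errno through, which is what the debugfs
hunks do; amdgpu_ras_eeprom_init() in this patch still maps failures to
-EIO.

```c
#include <errno.h>

/* Hypothetical lower layer: bytes written on success, -errno on failure. */
static int sk_eeprom_write(int nbytes, int fail)
{
	return fail ? -EAGAIN : nbytes;
}

/* Old check: only a return of exactly 1 counts as success. */
static int sk_check_old(int ret)
{
	return ret == 1 ? 0 : -EIO;
}

/* New check: any positive byte count is success; a negative errno
 * from below is passed through rather than overwritten.
 */
static int sk_check_new(int ret)
{
	if (ret > 0)
		return 0;
	return ret < 0 ? ret : -EIO;
}
```

A successful 20-byte write is reported as -EIO by the old check and as 0 by
the new one.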

> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.c    |  3 +--
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c       | 22 +++++++++++--------
>  .../gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c    |  2 +-
>  drivers/gpu/drm/amd/amdgpu/smu_v11_0_i2c.c    |  3 +--
>  4 files changed, 16 insertions(+), 14 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.c
> index a5a87affedabf1..a4815af111ed12 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.c
> @@ -105,8 +105,7 @@ static int __amdgpu_eeprom_xfer(struct i2c_adapter *i2c_adap, u32 eeprom_addr,
>         int r;
>         u16 len;
>
> -       r = 0;
> -       for ( ; buf_size > 0;
> +       for (r = 0; buf_size > 0;
>               buf_size -= len, eeprom_addr += len, eeprom_buf += len) {
>                 /* Set the EEPROM address we want to write to/read from.
>                  */
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> index e3ad081eddd40b..66c96c65e7eeb9 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> @@ -355,8 +355,9 @@ static int amdgpu_ras_debugfs_ctrl_parse_data(struct file *f,
>   *     to see which blocks support RAS on a particular asic.
>   *
>   */
> -static ssize_t amdgpu_ras_debugfs_ctrl_write(struct file *f, const char __user *buf,
> -               size_t size, loff_t *pos)
> +static ssize_t amdgpu_ras_debugfs_ctrl_write(struct file *f,
> +                                            const char __user *buf,
> +                                            size_t size, loff_t *pos)
>  {
>         struct amdgpu_device *adev = (struct amdgpu_device *)file_inode(f)->i_private;
>         struct ras_debug_if data;
> @@ -370,7 +371,7 @@ static ssize_t amdgpu_ras_debugfs_ctrl_write(struct file *f, const char __user *
>
>         ret = amdgpu_ras_debugfs_ctrl_parse_data(f, buf, size, pos, &data);
>         if (ret)
> -               return -EINVAL;
> +               return ret;
>
>         if (data.op == 3) {
>                 ret = amdgpu_reserve_page_direct(adev, data.inject.address);
> @@ -439,21 +440,24 @@ static ssize_t amdgpu_ras_debugfs_ctrl_write(struct file *f, const char __user *
>   * will reset EEPROM table to 0 entries.
>   *
>   */
> -static ssize_t amdgpu_ras_debugfs_eeprom_write(struct file *f, const char __user *buf,
> -               size_t size, loff_t *pos)
> +static ssize_t amdgpu_ras_debugfs_eeprom_write(struct file *f,
> +                                              const char __user *buf,
> +                                              size_t size, loff_t *pos)
>  {
>         struct amdgpu_device *adev =
>                 (struct amdgpu_device *)file_inode(f)->i_private;
>         int ret;
>
>         ret = amdgpu_ras_eeprom_reset_table(
> -                       &(amdgpu_ras_get_context(adev)->eeprom_control));
> +               &(amdgpu_ras_get_context(adev)->eeprom_control));
>
> -       if (ret == 1) {
> +       if (ret > 0) {
> +               /* Something was written to EEPROM.
> +                */
>                 amdgpu_ras_get_context(adev)->flags = RAS_DEFAULT_FLAGS;
>                 return size;
>         } else {
> -               return -EIO;
> +               return ret;
>         }
>  }
>
> @@ -1991,7 +1995,7 @@ int amdgpu_ras_recovery_init(struct amdgpu_device *adev)
>         kfree(*data);
>         con->eh_data = NULL;
>  out:
> -       dev_warn(adev->dev, "Failed to initialize ras recovery!\n");
> +       dev_warn(adev->dev, "Failed to initialize ras recovery! (%d)\n", ret);
>
>         /*
>          * Except error threshold exceeding case, other failure cases in this
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
> index 17cea35275e46c..dc48c556398039 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
> @@ -335,7 +335,7 @@ int amdgpu_ras_eeprom_init(struct amdgpu_ras_eeprom_control *control,
>                 ret = amdgpu_ras_eeprom_reset_table(control);
>         }
>
> -       return ret == 1 ? 0 : -EIO;
> +       return ret > 0 ? 0 : -EIO;
>  }
>
>  static void __encode_table_record_to_buff(struct amdgpu_ras_eeprom_control *control,
> diff --git a/drivers/gpu/drm/amd/amdgpu/smu_v11_0_i2c.c b/drivers/gpu/drm/amd/amdgpu/smu_v11_0_i2c.c
> index 65035256756679..7f48ee020bc03e 100644
> --- a/drivers/gpu/drm/amd/amdgpu/smu_v11_0_i2c.c
> +++ b/drivers/gpu/drm/amd/amdgpu/smu_v11_0_i2c.c
> @@ -222,7 +222,7 @@ static uint32_t smu_v11_0_i2c_transmit(struct i2c_adapter *control,
>                                        u32 numbytes, u32 i2c_flag)
>  {
>         struct amdgpu_device *adev = to_amdgpu_device(control);
> -       uint32_t bytes_sent, reg, ret = 0;
> +       u32 bytes_sent, reg, ret = I2C_OK;
>         unsigned long  timeout_counter;
>
>         bytes_sent = 0;
> @@ -290,7 +290,6 @@ static uint32_t smu_v11_0_i2c_transmit(struct i2c_adapter *control,
>         }
>
>         ret = smu_v11_0_i2c_poll_tx_status(control);
> -
>  Err:
>         /* Any error, no point in proceeding */
>         if (ret != I2C_OK) {
> --
> 2.32.0
>

* Re: [PATCH 33/40] drm/amd/pm: Fix a bug in i2c_xfer
  2021-06-08 21:39 ` [PATCH 33/40] drm/amd/pm: Fix a bug in i2c_xfer Luben Tuikov
@ 2021-06-10 21:12   ` Alex Deucher
  2021-06-10 22:26     ` Luben Tuikov
  0 siblings, 1 reply; 74+ messages in thread
From: Alex Deucher @ 2021-06-10 21:12 UTC (permalink / raw)
  To: Luben Tuikov; +Cc: Alex Deucher, amd-gfx list

On Tue, Jun 8, 2021 at 5:41 PM Luben Tuikov <luben.tuikov@amd.com> wrote:
>
> "req" is now a pointer, i.e. it is no longer
> allocated on the stack, thus taking its address
> and passing that is a bug.
>
> This commit fixes this bug.
>
> Cc: Alex Deucher <alexander.deucher@amd.com>
> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>

Can we just squash this into the original patch where this was broken?
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
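
For anyone skimming, a standalone sketch of why the extra '&' matters once
the request is heap-allocated. The struct and helper below are hypothetical
stand-ins (sk_update_table() is just a memcpy, it is not
smu_cmn_update_table()); the buggy form is shown only in a comment so the
sketch stays well-defined.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical request layout, for illustration only. */
struct sk_req {
	uint8_t num_cmds;
	uint8_t cmds[23];
};

/* Hypothetical stand-in for a helper that copies a table from src. */
static void sk_update_table(void *table, const void *src, size_t size)
{
	memcpy(table, src, size);
}

static int sk_send(uint8_t n, struct sk_req *table)
{
	struct sk_req *req = calloc(1, sizeof(*req));

	if (!req)
		return -1;
	req->num_cmds = n;

	/*
	 * Buggy form once req is a pointer:
	 *     sk_update_table(table, &req, sizeof(*req));
	 * That copies the bytes of the pointer variable itself (and
	 * whatever follows it on the stack), not the request.
	 */
	sk_update_table(table, req, sizeof(*req));

	free(req);
	return 0;
}
```

When req was a struct on the stack, &req was the right thing to pass; after
the switch to kzalloc() the variable already holds the address.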

> ---
>  drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c       | 2 +-
>  drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c         | 2 +-
>  drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c | 2 +-
>  3 files changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c
> index 0db79a5236e1f1..7d9a2946806f58 100644
> --- a/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c
> +++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c
> @@ -1957,7 +1957,7 @@ static int arcturus_i2c_xfer(struct i2c_adapter *i2c_adap,
>                 }
>         }
>         mutex_lock(&adev->smu.mutex);
> -       r = smu_cmn_update_table(&adev->smu, SMU_TABLE_I2C_COMMANDS, 0, &req, true);
> +       r = smu_cmn_update_table(&adev->smu, SMU_TABLE_I2C_COMMANDS, 0, req, true);
>         mutex_unlock(&adev->smu.mutex);
>         if (r)
>                 goto fail;
> diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
> index 2acf54967c6ab1..0568cbfb023459 100644
> --- a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
> +++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
> @@ -2752,7 +2752,7 @@ static int navi10_i2c_xfer(struct i2c_adapter *i2c_adap,
>                 }
>         }
>         mutex_lock(&adev->smu.mutex);
> -       r = smu_cmn_update_table(&adev->smu, SMU_TABLE_I2C_COMMANDS, 0, &req, true);
> +       r = smu_cmn_update_table(&adev->smu, SMU_TABLE_I2C_COMMANDS, 0, req, true);
>         mutex_unlock(&adev->smu.mutex);
>         if (r)
>                 goto fail;
> diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
> index 44ca3b3f83f4d9..091b3339faadb9 100644
> --- a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
> +++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
> @@ -3440,7 +3440,7 @@ static int sienna_cichlid_i2c_xfer(struct i2c_adapter *i2c_adap,
>                 }
>         }
>         mutex_lock(&adev->smu.mutex);
> -       r = smu_cmn_update_table(&adev->smu, SMU_TABLE_I2C_COMMANDS, 0, &req, true);
> +       r = smu_cmn_update_table(&adev->smu, SMU_TABLE_I2C_COMMANDS, 0, req, true);
>         mutex_unlock(&adev->smu.mutex);
>         if (r)
>                 goto fail;
> --
> 2.32.0
>

* Re: [PATCH 34/40] drm/amdgpu: Fix amdgpu_ras_eeprom_init()
  2021-06-08 21:39 ` [PATCH 34/40] drm/amdgpu: Fix amdgpu_ras_eeprom_init() Luben Tuikov
@ 2021-06-10 21:12   ` Alex Deucher
  0 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2021-06-10 21:12 UTC (permalink / raw)
  To: Luben Tuikov; +Cc: Alexander Deucher, Andrey Grodzovsky, amd-gfx list

On Tue, Jun 8, 2021 at 5:41 PM Luben Tuikov <luben.tuikov@amd.com> wrote:
>
> No need to account for the 2 bytes of EEPROM
> address--this is now well abstracted away by
> the fixes to the lower layers.
>
> Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> Cc: Alexander Deucher <Alexander.Deucher@amd.com>
> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>

Acked-by: Alex Deucher <alexander.deucher@amd.com>

> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
> index dc48c556398039..7d0f9e1e62dc4f 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
> @@ -306,7 +306,7 @@ int amdgpu_ras_eeprom_init(struct amdgpu_ras_eeprom_control *control,
>                 return ret;
>         }
>
> -       __decode_table_header_from_buff(hdr, &buff[2]);
> +       __decode_table_header_from_buff(hdr, buff);
>
>         if (hdr->header == RAS_TABLE_HDR_VAL) {
>                 control->num_recs = (hdr->tbl_size - RAS_TABLE_HEADER_SIZE) /
> --
> 2.32.0
>

* Re: [PATCH 36/40] drm/amdgpu: Use explicit cardinality for clarity
  2021-06-08 21:39 ` [PATCH 36/40] drm/amdgpu: Use explicit cardinality for clarity Luben Tuikov
@ 2021-06-10 21:17   ` Alex Deucher
  0 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2021-06-10 21:17 UTC (permalink / raw)
  To: Luben Tuikov
  Cc: Alexander Deucher, John Clements, Guchun Chen, amd-gfx list,
	Hawking Zhang

On Tue, Jun 8, 2021 at 5:41 PM Luben Tuikov <luben.tuikov@amd.com> wrote:
>
> RAS_MAX_RECORD_NUM may mean the maximum record
> number, as in the maximum house number on your
> street, or it may mean the maximum number of
> records, as in the count of records, which is also
> a number. To make clear whether the number is
> ordinal (index) or cardinal (count), rename this
> macro to RAS_MAX_RECORD_COUNT.
>
> This makes it easy to understand what it refers
> to when we compute quantities such as how many
> records we have left in the table, especially
> with so many other numbers, quantities and
> numerical macros around.
>
> Also rename the long
> amdgpu_ras_eeprom_get_record_max_length() to the
> more succinct and clear
> amdgpu_ras_eeprom_max_record_count().
>
> When computing the threshold, which also deals
> with counts, i.e. "how many", use the cardinal
> "max_eeprom_records_count" rather than the
> quantitative "max_eeprom_records_len".
>
> Simplify the logic here and there as well.
>
> Cc: Guchun Chen <guchun.chen@amd.com>
> Cc: John Clements <john.clements@amd.com>
> Cc: Hawking Zhang <Hawking.Zhang@amd.com>
> Cc: Alexander Deucher <Alexander.Deucher@amd.com>
> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>

Acked-by: Alex Deucher <alexander.deucher@amd.com>
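
As a reading aid for the simplified amdgpu_ras_validate_threshold() below,
here is a standalone sketch of the computation as I understand it. The
function and macro names are made up, and plain division stands in for
do_div(); treat it as an approximation of the hunk, not the driver code.

```c
#include <stdint.h>

/* Assumed coverage: one bad page per 100MB of VRAM. */
#define SK_BAD_PAGE_COVER (100ULL * 1024 * 1024)

/*
 * param < 0 : auto, derive the threshold from the VRAM size
 * param >= 0: use param as given (0 disables bad page retirement)
 * Either way the result is capped at max_count, the number of
 * records the EEPROM table can hold.
 */
static uint32_t sk_bad_page_threshold(int param, uint64_t vram_size,
				      uint32_t max_count)
{
	if (param < 0) {
		uint32_t val = (uint32_t)(vram_size / SK_BAD_PAGE_COVER);

		return val < max_count ? val : max_count;
	}
	return (uint32_t)param < max_count ? (uint32_t)param : max_count;
}
```

For a 16GB board in auto mode this gives 163 when the table can hold that
many records, and max_count otherwise.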

> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c       |  9 ++--
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c       | 50 ++++++++-----------
>  .../gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c    |  8 +--
>  .../gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h    |  2 +-
>  4 files changed, 30 insertions(+), 39 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> index 3de1accb060e37..0203f654576bcc 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> @@ -853,11 +853,10 @@ MODULE_PARM_DESC(reset_method, "GPU reset method (-1 = auto (default), 0 = legac
>  module_param_named(reset_method, amdgpu_reset_method, int, 0444);
>
>  /**
> - * DOC: bad_page_threshold (int)
> - * Bad page threshold is to specify the threshold value of faulty pages
> - * detected by RAS ECC, that may result in GPU entering bad status if total
> - * faulty pages by ECC exceed threshold value and leave it for user's further
> - * check.
> + * DOC: bad_page_threshold (int) Bad page threshold specifies the
> + * threshold value of faulty pages detected by RAS ECC, which may
> + * result in the GPU entering bad status when the number of total
> + * faulty pages by ECC exceeds the threshold value.
>   */
>  MODULE_PARM_DESC(bad_page_threshold, "Bad page threshold(-1 = auto(default value), 0 = disable bad page retirement)");
>  module_param_named(bad_page_threshold, amdgpu_bad_page_threshold, int, 0444);
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> index 66c96c65e7eeb9..95ab400b641af0 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> @@ -71,8 +71,8 @@ const char *ras_block_string[] = {
>  /* inject address is 52 bits */
>  #define        RAS_UMC_INJECT_ADDR_LIMIT       (0x1ULL << 52)
>
> -/* typical ECC bad page rate(1 bad page per 100MB VRAM) */
> -#define RAS_BAD_PAGE_RATE              (100 * 1024 * 1024ULL)
> +/* typical ECC bad page rate is 1 bad page per 100MB VRAM */
> +#define RAS_BAD_PAGE_COVER              (100 * 1024 * 1024ULL)
>
>  enum amdgpu_ras_retire_page_reservation {
>         AMDGPU_RAS_RETIRE_PAGE_RESERVED,
> @@ -1841,27 +1841,24 @@ int amdgpu_ras_save_bad_pages(struct amdgpu_device *adev)
>  static int amdgpu_ras_load_bad_pages(struct amdgpu_device *adev)
>  {
>         struct amdgpu_ras_eeprom_control *control =
> -                                       &adev->psp.ras.ras->eeprom_control;
> -       struct eeprom_table_record *bps = NULL;
> -       int ret = 0;
> +               &adev->psp.ras.ras->eeprom_control;
> +       struct eeprom_table_record *bps;
> +       int ret;
>
>         /* no bad page record, skip eeprom access */
> -       if (!control->num_recs || (amdgpu_bad_page_threshold == 0))
> -               return ret;
> +       if (control->num_recs == 0 || amdgpu_bad_page_threshold == 0)
> +               return 0;
>
>         bps = kcalloc(control->num_recs, sizeof(*bps), GFP_KERNEL);
>         if (!bps)
>                 return -ENOMEM;
>
> -       if (amdgpu_ras_eeprom_read(control, bps, control->num_recs)) {
> +       ret = amdgpu_ras_eeprom_read(control, bps, control->num_recs);
> +       if (ret)
>                 dev_err(adev->dev, "Failed to load EEPROM table records!");
> -               ret = -EIO;
> -               goto out;
> -       }
> -
> -       ret = amdgpu_ras_add_bad_pages(adev, bps, control->num_recs);
> +       else
> +               ret = amdgpu_ras_add_bad_pages(adev, bps, control->num_recs);
>
> -out:
>         kfree(bps);
>         return ret;
>  }
> @@ -1901,11 +1898,9 @@ static bool amdgpu_ras_check_bad_page(struct amdgpu_device *adev,
>  }
>
>  static void amdgpu_ras_validate_threshold(struct amdgpu_device *adev,
> -                                       uint32_t max_length)
> +                                         uint32_t max_count)
>  {
>         struct amdgpu_ras *con = amdgpu_ras_get_context(adev);
> -       int tmp_threshold = amdgpu_bad_page_threshold;
> -       u64 val;
>
>         /*
>          * Justification of value bad_page_cnt_threshold in ras structure
> @@ -1926,18 +1921,15 @@ static void amdgpu_ras_validate_threshold(struct amdgpu_device *adev,
>          *      take no effect.
>          */
>
> -       if (tmp_threshold < -1)
> -               tmp_threshold = -1;
> -       else if (tmp_threshold > max_length)
> -               tmp_threshold = max_length;
> +       if (amdgpu_bad_page_threshold < 0) {
> +               u64 val = adev->gmc.mc_vram_size;
>
> -       if (tmp_threshold == -1) {
> -               val = adev->gmc.mc_vram_size;
> -               do_div(val, RAS_BAD_PAGE_RATE);
> +               do_div(val, RAS_BAD_PAGE_COVER);
>                 con->bad_page_cnt_threshold = min(lower_32_bits(val),
> -                                               max_length);
> +                                                 max_count);
>         } else {
> -               con->bad_page_cnt_threshold = tmp_threshold;
> +               con->bad_page_cnt_threshold = min_t(int, max_count,
> +                                                   amdgpu_bad_page_threshold);
>         }
>  }
>
> @@ -1945,7 +1937,7 @@ int amdgpu_ras_recovery_init(struct amdgpu_device *adev)
>  {
>         struct amdgpu_ras *con = amdgpu_ras_get_context(adev);
>         struct ras_err_handler_data **data;
> -       uint32_t max_eeprom_records_len = 0;
> +       u32 max_eeprom_records_count = 0;
>         bool exc_err_limit = false;
>         int ret;
>
> @@ -1965,8 +1957,8 @@ int amdgpu_ras_recovery_init(struct amdgpu_device *adev)
>         atomic_set(&con->in_recovery, 0);
>         con->adev = adev;
>
> -       max_eeprom_records_len = amdgpu_ras_eeprom_get_record_max_length();
> -       amdgpu_ras_validate_threshold(adev, max_eeprom_records_len);
> +       max_eeprom_records_count = amdgpu_ras_eeprom_max_record_count();
> +       amdgpu_ras_validate_threshold(adev, max_eeprom_records_count);
>
>         /* Todo: During test the SMU might fail to read the eeprom through I2C
>          * when the GPU is pending on XGMI reset during probe time
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
> index 54ef31594accd9..21e1e59e4857ff 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
> @@ -54,7 +54,7 @@
>  #define RAS_TBL_SIZE_BYTES      (256 * 1024)
>  #define RAS_HDR_START           0
>  #define RAS_RECORD_START        (RAS_HDR_START + RAS_TABLE_HEADER_SIZE)
> -#define RAS_MAX_RECORD_NUM      ((RAS_TBL_SIZE_BYTES - RAS_TABLE_HEADER_SIZE) \
> +#define RAS_MAX_RECORD_COUNT    ((RAS_TBL_SIZE_BYTES - RAS_TABLE_HEADER_SIZE) \
>                                  / RAS_TABLE_RECORD_SIZE)
>
>  #define to_amdgpu_device(x) (container_of(x, struct amdgpu_ras, eeprom_control))->adev
> @@ -532,7 +532,7 @@ static int amdgpu_ras_eeprom_xfer(struct amdgpu_ras_eeprom_control *control,
>                  * TODO - Check the assumption is correct
>                  */
>                 control->num_recs += num;
> -               control->num_recs %= RAS_MAX_RECORD_NUM;
> +               control->num_recs %= RAS_MAX_RECORD_COUNT;
>                 control->tbl_hdr.tbl_size += RAS_TABLE_RECORD_SIZE * num;
>                 if (control->tbl_hdr.tbl_size > RAS_TBL_SIZE_BYTES)
>                         control->tbl_hdr.tbl_size = RAS_TABLE_HEADER_SIZE +
> @@ -568,9 +568,9 @@ int amdgpu_ras_eeprom_write(struct amdgpu_ras_eeprom_control *control,
>         return amdgpu_ras_eeprom_xfer(control, records, num, true);
>  }
>
> -inline uint32_t amdgpu_ras_eeprom_get_record_max_length(void)
> +inline uint32_t amdgpu_ras_eeprom_max_record_count(void)
>  {
> -       return RAS_MAX_RECORD_NUM;
> +       return RAS_MAX_RECORD_COUNT;
>  }
>
>  /* Used for testing if bugs encountered */
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h
> index 4906ed9fb8cdd3..504729b8053759 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h
> @@ -88,7 +88,7 @@ int amdgpu_ras_eeprom_read(struct amdgpu_ras_eeprom_control *control,
>  int amdgpu_ras_eeprom_write(struct amdgpu_ras_eeprom_control *control,
>                             struct eeprom_table_record *records, const u32 num);
>
> -inline uint32_t amdgpu_ras_eeprom_get_record_max_length(void);
> +inline uint32_t amdgpu_ras_eeprom_max_record_count(void);
>
>  void amdgpu_ras_eeprom_test(struct amdgpu_ras_eeprom_control *control);
>
> --
> 2.32.0
>
> _______________________________________________
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 39/40] drm/amdgpu: Fix koops when accessing RAS EEPROM
  2021-06-08 21:39 ` [PATCH 39/40] drm/amdgpu: Fix koops when accessing RAS EEPROM Luben Tuikov
@ 2021-06-10 21:23   ` Alex Deucher
  0 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2021-06-10 21:23 UTC (permalink / raw)
  To: Luben Tuikov
  Cc: Alexander Deucher, John Clements, amd-gfx list, Hawking Zhang

On Tue, Jun 8, 2021 at 5:41 PM Luben Tuikov <luben.tuikov@amd.com> wrote:
>
> Debugfs RAS EEPROM files are available when
> the ASIC supports RAS, and when the debugfs is
> enabled, and also when "ras_enable" module
> parameter is set to 0. However in this case,
> we get a kernel oops when accessing some of
> the "ras_..." controls in debugfs. The reason
> for this is that struct amdgpu_ras::adev is
> unset. This commit sets it, thus enabling access
> to those facilities. Note that this facilitates
> EEPROM access and not necessarily RAS features or
> functionality.
>
> Cc: Alexander Deucher <Alexander.Deucher@amd.com>
> Cc: John Clements <john.clements@amd.com>
> Cc: Hawking Zhang <Hawking.Zhang@amd.com>
> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>

Acked-by: Alex Deucher <alexander.deucher@amd.com>

> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 16 ++++++++++++----
>  1 file changed, 12 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> index d791a360a92366..772d87701ad4a8 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> @@ -1947,11 +1947,20 @@ int amdgpu_ras_recovery_init(struct amdgpu_device *adev)
>         bool exc_err_limit = false;
>         int ret;
>
> -       if (adev->ras_enabled && con)
> -               data = &con->eh_data;
> -       else
> +       if (!con)
> +               return 0;
> +
> +       /* Allow access to RAS EEPROM via debugfs, when the ASIC
> +        * supports RAS and debugfs is enabled, but when
> +        * adev->ras_enabled is unset, i.e. when "ras_enable"
> +        * module parameter is set to 0.
> +        */
> +       con->adev = adev;
> +
> +       if (!adev->ras_enabled)
>                 return 0;
>
> +       data = &con->eh_data;
>         *data = kmalloc(sizeof(**data), GFP_KERNEL | __GFP_ZERO);
>         if (!*data) {
>                 ret = -ENOMEM;
> @@ -1961,7 +1970,6 @@ int amdgpu_ras_recovery_init(struct amdgpu_device *adev)
>         mutex_init(&con->recovery_lock);
>         INIT_WORK(&con->recovery_work, amdgpu_ras_do_recovery);
>         atomic_set(&con->in_recovery, 0);
> -       con->adev = adev;
>
>         max_eeprom_records_count = amdgpu_ras_eeprom_max_record_count();
>         amdgpu_ras_validate_threshold(adev, max_eeprom_records_count);
> --
> 2.32.0
>
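To illustrate the ordering issue the commit message describes, here is a
minimal user-space sketch in C. The names (fake_dev, fake_ras,
fake_recovery_init, fake_reader) are hypothetical stand-ins and not the
driver's types; the only point shown is that the back-pointer is set
before the early return taken when RAS is disabled.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct amdgpu_device and struct amdgpu_ras. */
struct fake_dev {
	bool ras_enabled;
};

struct fake_ras {
	struct fake_dev *adev;	/* back-pointer used by debugfs-style readers */
	bool recovery_ready;	/* set only when the full init path has run */
};

/* Mirrors the patched control flow: set the back-pointer first, then
 * return early if RAS is disabled.
 */
static int fake_recovery_init(struct fake_dev *adev, struct fake_ras *con)
{
	if (!con)
		return 0;

	con->adev = adev;

	if (!adev->ras_enabled)
		return 0;

	con->recovery_ready = true;
	return 0;
}

/* A reader that dereferences the back-pointer, as a debugfs handler
 * might. The NULL check here only makes the failure visible in the
 * sketch; it is an assumption, not something the patch adds.
 */
static int fake_reader(const struct fake_ras *con)
{
	if (!con->adev)
		return -1;
	return con->adev->ras_enabled ? 1 : 0;
}
```

With ras_enabled false, fake_recovery_init() still leaves con->adev
pointing at the device, so fake_reader() has something valid to
dereference.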

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 40/40] drm/amdgpu: Use a single loop
  2021-06-08 21:39 ` [PATCH 40/40] drm/amdgpu: Use a single loop Luben Tuikov
@ 2021-06-10 21:25   ` Alex Deucher
  0 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2021-06-10 21:25 UTC (permalink / raw)
  To: Luben Tuikov; +Cc: Alexander Deucher, Andrey Grodzovsky, amd-gfx list

On Tue, Jun 8, 2021 at 5:41 PM Luben Tuikov <luben.tuikov@amd.com> wrote:
>
> In smu_v11_0_i2c_transmit() use a single loop to
> transmit bytes, instead of two nested loops.
>
> Cc: Alexander Deucher <Alexander.Deucher@amd.com>
> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>

> ---
>  drivers/gpu/drm/amd/amdgpu/smu_v11_0_i2c.c | 72 ++++++++++------------
>  1 file changed, 34 insertions(+), 38 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/smu_v11_0_i2c.c b/drivers/gpu/drm/amd/amdgpu/smu_v11_0_i2c.c
> index 7f48ee020bc03e..751ea2517c4380 100644
> --- a/drivers/gpu/drm/amd/amdgpu/smu_v11_0_i2c.c
> +++ b/drivers/gpu/drm/amd/amdgpu/smu_v11_0_i2c.c
> @@ -243,49 +243,45 @@ static uint32_t smu_v11_0_i2c_transmit(struct i2c_adapter *control,
>         /* Clear status bits */
>         smu_v11_0_i2c_clear_status(control);
>
> -
>         timeout_counter = jiffies + msecs_to_jiffies(20);
>
>         while (numbytes > 0) {
>                 reg = RREG32_SOC15(SMUIO, 0, mmCKSVII2C_IC_STATUS);
> -               if (REG_GET_FIELD(reg, CKSVII2C_IC_STATUS, TFNF)) {
> -                       do {
> -                               reg = REG_SET_FIELD(reg, CKSVII2C_IC_DATA_CMD, DAT, data[bytes_sent]);
> -
> -                               /* Final message, final byte, must
> -                                * generate a STOP, to release the
> -                                * bus, i.e. don't hold SCL low.
> -                                */
> -                               if (numbytes == 1 && i2c_flag & I2C_M_STOP)
> -                                       reg = REG_SET_FIELD(reg,
> -                                                           CKSVII2C_IC_DATA_CMD,
> -                                                           STOP, 1);
> -
> -                               if (bytes_sent == 0 && i2c_flag & I2C_X_RESTART)
> -                                       reg = REG_SET_FIELD(reg,
> -                                                           CKSVII2C_IC_DATA_CMD,
> -                                                           RESTART, 1);
> -
> -                               /* Write */
> -                               reg = REG_SET_FIELD(reg, CKSVII2C_IC_DATA_CMD, CMD, 0);
> -                               WREG32_SOC15(SMUIO, 0, mmCKSVII2C_IC_DATA_CMD, reg);
> -
> -                               /* Record that the bytes were transmitted */
> -                               bytes_sent++;
> -                               numbytes--;
> -
> -                               reg = RREG32_SOC15(SMUIO, 0, mmCKSVII2C_IC_STATUS);
> -
> -                       } while (numbytes &&  REG_GET_FIELD(reg, CKSVII2C_IC_STATUS, TFNF));
> -               }
> +               if (!REG_GET_FIELD(reg, CKSVII2C_IC_STATUS, TFNF)) {
> +                       /*
> +                        * We waited for too long for the transmission
> +                        * FIFO to become not-full.  Exit the loop
> +                        * with error.
> +                        */
> +                       if (time_after(jiffies, timeout_counter)) {
> +                               ret |= I2C_SW_TIMEOUT;
> +                               goto Err;
> +                       }
> +               } else {
> +                       reg = REG_SET_FIELD(reg, CKSVII2C_IC_DATA_CMD, DAT,
> +                                           data[bytes_sent]);
>
> -               /*
> -                * We waited too long for the transmission FIFO to become not-full.
> -                * Exit the loop with error.
> -                */
> -               if (time_after(jiffies, timeout_counter)) {
> -                       ret |= I2C_SW_TIMEOUT;
> -                       goto Err;
> +                       /* Final message, final byte, must generate a
> +                        * STOP to release the bus, i.e. don't hold
> +                        * SCL low.
> +                        */
> +                       if (numbytes == 1 && i2c_flag & I2C_M_STOP)
> +                               reg = REG_SET_FIELD(reg,
> +                                                   CKSVII2C_IC_DATA_CMD,
> +                                                   STOP, 1);
> +
> +                       if (bytes_sent == 0 && i2c_flag & I2C_X_RESTART)
> +                               reg = REG_SET_FIELD(reg,
> +                                                   CKSVII2C_IC_DATA_CMD,
> +                                                   RESTART, 1);
> +
> +                       /* Write */
> +                       reg = REG_SET_FIELD(reg, CKSVII2C_IC_DATA_CMD, CMD, 0);
> +                       WREG32_SOC15(SMUIO, 0, mmCKSVII2C_IC_DATA_CMD, reg);
> +
> +                       /* Record that the bytes were transmitted */
> +                       bytes_sent++;
> +                       numbytes--;
>                 }
>         }
>
> --
> 2.32.0
>
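The single-loop shape can be modelled outside the kernel roughly as
follows. This is only a sketch under stated assumptions: the FIFO model
(fake_fifo, fake_fifo_not_full) is hypothetical and is not the SMU
register interface, and a poll budget (max_polls) stands in for the
jiffies-based deadline used in the patch.

```c
#include <stddef.h>
#include <stdint.h>

#define FAKE_FIFO_DEPTH 4

/* Hypothetical transmit FIFO model; not the SMU hardware interface. */
struct fake_fifo {
	size_t  used;		/* bytes currently queued */
	size_t  drained;	/* bytes the fake hardware has consumed */
	uint8_t last_byte;	/* most recent byte written */
	int     stuck;		/* non-zero: the hardware never drains */
};

/* Status poll, analogous to reading a "FIFO not full" bit. Polling a
 * full FIFO reports "full" and, unless stuck, lets the fake hardware
 * consume one byte so that the next poll finds space.
 */
static int fake_fifo_not_full(struct fake_fifo *f)
{
	if (f->used == FAKE_FIFO_DEPTH) {
		if (!f->stuck) {
			f->used--;
			f->drained++;
		}
		return 0;
	}
	return 1;
}

/* One loop: each iteration either waits (bounded) or sends one byte.
 * Returns the number of bytes sent, or -1 on timeout.
 */
static int fake_transmit(struct fake_fifo *f, const uint8_t *data,
			 size_t numbytes, unsigned int max_polls)
{
	size_t bytes_sent = 0;
	unsigned int polls = 0;

	while (numbytes > 0) {
		if (!fake_fifo_not_full(f)) {
			/* Waited too long for space in the FIFO. */
			if (++polls > max_polls)
				return -1;
		} else {
			f->last_byte = data[bytes_sent];
			f->used++;
			bytes_sent++;
			numbytes--;
		}
	}

	return (int)bytes_sent;
}
```

The timeout check only runs on iterations where the FIFO is full, which
matches the restructured loop in the patch.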

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 33/40] drm/amd/pm: Fix a bug in i2c_xfer
  2021-06-10 21:12   ` Alex Deucher
@ 2021-06-10 22:26     ` Luben Tuikov
  0 siblings, 0 replies; 74+ messages in thread
From: Luben Tuikov @ 2021-06-10 22:26 UTC (permalink / raw)
  To: Alex Deucher; +Cc: Alex Deucher, amd-gfx list

On 2021-06-10 5:12 p.m., Alex Deucher wrote:
> On Tue, Jun 8, 2021 at 5:41 PM Luben Tuikov <luben.tuikov@amd.com> wrote:
>> "req" is now a pointer, i.e. it is no longer
>> allocated on the stack, thus taking its address
>> and passing that is a bug.
>>
>> This commit fixes this bug.
>>
>> Cc: Alex Deucher <alexander.deucher@amd.com>
>> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
> Can we just squash this into the original patch where this was broken?
> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>

Yeah, I'll do this--it'll be better this way.

Regards,
Luben

>
>> ---
>>  drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c       | 2 +-
>>  drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c         | 2 +-
>>  drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c | 2 +-
>>  3 files changed, 3 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c
>> index 0db79a5236e1f1..7d9a2946806f58 100644
>> --- a/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c
>> +++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c
>> @@ -1957,7 +1957,7 @@ static int arcturus_i2c_xfer(struct i2c_adapter *i2c_adap,
>>                 }
>>         }
>>         mutex_lock(&adev->smu.mutex);
>> -       r = smu_cmn_update_table(&adev->smu, SMU_TABLE_I2C_COMMANDS, 0, &req, true);
>> +       r = smu_cmn_update_table(&adev->smu, SMU_TABLE_I2C_COMMANDS, 0, req, true);
>>         mutex_unlock(&adev->smu.mutex);
>>         if (r)
>>                 goto fail;
>> diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
>> index 2acf54967c6ab1..0568cbfb023459 100644
>> --- a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
>> +++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
>> @@ -2752,7 +2752,7 @@ static int navi10_i2c_xfer(struct i2c_adapter *i2c_adap,
>>                 }
>>         }
>>         mutex_lock(&adev->smu.mutex);
>> -       r = smu_cmn_update_table(&adev->smu, SMU_TABLE_I2C_COMMANDS, 0, &req, true);
>> +       r = smu_cmn_update_table(&adev->smu, SMU_TABLE_I2C_COMMANDS, 0, req, true);
>>         mutex_unlock(&adev->smu.mutex);
>>         if (r)
>>                 goto fail;
>> diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
>> index 44ca3b3f83f4d9..091b3339faadb9 100644
>> --- a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
>> +++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
>> @@ -3440,7 +3440,7 @@ static int sienna_cichlid_i2c_xfer(struct i2c_adapter *i2c_adap,
>>                 }
>>         }
>>         mutex_lock(&adev->smu.mutex);
>> -       r = smu_cmn_update_table(&adev->smu, SMU_TABLE_I2C_COMMANDS, 0, &req, true);
>> +       r = smu_cmn_update_table(&adev->smu, SMU_TABLE_I2C_COMMANDS, 0, req, true);
>>         mutex_unlock(&adev->smu.mutex);
>>         if (r)
>>                 goto fail;
>> --
>> 2.32.0
>>
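For readers following along, the bug class can be shown with a small
user-space sketch. Everything here is hypothetical (fake_req,
fake_update_table, fake_xfer, and the use of calloc() in place of a
kernel allocator); it only demonstrates why a heap-allocated request
must be passed as "req" rather than "&req".

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical request layout; the real SMU request structure differs. */
struct fake_req {
	unsigned char num_cmds;
	unsigned char payload[7];
};

/* Stand-in for a helper that copies a table from a caller-supplied
 * buffer.
 */
static void fake_update_table(unsigned char *dst, const void *src, size_t len)
{
	memcpy(dst, src, len);
}

static int fake_xfer(unsigned char *table)
{
	struct fake_req *req = calloc(1, sizeof(*req));

	if (!req)
		return -1;

	req->num_cmds = 3;

	/* Correct: pass the pointer itself. Passing &req would hand over
	 * the address of the pointer variable, so the helper would copy
	 * the pointer's own bytes instead of the request it points to.
	 */
	fake_update_table(table, req, sizeof(*req));

	free(req);
	return 0;
}
```

When "req" was a struct on the stack, "&req" was the right argument;
once it became a pointer, the same expression silently changed meaning.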

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 20/40] drm/amdgpu: EEPROM respects I2C quirks
  2021-06-08 21:39 ` [PATCH 20/40] drm/amdgpu: EEPROM respects I2C quirks Luben Tuikov
@ 2021-06-11 17:01   ` Alex Deucher
  2021-06-11 17:17     ` Luben Tuikov
  0 siblings, 1 reply; 74+ messages in thread
From: Alex Deucher @ 2021-06-11 17:01 UTC (permalink / raw)
  To: Luben Tuikov
  Cc: Andrey Grodzovsky, Lijo Lazar, amd-gfx list, Stanley Yang,
	Alexander Deucher, Jean Delvare, Hawking Zhang

On Tue, Jun 8, 2021 at 5:40 PM Luben Tuikov <luben.tuikov@amd.com> wrote:
>
> Consult the i2c_adapter.quirks table for
> the maximum read/write data length per bus
> transaction. Do not exceed this transaction
> limit.
>
> Cc: Jean Delvare <jdelvare@suse.de>
> Cc: Alexander Deucher <Alexander.Deucher@amd.com>
> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
> Cc: Lijo Lazar <Lijo.Lazar@amd.com>
> Cc: Stanley Yang <Stanley.Yang@amd.com>
> Cc: Hawking Zhang <Hawking.Zhang@amd.com>
> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.c | 80 +++++++++++++++++-----
>  1 file changed, 64 insertions(+), 16 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.c
> index 7fdb5bd2fc8bc8..94aeda1c7f8ca0 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.c
> @@ -32,20 +32,9 @@
>
>  #define EEPROM_OFFSET_SIZE 2
>
> -/**
> - * amdgpu_eeprom_xfer -- Read/write from/to an I2C EEPROM device
> - * @i2c_adap: pointer to the I2C adapter to use
> - * @slave_addr: I2C address of the slave device
> - * @eeprom_addr: EEPROM address from which to read/write
> - * @eeprom_buf: pointer to data buffer to read into/write from
> - * @buf_size: the size of @eeprom_buf
> - * @read: True if reading from the EEPROM, false if writing
> - *
> - * Returns the number of bytes read/written; -errno on error.
> - */
> -int amdgpu_eeprom_xfer(struct i2c_adapter *i2c_adap,
> -                      u16 slave_addr, u16 eeprom_addr,
> -                      u8 *eeprom_buf, u16 buf_size, bool read)
> +static int __amdgpu_eeprom_xfer(struct i2c_adapter *i2c_adap,
> +                               u16 slave_addr, u16 eeprom_addr,
> +                               u8 *eeprom_buf, u16 buf_size, bool read)
>  {
>         u8 eeprom_offset_buf[EEPROM_OFFSET_SIZE];
>         struct i2c_msg msgs[] = {
> @@ -65,8 +54,8 @@ int amdgpu_eeprom_xfer(struct i2c_adapter *i2c_adap,
>         u16 len;
>
>         r = 0;
> -       for (len = 0; buf_size > 0;
> -            buf_size -= len, eeprom_addr += len, eeprom_buf += len) {
> +       for ( ; buf_size > 0;
> +             buf_size -= len, eeprom_addr += len, eeprom_buf += len) {
>                 /* Set the EEPROM address we want to write to/read from.
>                  */
>                 msgs[0].buf[0] = (eeprom_addr >> 8) & 0xff;
> @@ -120,3 +109,62 @@ int amdgpu_eeprom_xfer(struct i2c_adapter *i2c_adap,
>
>         return r < 0 ? r : eeprom_buf - p;
>  }
> +
> +/**
> + * amdgpu_eeprom_xfer -- Read/write from/to an I2C EEPROM device
> + * @i2c_adap: pointer to the I2C adapter to use
> + * @slave_addr: I2C address of the slave device
> + * @eeprom_addr: EEPROM address from which to read/write
> + * @eeprom_buf: pointer to data buffer to read into/write from
> + * @buf_size: the size of @eeprom_buf
> + * @read: True if reading from the EEPROM, false if writing
> + *
> + * Returns the number of bytes read/written; -errno on error.
> + */
> +int amdgpu_eeprom_xfer(struct i2c_adapter *i2c_adap,
> +                      u16 slave_addr, u16 eeprom_addr,
> +                      u8 *eeprom_buf, u16 buf_size, bool read)
> +{
> +       const struct i2c_adapter_quirks *quirks = i2c_adap->quirks;
> +       u16 limit;
> +
> +       if (!quirks)
> +               limit = 0;
> +       else if (read)
> +               limit = quirks->max_read_len;
> +       else
> +               limit = quirks->max_write_len;
> +
> +       if (limit == 0) {
> +               return __amdgpu_eeprom_xfer(i2c_adap, slave_addr, eeprom_addr,
> +                                           eeprom_buf, buf_size, read);
> +       } else if (limit <= EEPROM_OFFSET_SIZE) {
> +               dev_err_ratelimited(&i2c_adap->dev,
> +                                   "maddr:0x%04X size:0x%02X:quirk max_%s_len must be > %d",
> +                                   eeprom_addr, buf_size,
> +                                   read ? "read" : "write", EEPROM_OFFSET_SIZE);
> +               return -EINVAL;

I presume we handle this case properly at higher levels (i.e., split
up EEPROM updates into smaller transactions)?

Alex


> +       } else {
> +               u16 ps; /* Partial size */
> +               int res = 0, r;
> +
> +               /* The "limit" includes all data bytes sent/received,
> +                * which would include the EEPROM_OFFSET_SIZE bytes.
> +                * Account for them here.
> +                */
> +               limit -= EEPROM_OFFSET_SIZE;
> +               for ( ; buf_size > 0;
> +                     buf_size -= ps, eeprom_addr += ps, eeprom_buf += ps) {
> +                       ps = min(limit, buf_size);
> +
> +                       r = __amdgpu_eeprom_xfer(i2c_adap,
> +                                                slave_addr, eeprom_addr,
> +                                                eeprom_buf, ps, read);
> +                       if (r < 0)
> +                               return r;
> +                       res += r;
> +               }
> +
> +               return res;
> +       }
> +}
> --
> 2.32.0
>
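The splitting rule in the patch can be sketched in isolation as below.
This is a hedged user-space model: fake_split() and FAKE_OFFSET_SIZE are
hypothetical names, and instead of performing I2C transfers the function
only records the payload size of each transaction it would issue.

```c
#include <stddef.h>
#include <stdint.h>

/* Address bytes sent with every transaction; plays the role of
 * EEPROM_OFFSET_SIZE in the patch.
 */
#define FAKE_OFFSET_SIZE 2

/* Split buf_size bytes into per-transaction payloads that respect
 * "limit" (0 means no limit). Returns the number of transactions, or
 * -1 if the limit leaves no room for payload or chunks[] is too small.
 */
static int fake_split(uint16_t buf_size, uint16_t limit,
		      uint16_t *chunks, size_t max_chunks)
{
	size_t n = 0;
	uint16_t ps; /* partial size */

	if (limit == 0) {
		/* No quirk limit: hand the whole buffer down in one call. */
		if (max_chunks < 1)
			return -1;
		chunks[0] = buf_size;
		return 1;
	}

	if (limit <= FAKE_OFFSET_SIZE)
		return -1;

	/* The limit counts the address bytes too, so subtract them. */
	limit -= FAKE_OFFSET_SIZE;

	for ( ; buf_size > 0; buf_size -= ps) {
		ps = buf_size < limit ? buf_size : limit;
		if (n == max_chunks)
			return -1;
		chunks[n++] = ps;
	}

	return (int)n;
}
```

For example, a 20-byte buffer with a limit of 8 leaves 6 payload bytes
per transaction, giving chunks of 6, 6, 6 and 2.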

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 35/40] drm/amdgpu: Simplify RAS EEPROM checksum calculations
  2021-06-08 21:39 ` [PATCH 35/40] drm/amdgpu: Simplify RAS EEPROM checksum calculations Luben Tuikov
@ 2021-06-11 17:07   ` Alex Deucher
  0 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2021-06-11 17:07 UTC (permalink / raw)
  To: Luben Tuikov; +Cc: Alexander Deucher, Andrey Grodzovsky, amd-gfx list

On Tue, Jun 8, 2021 at 5:41 PM Luben Tuikov <luben.tuikov@amd.com> wrote:
>
> Rename update_table_header() to
> write_table_header() as this function is actually
> writing it to EEPROM.
>
> Use kernel types; use u8 to carry the
> checksum, in order to take advantage of arithmetic
> modulo 256 (8 bits).
>
> Tidy up to 80 columns.
>
> When updating the checksum, just recalculate the
> whole thing.
>
> Cc: Alexander Deucher <Alexander.Deucher@amd.com>
> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>

Acked-by: Alex Deucher <alexander.deucher@amd.com>

> ---
>  .../gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c    | 98 +++++++++----------
>  .../gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h    |  2 +-
>  2 files changed, 50 insertions(+), 50 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
> index 7d0f9e1e62dc4f..54ef31594accd9 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
> @@ -141,8 +141,8 @@ static void __decode_table_header_from_buff(struct amdgpu_ras_eeprom_table_heade
>         hdr->checksum         = le32_to_cpu(pp[4]);
>  }
>
> -static int __update_table_header(struct amdgpu_ras_eeprom_control *control,
> -                                unsigned char *buff)
> +static int __write_table_header(struct amdgpu_ras_eeprom_control *control,
> +                               unsigned char *buff)
>  {
>         int ret = 0;
>         struct amdgpu_device *adev = to_amdgpu_device(control);
> @@ -162,69 +162,74 @@ static int __update_table_header(struct amdgpu_ras_eeprom_control *control,
>         return ret;
>  }
>
> -static uint32_t  __calc_hdr_byte_sum(struct amdgpu_ras_eeprom_control *control)
> +static u8 __calc_hdr_byte_sum(const struct amdgpu_ras_eeprom_control *control)
>  {
>         int i;
> -       uint32_t tbl_sum = 0;
> +       u8 hdr_sum = 0;
> +       u8  *p;
> +       size_t sz;
>
>         /* Header checksum, skip checksum field in the calculation */
> -       for (i = 0; i < sizeof(control->tbl_hdr) - sizeof(control->tbl_hdr.checksum); i++)
> -               tbl_sum += *(((unsigned char *)&control->tbl_hdr) + i);
> +       sz = sizeof(control->tbl_hdr) - sizeof(control->tbl_hdr.checksum);
> +       p = (u8 *) &control->tbl_hdr;
> +       for (i = 0; i < sz; i++, p++)
> +               hdr_sum += *p;
>
> -       return tbl_sum;
> +       return hdr_sum;
>  }
>
> -static uint32_t  __calc_recs_byte_sum(struct eeprom_table_record *records,
> -                                     int num)
> +static u8 __calc_recs_byte_sum(const struct eeprom_table_record *record,
> +                              const int num)
>  {
>         int i, j;
> -       uint32_t tbl_sum = 0;
> +       u8  tbl_sum = 0;
> +
> +       if (!record)
> +               return 0;
>
>         /* Records checksum */
>         for (i = 0; i < num; i++) {
> -               struct eeprom_table_record *record = &records[i];
> +               u8 *p = (u8 *) &record[i];
>
> -               for (j = 0; j < sizeof(*record); j++) {
> -                       tbl_sum += *(((unsigned char *)record) + j);
> -               }
> +               for (j = 0; j < sizeof(*record); j++, p++)
> +                       tbl_sum += *p;
>         }
>
>         return tbl_sum;
>  }
>
> -static inline uint32_t  __calc_tbl_byte_sum(struct amdgpu_ras_eeprom_control *control,
> -                                 struct eeprom_table_record *records, int num)
> +static inline u8
> +__calc_tbl_byte_sum(struct amdgpu_ras_eeprom_control *control,
> +                   struct eeprom_table_record *records, int num)
>  {
> -       return __calc_hdr_byte_sum(control) + __calc_recs_byte_sum(records, num);
> +       return __calc_hdr_byte_sum(control) +
> +               __calc_recs_byte_sum(records, num);
>  }
>
> -/* Checksum = 256 -((sum of all table entries) mod 256) */
>  static void __update_tbl_checksum(struct amdgpu_ras_eeprom_control *control,
> -                                 struct eeprom_table_record *records, int num,
> -                                 uint32_t old_hdr_byte_sum)
> +                                 struct eeprom_table_record *records, int num)
>  {
> -       /*
> -        * This will update the table sum with new records.
> -        *
> -        * TODO: What happens when the EEPROM table is to be wrapped around
> -        * and old records from start will get overridden.
> -        */
> -
> -       /* need to recalculate updated header byte sum */
> -       control->tbl_byte_sum -= old_hdr_byte_sum;
> -       control->tbl_byte_sum += __calc_tbl_byte_sum(control, records, num);
> +       u8 v;
>
> -       control->tbl_hdr.checksum = 256 - (control->tbl_byte_sum % 256);
> +       control->tbl_byte_sum = __calc_tbl_byte_sum(control, records, num);
> +       /* Avoid 32-bit sign extension. */
> +       v = -control->tbl_byte_sum;
> +       control->tbl_hdr.checksum = v;
>  }
>
> -/* table sum mod 256 + checksum must equals 256 */
> -static bool __validate_tbl_checksum(struct amdgpu_ras_eeprom_control *control,
> -                           struct eeprom_table_record *records, int num)
> +static bool __verify_tbl_checksum(struct amdgpu_ras_eeprom_control *control,
> +                                 struct eeprom_table_record *records,
> +                                 int num)
>  {
> +       u8 result;
> +
>         control->tbl_byte_sum = __calc_tbl_byte_sum(control, records, num);
>
> -       if (control->tbl_hdr.checksum + (control->tbl_byte_sum % 256) != 256) {
> -               DRM_WARN("Checksum mismatch, checksum: %u ", control->tbl_hdr.checksum);
> +       result = (u8)control->tbl_hdr.checksum + control->tbl_byte_sum;
> +       if (result) {
> +               DRM_WARN("RAS table checksum mismatch: stored:0x%02X wants:0x%02hhX",
> +                        control->tbl_hdr.checksum,
> +                        -control->tbl_byte_sum);
>                 return false;
>         }
>
> @@ -232,8 +237,8 @@ static bool __validate_tbl_checksum(struct amdgpu_ras_eeprom_control *control,
>  }
>
>  static int amdgpu_ras_eeprom_correct_header_tag(
> -                               struct amdgpu_ras_eeprom_control *control,
> -                               uint32_t header)
> +       struct amdgpu_ras_eeprom_control *control,
> +       uint32_t header)
>  {
>         unsigned char buff[RAS_TABLE_HEADER_SIZE];
>         struct amdgpu_ras_eeprom_table_header *hdr = &control->tbl_hdr;
> @@ -243,7 +248,7 @@ static int amdgpu_ras_eeprom_correct_header_tag(
>
>         mutex_lock(&control->tbl_mutex);
>         hdr->header = header;
> -       ret = __update_table_header(control, buff);
> +       ret = __write_table_header(control, buff);
>         mutex_unlock(&control->tbl_mutex);
>
>         return ret;
> @@ -262,11 +267,9 @@ int amdgpu_ras_eeprom_reset_table(struct amdgpu_ras_eeprom_control *control)
>         hdr->first_rec_offset = RAS_RECORD_START;
>         hdr->tbl_size = RAS_TABLE_HEADER_SIZE;
>
> -       control->tbl_byte_sum = 0;
> -       __update_tbl_checksum(control, NULL, 0, 0);
> +       __update_tbl_checksum(control, NULL, 0);
>         control->next_addr = RAS_RECORD_START;
> -
> -       ret = __update_table_header(control, buff);
> +       ret = __write_table_header(control, buff);
>
>         mutex_unlock(&control->tbl_mutex);
>
> @@ -521,8 +524,6 @@ static int amdgpu_ras_eeprom_xfer(struct amdgpu_ras_eeprom_control *control,
>         }
>
>         if (write) {
> -               uint32_t old_hdr_byte_sum = __calc_hdr_byte_sum(control);
> -
>                 /*
>                  * Update table header with size and CRC and account for table
>                  * wrap around where the assumption is that we treat it as empty
> @@ -537,10 +538,9 @@ static int amdgpu_ras_eeprom_xfer(struct amdgpu_ras_eeprom_control *control,
>                         control->tbl_hdr.tbl_size = RAS_TABLE_HEADER_SIZE +
>                         control->num_recs * RAS_TABLE_RECORD_SIZE;
>
> -               __update_tbl_checksum(control, records, num, old_hdr_byte_sum);
> -
> -               __update_table_header(control, buffs);
> -       } else if (!__validate_tbl_checksum(control, records, num)) {
> +               __update_tbl_checksum(control, records, num);
> +               __write_table_header(control, buffs);
> +       } else if (!__verify_tbl_checksum(control, records, num)) {
>                 DRM_WARN("EEPROM Table checksum mismatch!");
>                 /* TODO Uncomment when EEPROM read/write is relliable */
>                 /* ret = -EIO; */
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h
> index fa9c509a8e2f2b..4906ed9fb8cdd3 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h
> @@ -48,7 +48,7 @@ struct amdgpu_ras_eeprom_control {
>         uint32_t next_addr;
>         unsigned int num_recs;
>         struct mutex tbl_mutex;
> -       uint32_t tbl_byte_sum;
> +       u8 tbl_byte_sum;
>  };
>
>  /*
> --
> 2.32.0
>
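The modulo-256 idea is easy to check in a few lines of plain C. The
sketch below is not the driver code; byte_sum(), make_checksum() and
verify_checksum() are hypothetical helpers that assume the checksum is
the 8-bit two's complement of the byte sum, so that sum plus checksum
wraps to zero.

```c
#include <stddef.h>
#include <stdint.h>

/* Sum all bytes; uint8_t arithmetic wraps modulo 256. */
static uint8_t byte_sum(const uint8_t *p, size_t len)
{
	uint8_t sum = 0;

	while (len--)
		sum = (uint8_t)(sum + *p++);

	return sum;
}

/* The checksum is the value that brings the byte sum back to zero.
 * Negate in unsigned arithmetic and truncate to 8 bits, which avoids
 * any sign extension of the promoted value.
 */
static uint8_t make_checksum(const uint8_t *p, size_t len)
{
	return (uint8_t)(0u - byte_sum(p, len));
}

/* Valid when sum + checksum is zero modulo 256. */
static int verify_checksum(const uint8_t *p, size_t len, uint8_t checksum)
{
	return (uint8_t)(byte_sum(p, len) + checksum) == 0;
}
```

For the four bytes 0x41 0x4D 0x44 0x52 the byte sum is 0x24, so the
checksum is 0xDC, and 0x24 + 0xDC wraps to 0x00.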

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 38/40] drm/amdgpu: RAS EEPROM table is now in debugfs
  2021-06-08 21:39 ` [PATCH 38/40] drm/amdgpu: RAS EEPROM table is now in debugfs Luben Tuikov
@ 2021-06-11 17:16   ` Alex Deucher
  2021-06-11 17:30     ` Luben Tuikov
  0 siblings, 1 reply; 74+ messages in thread
From: Alex Deucher @ 2021-06-11 17:16 UTC (permalink / raw)
  To: Luben Tuikov
  Cc: Andrey Grodzovsky, Xinhui Pan, amd-gfx list, Alexander Deucher,
	John Clements, Hawking Zhang

On Tue, Jun 8, 2021 at 5:41 PM Luben Tuikov <luben.tuikov@amd.com> wrote:
>
> Add "ras_eeprom_size" file in debugfs, which
> reports the maximum size allocated to the RAS
> table in EEPROM, as the number of bytes and the
> number of records it could store. For instance,
>
> $cat /sys/kernel/debug/dri/0/ras/ras_eeprom_size
> 262144 bytes or 10921 records
> $_
>
> Add "ras_eeprom_table" file in debugfs, which
> dumps the RAS table stored in EEPROM, in a formatted
> way. For instance,
>
> $cat ras_eeprom_table
>  Signature    Version  FirstOffs       Size   Checksum
> 0x414D4452 0x00010000 0x00000014 0x000000EC 0x000000DA
> Index  Offset ErrType Bank/CU          TimeStamp      Offs/Addr MemChl MCUMCID    RetiredPage
>     0 0x00014      ue    0x00 0x00000000607608DC 0x000000000000   0x00    0x00 0x000000000000
>     1 0x0002C      ue    0x00 0x00000000607608DC 0x000000001000   0x00    0x00 0x000000000001
>     2 0x00044      ue    0x00 0x00000000607608DC 0x000000002000   0x00    0x00 0x000000000002
>     3 0x0005C      ue    0x00 0x00000000607608DC 0x000000003000   0x00    0x00 0x000000000003
>     4 0x00074      ue    0x00 0x00000000607608DC 0x000000004000   0x00    0x00 0x000000000004
>     5 0x0008C      ue    0x00 0x00000000607608DC 0x000000005000   0x00    0x00 0x000000000005
>     6 0x000A4      ue    0x00 0x00000000607608DC 0x000000006000   0x00    0x00 0x000000000006
>     7 0x000BC      ue    0x00 0x00000000607608DC 0x000000007000   0x00    0x00 0x000000000007
>     8 0x000D4      ue    0x00 0x00000000607608DD 0x000000008000   0x00    0x00 0x000000000008
> $_
>
> Cc: Alexander Deucher <Alexander.Deucher@amd.com>
> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
> Cc: John Clements <john.clements@amd.com>
> Cc: Hawking Zhang <Hawking.Zhang@amd.com>
> Cc: Xinhui Pan <xinhui.pan@amd.com>
> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>

Seems like a useful feature.  Just a few comments below.

Alex
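
For readers checking the sample output quoted above, the numbers are self-consistent. The following is a minimal sketch of that arithmetic, assuming a 20-byte header (FirstOffs 0x14) and a 24-byte record (the 0x18 spacing between the Offset values of records 0 and 1). The SKETCH_* constants and sketch_* helpers are hypothetical names read off the sample output; they are not the driver's own macros. The last helper mirrors the shape of RAS_RI_TO_AI from the quoted hunk.

```c
#include <assert.h>
#include <stdint.h>

/* All values below are inferred from the sample output in the commit
 * message; they are assumptions, not copied from the driver headers.
 */
#define SKETCH_TBL_BYTES 262144u /* "262144 bytes" from ras_eeprom_size */
#define SKETCH_HDR_BYTES 0x14u   /* FirstOffs column of the table header */
#define SKETCH_REC_BYTES 0x18u   /* 0x2C - 0x14: spacing between records */

/* Maximum number of whole records that fit after the header. */
static uint32_t sketch_max_records(void)
{
	return (SKETCH_TBL_BYTES - SKETCH_HDR_BYTES) / SKETCH_REC_BYTES;
}

/* Number of records implied by the header's Size field. */
static uint32_t sketch_num_records(uint32_t tbl_size)
{
	assert(tbl_size >= SKETCH_HDR_BYTES);
	return (tbl_size - SKETCH_HDR_BYTES) / SKETCH_REC_BYTES;
}

/* Byte offset of the record stored at absolute index ai. */
static uint32_t sketch_index_to_offset(uint32_t ai)
{
	return SKETCH_HDR_BYTES + ai * SKETCH_REC_BYTES;
}

/* Relative index (counted from the first record index "fri") to
 * absolute index, wrapping around at the table capacity.
 */
static uint32_t sketch_ri_to_ai(uint32_t ri, uint32_t fri, uint32_t max_recs)
{
	assert(max_recs > 0);
	return (ri + fri) % max_recs;
}
```

Under these assumptions, sketch_max_records() gives the 10921 records reported by ras_eeprom_size, sketch_num_records(0xEC) gives the 9 records listed in the dump, and sketch_index_to_offset(8) gives 0xD4, the Offset shown for index 8.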


> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c       |  12 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h       |   1 +
>  .../gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c    | 241 +++++++++++++++++-
>  .../gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h    |  10 +-
>  4 files changed, 252 insertions(+), 12 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> index 1424f2cc2076c1..d791a360a92366 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> @@ -404,9 +404,9 @@ static ssize_t amdgpu_ras_debugfs_ctrl_write(struct file *f,
>                 /* umc ce/ue error injection for a bad page is not allowed */
>                 if ((data.head.block == AMDGPU_RAS_BLOCK__UMC) &&
>                     amdgpu_ras_check_bad_page(adev, data.inject.address)) {
> -                       dev_warn(adev->dev, "RAS WARN: 0x%llx has been marked "
> -                                       "as bad before error injection!\n",
> -                                       data.inject.address);
> +                       dev_warn(adev->dev, "RAS WARN: inject: 0x%llx has "
> +                                "already been marked as bad!\n",
> +                                data.inject.address);

This seems unrelated to this patch.

>                         break;
>                 }
>
> @@ -1301,6 +1301,12 @@ static struct dentry *amdgpu_ras_debugfs_create_ctrl_node(struct amdgpu_device *
>                            &con->bad_page_cnt_threshold);
>         debugfs_create_x32("ras_hw_enabled", 0444, dir, &adev->ras_hw_enabled);
>         debugfs_create_x32("ras_enabled", 0444, dir, &adev->ras_enabled);
> +       debugfs_create_file("ras_eeprom_size", S_IRUGO, dir, adev,
> +                           &amdgpu_ras_debugfs_eeprom_size_ops);
> +       con->de_ras_eeprom_table = debugfs_create_file("ras_eeprom_table",
> +                                                      S_IRUGO, dir, adev,
> +                                                      &amdgpu_ras_debugfs_eeprom_table_ops);
> +       amdgpu_ras_debugfs_set_ret_size(&con->eeprom_control);
>
>         /*
>          * After one uncorrectable error happens, usually GPU recovery will
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h
> index 256cea5d34f2b6..283afd791db107 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h
> @@ -318,6 +318,7 @@ struct amdgpu_ras {
>         /* sysfs */
>         struct device_attribute features_attr;
>         struct bin_attribute badpages_attr;
> +       struct dentry *de_ras_eeprom_table;
>         /* block array */
>         struct ras_manager *objs;
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
> index dc4a845a32404c..677e379f5fb5e9 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
> @@ -27,6 +27,8 @@
>  #include <linux/bits.h>
>  #include "atom.h"
>  #include "amdgpu_eeprom.h"
> +#include <linux/debugfs.h>
> +#include <linux/uaccess.h>
>
>  #define EEPROM_I2C_MADDR_VEGA20         0x0
>  #define EEPROM_I2C_MADDR_ARCTURUS       0x40000
> @@ -70,6 +72,13 @@
>  #define RAS_OFFSET_TO_INDEX(_C, _O) (((_O) - \
>                                       (_C)->ras_record_offset) / RAS_TABLE_RECORD_SIZE)
>
> +/* Given a 0-based relative record index, 0, 1, 2, ..., etc., off
> + * of "fri", return the absolute record index off of the end of
> + * the table header.
> + */
> +#define RAS_RI_TO_AI(_C, _I) (((_I) + (_C)->ras_fri) % \
> +                             (_C)->ras_max_record_count)
> +
>  #define RAS_NUM_RECS(_tbl_hdr)  (((_tbl_hdr)->tbl_size - \
>                                   RAS_TABLE_HEADER_SIZE) / RAS_TABLE_RECORD_SIZE)
>
> @@ -77,13 +86,10 @@
>
>  static bool __is_ras_eeprom_supported(struct amdgpu_device *adev)
>  {
> -       if ((adev->asic_type == CHIP_VEGA20) ||
> -           (adev->asic_type == CHIP_ARCTURUS) ||
> -           (adev->asic_type == CHIP_SIENNA_CICHLID) ||
> -           (adev->asic_type == CHIP_ALDEBARAN))
> -               return true;
> -
> -       return false;
> +       return  adev->asic_type == CHIP_VEGA20 ||
> +               adev->asic_type == CHIP_ARCTURUS ||
> +               adev->asic_type == CHIP_SIENNA_CICHLID ||
> +               adev->asic_type == CHIP_ALDEBARAN;

Unrelated whitespace change.

>  }
>
>  static bool __get_eeprom_i2c_addr_arct(struct amdgpu_device *adev,
> @@ -258,6 +264,8 @@ int amdgpu_ras_eeprom_reset_table(struct amdgpu_ras_eeprom_control *control)
>         control->ras_num_recs = 0;
>         control->ras_fri = 0;
>
> +       amdgpu_ras_debugfs_set_ret_size(control);
> +
>         mutex_unlock(&control->ras_tbl_mutex);
>
>         return res;
> @@ -591,6 +599,8 @@ int amdgpu_ras_eeprom_append(struct amdgpu_ras_eeprom_control *control,
>         res = amdgpu_ras_eeprom_append_table(control, record, num);
>         if (!res)
>                 res = amdgpu_ras_eeprom_update_header(control);
> +       if (!res)
> +               amdgpu_ras_debugfs_set_ret_size(control);
>
>         mutex_unlock(&control->ras_tbl_mutex);
>         return res;
> @@ -734,6 +744,223 @@ inline uint32_t amdgpu_ras_eeprom_max_record_count(void)
>         return RAS_MAX_RECORD_COUNT;
>  }
>
> +static ssize_t
> +amdgpu_ras_debugfs_eeprom_size_read(struct file *f, char __user *buf,
> +                                   size_t size, loff_t *pos)
> +{
> +       struct amdgpu_device *adev = (struct amdgpu_device *)file_inode(f)->i_private;
> +       struct amdgpu_ras *ras = amdgpu_ras_get_context(adev);
> +       struct amdgpu_ras_eeprom_control *control = ras ? &ras->eeprom_control : NULL;
> +       u8 data[50];
> +       int res;
> +
> +       if (!size)
> +               return size;
> +
> +       if (!ras || !control) {
> +               res = snprintf(data, sizeof(data), "Not supported\n");
> +       } else {
> +               res = snprintf(data, sizeof(data), "%d bytes or %d records\n",
> +                              RAS_TBL_SIZE_BYTES, control->ras_max_record_count);
> +       }
> +
> +       if (*pos >= res)
> +               return 0;
> +
> +       res -= *pos;
> +       res = min_t(size_t, res, size);
> +
> +       if (copy_to_user(buf, &data[*pos], res))
> +               return -EINVAL;
> +
> +       *pos += res;
> +
> +       return res;
> +}
> +
> +const struct file_operations amdgpu_ras_debugfs_eeprom_size_ops = {
> +       .owner = THIS_MODULE,
> +       .read = amdgpu_ras_debugfs_eeprom_size_read,
> +       .write = NULL,
> +       .llseek = default_llseek,
> +};
> +
> +static const char *tbl_hdr_str = " Signature    Version  FirstOffs       Size   Checksum\n";
> +static const char *tbl_hdr_fmt = "0x%08X 0x%08X 0x%08X 0x%08X 0x%08X\n";
> +#define tbl_hdr_fmt_size (5 * (2+8) + 4 + 1)
> +static const char *rec_hdr_str = "Index  Offset ErrType Bank/CU          TimeStamp      Offs/Addr MemChl MCUMCID    RetiredPage\n";
> +static const char *rec_hdr_fmt = "%5d 0x%05X %7s    0x%02X 0x%016llX 0x%012llX   0x%02X    0x%02X 0x%012llX\n";
> +#define rec_hdr_fmt_size (5 + 1 + 7 + 1 + 7 + 1 + 7 + 1 + 18 + 1 + 14 + 1 + 6 + 1 + 7 + 1 + 14 + 1)
> +
> +static const char *record_err_type_str[AMDGPU_RAS_EEPROM_ERR_COUNT] = {
> +       "ignore",
> +       "re",
> +       "ue",
> +};
> +
> +static loff_t amdgpu_ras_debugfs_table_size(struct amdgpu_ras_eeprom_control *control)
> +{
> +       return strlen(tbl_hdr_str) + tbl_hdr_fmt_size +
> +               strlen(rec_hdr_str) + rec_hdr_fmt_size * control->ras_num_recs;
> +}
> +
> +void amdgpu_ras_debugfs_set_ret_size(struct amdgpu_ras_eeprom_control *control)
> +{
> +       struct amdgpu_ras *ras = container_of(control, struct amdgpu_ras,
> +                                             eeprom_control);
> +       struct dentry *de = ras->de_ras_eeprom_table;
> +
> +       if (de)
> +               d_inode(de)->i_size = amdgpu_ras_debugfs_table_size(control);
> +}
> +
> +static ssize_t amdgpu_ras_debugfs_table_read(struct file *f, char __user *buf,
> +                                            size_t size, loff_t *pos)
> +{
> +       struct amdgpu_device *adev = (struct amdgpu_device *)file_inode(f)->i_private;
> +       struct amdgpu_ras *ras = amdgpu_ras_get_context(adev);
> +       struct amdgpu_ras_eeprom_control *control = &ras->eeprom_control;
> +       const size_t orig_size = size;
> +       int res = -EINVAL;
> +       size_t data_len;
> +
> +       mutex_lock(&control->ras_tbl_mutex);
> +
> +       /* We want *pos - data_len > 0, which means there's
> +        * bytes to be printed from data.
> +        */
> +       data_len = strlen(tbl_hdr_str);
> +       if (*pos < data_len) {
> +               data_len -= *pos;
> +               data_len = min_t(size_t, data_len, size);
> +               if (copy_to_user(buf, &tbl_hdr_str[*pos], data_len))
> +                       goto Out;
> +               buf += data_len;
> +               size -= data_len;
> +               *pos += data_len;
> +       }
> +
> +       data_len = strlen(tbl_hdr_str) + tbl_hdr_fmt_size;
> +       if (*pos < data_len && size > 0) {
> +               u8 data[tbl_hdr_fmt_size + 1];
> +               loff_t lpos;
> +
> +               snprintf(data, sizeof(data), tbl_hdr_fmt,
> +                        control->tbl_hdr.header,
> +                        control->tbl_hdr.version,
> +                        control->tbl_hdr.first_rec_offset,
> +                        control->tbl_hdr.tbl_size,
> +                        control->tbl_hdr.checksum);
> +
> +               data_len -= *pos;
> +               data_len = min_t(size_t, data_len, size);
> +               lpos = *pos - strlen(tbl_hdr_str);
> +               if (copy_to_user(buf, &data[lpos], data_len))
> +                       goto Out;
> +               buf += data_len;
> +               size -= data_len;
> +               *pos += data_len;
> +       }
> +
> +       data_len = strlen(tbl_hdr_str) + tbl_hdr_fmt_size + strlen(rec_hdr_str);
> +       if (*pos < data_len && size > 0) {
> +               loff_t lpos;
> +
> +               data_len -= *pos;
> +               data_len = min_t(size_t, data_len, size);
> +               lpos = *pos - strlen(tbl_hdr_str) - tbl_hdr_fmt_size;
> +               if (copy_to_user(buf, &rec_hdr_str[lpos], data_len))
> +                       goto Out;
> +               buf += data_len;
> +               size -= data_len;
> +               *pos += data_len;
> +       }
> +
> +       data_len = amdgpu_ras_debugfs_table_size(control);
> +       if (*pos < data_len && size > 0) {
> +               u8 dare[RAS_TABLE_RECORD_SIZE];
> +               u8 data[rec_hdr_fmt_size + 1];
> +               /* Find the starting record index
> +                */
> +               int s = (*pos - strlen(tbl_hdr_str) - tbl_hdr_fmt_size -
> +                        strlen(rec_hdr_str)) / rec_hdr_fmt_size;
> +               int r = (*pos - strlen(tbl_hdr_str) - tbl_hdr_fmt_size -
> +                        strlen(rec_hdr_str)) % rec_hdr_fmt_size;
> +               struct eeprom_table_record record;
> +
> +               for ( ; size > 0 && s < control->ras_num_recs; s++) {
> +                       u32 ai = RAS_RI_TO_AI(control, s);
> +                       /* Read a single record
> +                        */
> +                       res = __amdgpu_ras_eeprom_read(control, dare, ai, 1);
> +                       if (res)
> +                               goto Out;
> +                       __decode_table_record_from_buf(control, &record, dare);
> +                       snprintf(data, sizeof(data), rec_hdr_fmt,
> +                                s,
> +                                RAS_INDEX_TO_OFFSET(control, ai),
> +                                record_err_type_str[record.err_type],
> +                                record.bank,
> +                                record.ts,
> +                                record.offset,
> +                                record.mem_channel,
> +                                record.mcumc_id,
> +                                record.retired_page);
> +
> +                       data_len = min_t(size_t, rec_hdr_fmt_size - r, size);
> +                       if (copy_to_user(buf, &data[r], data_len))
> +                               return -EINVAL;
> +                       buf += data_len;
> +                       size -= data_len;
> +                       *pos += data_len;
> +                       r = 0;
> +               }
> +       }
> +       res = 0;
> +Out:
> +       mutex_unlock(&control->ras_tbl_mutex);
> +       return res < 0 ? res : orig_size - size;
> +}
> +
> +static ssize_t
> +amdgpu_ras_debugfs_eeprom_table_read(struct file *f, char __user *buf,
> +                                    size_t size, loff_t *pos)
> +{
> +       struct amdgpu_device *adev = (struct amdgpu_device *)file_inode(f)->i_private;
> +       struct amdgpu_ras *ras = amdgpu_ras_get_context(adev);
> +       struct amdgpu_ras_eeprom_control *control = ras ? &ras->eeprom_control : NULL;
> +       u8 data[81];
> +       int res;
> +
> +       if (!size)
> +               return size;
> +
> +       if (!ras || !control) {
> +               res = snprintf(data, sizeof(data), "Not supported\n");
> +               if (*pos >= res)
> +                       return 0;
> +
> +               res -= *pos;
> +               res = min_t(size_t, res, size);
> +
> +               if (copy_to_user(buf, &data[*pos], res))
> +                       return -EINVAL;
> +
> +               *pos += res;
> +
> +               return res;
> +       } else {
> +               return amdgpu_ras_debugfs_table_read(f, buf, size, pos);
> +       }
> +}
> +
> +const struct file_operations amdgpu_ras_debugfs_eeprom_table_ops = {
> +       .owner = THIS_MODULE,
> +       .read = amdgpu_ras_debugfs_eeprom_table_read,
> +       .write = NULL,
> +       .llseek = default_llseek,
> +};
> +
>  /**
>   * __verify_ras_table_checksum -- verify the RAS EEPROM table checksum
>   * @control: pointer to control structure
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h
> index edb0195ea2eb8c..430e08ab3313a2 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h
> @@ -29,9 +29,10 @@
>  struct amdgpu_device;
>
>  enum amdgpu_ras_eeprom_err_type {
> -       AMDGPU_RAS_EEPROM_ERR_PLACE_HOLDER,
> +       AMDGPU_RAS_EEPROM_ERR_NA,
>         AMDGPU_RAS_EEPROM_ERR_RECOVERABLE,
> -       AMDGPU_RAS_EEPROM_ERR_NON_RECOVERABLE
> +       AMDGPU_RAS_EEPROM_ERR_NON_RECOVERABLE,
> +       AMDGPU_RAS_EEPROM_ERR_COUNT,
>  };
>
>  struct amdgpu_ras_eeprom_table_header {
> @@ -121,4 +122,9 @@ int amdgpu_ras_eeprom_append(struct amdgpu_ras_eeprom_control *control,
>
>  inline uint32_t amdgpu_ras_eeprom_max_record_count(void);
>
> +void amdgpu_ras_debugfs_set_ret_size(struct amdgpu_ras_eeprom_control *control);
> +
> +extern const struct file_operations amdgpu_ras_debugfs_eeprom_size_ops;
> +extern const struct file_operations amdgpu_ras_debugfs_eeprom_table_ops;
> +
>  #endif // _AMDGPU_RAS_EEPROM_H
> --
> 2.32.0
>

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 20/40] drm/amdgpu: EEPROM respects I2C quirks
  2021-06-11 17:01   ` Alex Deucher
@ 2021-06-11 17:17     ` Luben Tuikov
  2021-06-11 17:37       ` Luben Tuikov
  0 siblings, 1 reply; 74+ messages in thread
From: Luben Tuikov @ 2021-06-11 17:17 UTC (permalink / raw)
  To: Alex Deucher
  Cc: Andrey Grodzovsky, Lijo Lazar, amd-gfx list, Stanley Yang,
	Alexander Deucher, Jean Delvare, Hawking Zhang

On 2021-06-11 1:01 p.m., Alex Deucher wrote:
> On Tue, Jun 8, 2021 at 5:40 PM Luben Tuikov <luben.tuikov@amd.com> wrote:
>> Consult the i2c_adapter.quirks table for
>> the maximum read/write data length per bus
>> transaction. Do not exceed this transaction
>> limit.
>>
>> Cc: Jean Delvare <jdelvare@suse.de>
>> Cc: Alexander Deucher <Alexander.Deucher@amd.com>
>> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
>> Cc: Lijo Lazar <Lijo.Lazar@amd.com>
>> Cc: Stanley Yang <Stanley.Yang@amd.com>
>> Cc: Hawking Zhang <Hawking.Zhang@amd.com>
>> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
>> ---
>>  drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.c | 80 +++++++++++++++++-----
>>  1 file changed, 64 insertions(+), 16 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.c
>> index 7fdb5bd2fc8bc8..94aeda1c7f8ca0 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.c
>> @@ -32,20 +32,9 @@
>>
>>  #define EEPROM_OFFSET_SIZE 2
>>
>> -/**
>> - * amdgpu_eeprom_xfer -- Read/write from/to an I2C EEPROM device
>> - * @i2c_adap: pointer to the I2C adapter to use
>> - * @slave_addr: I2C address of the slave device
>> - * @eeprom_addr: EEPROM address from which to read/write
>> - * @eeprom_buf: pointer to data buffer to read into/write from
>> - * @buf_size: the size of @eeprom_buf
>> - * @read: True if reading from the EEPROM, false if writing
>> - *
>> - * Returns the number of bytes read/written; -errno on error.
>> - */
>> -int amdgpu_eeprom_xfer(struct i2c_adapter *i2c_adap,
>> -                      u16 slave_addr, u16 eeprom_addr,
>> -                      u8 *eeprom_buf, u16 buf_size, bool read)
>> +static int __amdgpu_eeprom_xfer(struct i2c_adapter *i2c_adap,
>> +                               u16 slave_addr, u16 eeprom_addr,
>> +                               u8 *eeprom_buf, u16 buf_size, bool read)
>>  {
>>         u8 eeprom_offset_buf[EEPROM_OFFSET_SIZE];
>>         struct i2c_msg msgs[] = {
>> @@ -65,8 +54,8 @@ int amdgpu_eeprom_xfer(struct i2c_adapter *i2c_adap,
>>         u16 len;
>>
>>         r = 0;
>> -       for (len = 0; buf_size > 0;
>> -            buf_size -= len, eeprom_addr += len, eeprom_buf += len) {
>> +       for ( ; buf_size > 0;
>> +             buf_size -= len, eeprom_addr += len, eeprom_buf += len) {
>>                 /* Set the EEPROM address we want to write to/read from.
>>                  */
>>                 msgs[0].buf[0] = (eeprom_addr >> 8) & 0xff;
>> @@ -120,3 +109,62 @@ int amdgpu_eeprom_xfer(struct i2c_adapter *i2c_adap,
>>
>>         return r < 0 ? r : eeprom_buf - p;
>>  }
>> +
>> +/**
>> + * amdgpu_eeprom_xfer -- Read/write from/to an I2C EEPROM device
>> + * @i2c_adap: pointer to the I2C adapter to use
>> + * @slave_addr: I2C address of the slave device
>> + * @eeprom_addr: EEPROM address from which to read/write
>> + * @eeprom_buf: pointer to data buffer to read into/write from
>> + * @buf_size: the size of @eeprom_buf
>> + * @read: True if reading from the EEPROM, false if writing
>> + *
>> + * Returns the number of bytes read/written; -errno on error.
>> + */
>> +int amdgpu_eeprom_xfer(struct i2c_adapter *i2c_adap,
>> +                      u16 slave_addr, u16 eeprom_addr,
>> +                      u8 *eeprom_buf, u16 buf_size, bool read)
>> +{
>> +       const struct i2c_adapter_quirks *quirks = i2c_adap->quirks;
>> +       u16 limit;
>> +
>> +       if (!quirks)
>> +               limit = 0;
>> +       else if (read)
>> +               limit = quirks->max_read_len;
>> +       else
>> +               limit = quirks->max_write_len;
>> +
>> +       if (limit == 0) {
>> +               return __amdgpu_eeprom_xfer(i2c_adap, slave_addr, eeprom_addr,
>> +                                           eeprom_buf, buf_size, read);
>> +       } else if (limit <= EEPROM_OFFSET_SIZE) {
>> +               dev_err_ratelimited(&i2c_adap->dev,
>> +                                   "maddr:0x%04X size:0x%02X:quirk max_%s_len must be > %d",
>> +                                   eeprom_addr, buf_size,
>> +                                   read ? "read" : "write", EEPROM_OFFSET_SIZE);
>> +               return -EINVAL;
> I presume we handle this case properly at higher levels (i.e., split
> up EEPROM updates into smaller transactions)?

Absolutely we do.
(We break it down twice: once per this limit, and again per page size and page boundary. It will always work. :-) )

But this is different: this means that the user has set a limit too small for us to even send the set-address phase, which sets the EEPROM memory address offset we want to read from or write to, and thus the chattiness.

I just noticed that the check is less-than-or-equal, which means the smallest limit the user can set that would work is 3. But 2 would also work; then all transfers would be 2 bytes long. Does it matter? I guess I can change this from LTE to LT, to mean that a minimum transfer of 2 is the smallest we support. I've changed it to LT. :-)
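
The splitting described above can be sketched outside the kernel as a small counting function. This is an illustration only, assuming the accounting in the quoted hunk, where the quirk limit covers every byte of a transaction, including the two address bytes. sketch_count_xfers and SKETCH_OFFSET_SIZE are hypothetical names, and the further per-page splitting done by the lower-level helper is not modelled. Read against the quoted hunk, a limit of exactly 2 leaves no room for data bytes, so this sketch keeps the less-than-or-equal check; that is a reading of the quoted code, not a statement about the final version of the patch.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_OFFSET_SIZE 2u /* mirrors EEPROM_OFFSET_SIZE in the quoted hunk */

/* Count the transactions needed to move buf_size data bytes when each
 * transaction may carry at most "limit" bytes in total, address bytes
 * included. A limit of 0 means "no limit", as in the quoted code, and
 * is counted as a single call into the lower-level helper.
 * Returns -1 when the limit leaves no room for any data byte.
 */
static int sketch_count_xfers(uint16_t limit, uint16_t buf_size)
{
	uint16_t payload;
	int n = 0;

	if (buf_size == 0)
		return 0;
	if (limit == 0)
		return 1;
	if (limit <= SKETCH_OFFSET_SIZE)
		return -1;

	/* Data bytes available per transaction after the address bytes. */
	payload = limit - SKETCH_OFFSET_SIZE;
	assert(payload > 0);

	while (buf_size > 0) {
		uint16_t ps = buf_size < payload ? buf_size : payload;

		buf_size -= ps;
		n++;
	}

	return n;
}
```

With a limit of 8, for example, each transaction carries 6 data bytes, so a 24-byte record takes 4 transactions and a 25-byte buffer takes 5. With a limit of 3, every transaction carries a single data byte.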

Regards,
Luben

>
> Alex
>
>
>> +       } else {
>> +               u16 ps; /* Partial size */
>> +               int res = 0, r;
>> +
>> +               /* The "limit" includes all data bytes sent/received,
>> +                * which would include the EEPROM_OFFSET_SIZE bytes.
>> +                * Account for them here.
>> +                */
>> +               limit -= EEPROM_OFFSET_SIZE;
>> +               for ( ; buf_size > 0;
>> +                     buf_size -= ps, eeprom_addr += ps, eeprom_buf += ps) {
>> +                       ps = min(limit, buf_size);
>> +
>> +                       r = __amdgpu_eeprom_xfer(i2c_adap,
>> +                                                slave_addr, eeprom_addr,
>> +                                                eeprom_buf, ps, read);
>> +                       if (r < 0)
>> +                               return r;
>> +                       res += r;
>> +               }
>> +
>> +               return res;
>> +       }
>> +}
>> --
>> 2.32.0
>>

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 38/40] drm/amdgpu: RAS EEPROM table is now in debugfs
  2021-06-11 17:16   ` Alex Deucher
@ 2021-06-11 17:30     ` Luben Tuikov
  2021-06-11 17:51       ` Alex Deucher
  0 siblings, 1 reply; 74+ messages in thread
From: Luben Tuikov @ 2021-06-11 17:30 UTC (permalink / raw)
  To: Alex Deucher
  Cc: Andrey Grodzovsky, Xinhui Pan, amd-gfx list, Alexander Deucher,
	John Clements, Hawking Zhang

On 2021-06-11 1:16 p.m., Alex Deucher wrote:
> On Tue, Jun 8, 2021 at 5:41 PM Luben Tuikov <luben.tuikov@amd.com> wrote:
>> Add "ras_eeprom_size" file in debugfs, which
>> reports the maximum size allocated to the RAS
>> table in EEPROM, as the number of bytes and the
>> number of records it could store. For instance,
>>
>> $cat /sys/kernel/debug/dri/0/ras/ras_eeprom_size
>> 262144 bytes or 10921 records
>> $_
>>
>> Add "ras_eeprom_table" file in debugfs, which
>> dumps the RAS table stored in EEPROM, in a formatted
>> way. For instance,
>>
>> $cat ras_eeprom_table
>>  Signature    Version  FirstOffs       Size   Checksum
>> 0x414D4452 0x00010000 0x00000014 0x000000EC 0x000000DA
>> Index  Offset ErrType Bank/CU          TimeStamp      Offs/Addr MemChl MCUMCID    RetiredPage
>>     0 0x00014      ue    0x00 0x00000000607608DC 0x000000000000   0x00    0x00 0x000000000000
>>     1 0x0002C      ue    0x00 0x00000000607608DC 0x000000001000   0x00    0x00 0x000000000001
>>     2 0x00044      ue    0x00 0x00000000607608DC 0x000000002000   0x00    0x00 0x000000000002
>>     3 0x0005C      ue    0x00 0x00000000607608DC 0x000000003000   0x00    0x00 0x000000000003
>>     4 0x00074      ue    0x00 0x00000000607608DC 0x000000004000   0x00    0x00 0x000000000004
>>     5 0x0008C      ue    0x00 0x00000000607608DC 0x000000005000   0x00    0x00 0x000000000005
>>     6 0x000A4      ue    0x00 0x00000000607608DC 0x000000006000   0x00    0x00 0x000000000006
>>     7 0x000BC      ue    0x00 0x00000000607608DC 0x000000007000   0x00    0x00 0x000000000007
>>     8 0x000D4      ue    0x00 0x00000000607608DD 0x000000008000   0x00    0x00 0x000000000008
>> $_
>>
>> Cc: Alexander Deucher <Alexander.Deucher@amd.com>
>> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
>> Cc: John Clements <john.clements@amd.com>
>> Cc: Hawking Zhang <Hawking.Zhang@amd.com>
>> Cc: Xinhui Pan <xinhui.pan@amd.com>
>> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
> Seems like a useful feature.  Just a few comments below.
>
> Alex
>
>
>> ---
>>  drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c       |  12 +-
>>  drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h       |   1 +
>>  .../gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c    | 241 +++++++++++++++++-
>>  .../gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h    |  10 +-
>>  4 files changed, 252 insertions(+), 12 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
>> index 1424f2cc2076c1..d791a360a92366 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
>> @@ -404,9 +404,9 @@ static ssize_t amdgpu_ras_debugfs_ctrl_write(struct file *f,
>>                 /* umc ce/ue error injection for a bad page is not allowed */
>>                 if ((data.head.block == AMDGPU_RAS_BLOCK__UMC) &&
>>                     amdgpu_ras_check_bad_page(adev, data.inject.address)) {
>> -                       dev_warn(adev->dev, "RAS WARN: 0x%llx has been marked "
>> -                                       "as bad before error injection!\n",
>> -                                       data.inject.address);
>> +                       dev_warn(adev->dev, "RAS WARN: inject: 0x%llx has "
>> +                                "already been marked as bad!\n",
>> +                                data.inject.address);
> This seems unrelated to this patch.

It's just a cosmetic fix to correct the alignment, as the previous alignment seemed arbitrary.
Just pressing TAB in Emacs does wonders. :-)

I was in this file and decided to fix this. It's just cosmetic. No functional change.

>
>>                         break;
>>                 }
>>
>> @@ -1301,6 +1301,12 @@ static struct dentry *amdgpu_ras_debugfs_create_ctrl_node(struct amdgpu_device *
>>                            &con->bad_page_cnt_threshold);
>>         debugfs_create_x32("ras_hw_enabled", 0444, dir, &adev->ras_hw_enabled);
>>         debugfs_create_x32("ras_enabled", 0444, dir, &adev->ras_enabled);
>> +       debugfs_create_file("ras_eeprom_size", S_IRUGO, dir, adev,
>> +                           &amdgpu_ras_debugfs_eeprom_size_ops);
>> +       con->de_ras_eeprom_table = debugfs_create_file("ras_eeprom_table",
>> +                                                      S_IRUGO, dir, adev,
>> +                                                      &amdgpu_ras_debugfs_eeprom_table_ops);
>> +       amdgpu_ras_debugfs_set_ret_size(&con->eeprom_control);
>>
>>         /*
>>          * After one uncorrectable error happens, usually GPU recovery will
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h
>> index 256cea5d34f2b6..283afd791db107 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h
>> @@ -318,6 +318,7 @@ struct amdgpu_ras {
>>         /* sysfs */
>>         struct device_attribute features_attr;
>>         struct bin_attribute badpages_attr;
>> +       struct dentry *de_ras_eeprom_table;
>>         /* block array */
>>         struct ras_manager *objs;
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
>> index dc4a845a32404c..677e379f5fb5e9 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
>> @@ -27,6 +27,8 @@
>>  #include <linux/bits.h>
>>  #include "atom.h"
>>  #include "amdgpu_eeprom.h"
>> +#include <linux/debugfs.h>
>> +#include <linux/uaccess.h>
>>
>>  #define EEPROM_I2C_MADDR_VEGA20         0x0
>>  #define EEPROM_I2C_MADDR_ARCTURUS       0x40000
>> @@ -70,6 +72,13 @@
>>  #define RAS_OFFSET_TO_INDEX(_C, _O) (((_O) - \
>>                                       (_C)->ras_record_offset) / RAS_TABLE_RECORD_SIZE)
>>
>> +/* Given a 0-based relative record index, 0, 1, 2, ..., etc., off
>> + * of "fri", return the absolute record index off of the end of
>> + * the table header.
>> + */
>> +#define RAS_RI_TO_AI(_C, _I) (((_I) + (_C)->ras_fri) % \
>> +                             (_C)->ras_max_record_count)
>> +
>>  #define RAS_NUM_RECS(_tbl_hdr)  (((_tbl_hdr)->tbl_size - \
>>                                   RAS_TABLE_HEADER_SIZE) / RAS_TABLE_RECORD_SIZE)
>>
>> @@ -77,13 +86,10 @@
>>
>>  static bool __is_ras_eeprom_supported(struct amdgpu_device *adev)
>>  {
>> -       if ((adev->asic_type == CHIP_VEGA20) ||
>> -           (adev->asic_type == CHIP_ARCTURUS) ||
>> -           (adev->asic_type == CHIP_SIENNA_CICHLID) ||
>> -           (adev->asic_type == CHIP_ALDEBARAN))
>> -               return true;
>> -
>> -       return false;
>> +       return  adev->asic_type == CHIP_VEGA20 ||
>> +               adev->asic_type == CHIP_ARCTURUS ||
>> +               adev->asic_type == CHIP_SIENNA_CICHLID ||
>> +               adev->asic_type == CHIP_ALDEBARAN;
> Unrelated whitespace change.

It's more readable and succinct like this, no?

Do you want me to revert these? I mean, they're pleasing to have and change no functionality, and since I was in this file...

Regards,
Luben

>
>>  }
>>
>>  static bool __get_eeprom_i2c_addr_arct(struct amdgpu_device *adev,
>> @@ -258,6 +264,8 @@ int amdgpu_ras_eeprom_reset_table(struct amdgpu_ras_eeprom_control *control)
>>         control->ras_num_recs = 0;
>>         control->ras_fri = 0;
>>
>> +       amdgpu_ras_debugfs_set_ret_size(control);
>> +
>>         mutex_unlock(&control->ras_tbl_mutex);
>>
>>         return res;
>> @@ -591,6 +599,8 @@ int amdgpu_ras_eeprom_append(struct amdgpu_ras_eeprom_control *control,
>>         res = amdgpu_ras_eeprom_append_table(control, record, num);
>>         if (!res)
>>                 res = amdgpu_ras_eeprom_update_header(control);
>> +       if (!res)
>> +               amdgpu_ras_debugfs_set_ret_size(control);
>>
>>         mutex_unlock(&control->ras_tbl_mutex);
>>         return res;
>> @@ -734,6 +744,223 @@ inline uint32_t amdgpu_ras_eeprom_max_record_count(void)
>>         return RAS_MAX_RECORD_COUNT;
>>  }
>>
>> +static ssize_t
>> +amdgpu_ras_debugfs_eeprom_size_read(struct file *f, char __user *buf,
>> +                                   size_t size, loff_t *pos)
>> +{
>> +       struct amdgpu_device *adev = (struct amdgpu_device *)file_inode(f)->i_private;
>> +       struct amdgpu_ras *ras = amdgpu_ras_get_context(adev);
>> +       struct amdgpu_ras_eeprom_control *control = ras ? &ras->eeprom_control : NULL;
>> +       u8 data[50];
>> +       int res;
>> +
>> +       if (!size)
>> +               return size;
>> +
>> +       if (!ras || !control) {
>> +               res = snprintf(data, sizeof(data), "Not supported\n");
>> +       } else {
>> +               res = snprintf(data, sizeof(data), "%d bytes or %d records\n",
>> +                              RAS_TBL_SIZE_BYTES, control->ras_max_record_count);
>> +       }
>> +
>> +       if (*pos >= res)
>> +               return 0;
>> +
>> +       res -= *pos;
>> +       res = min_t(size_t, res, size);
>> +
>> +       if (copy_to_user(buf, &data[*pos], res))
>> +               return -EINVAL;
>> +
>> +       *pos += res;
>> +
>> +       return res;
>> +}
>> +
>> +const struct file_operations amdgpu_ras_debugfs_eeprom_size_ops = {
>> +       .owner = THIS_MODULE,
>> +       .read = amdgpu_ras_debugfs_eeprom_size_read,
>> +       .write = NULL,
>> +       .llseek = default_llseek,
>> +};
>> +
>> +static const char *tbl_hdr_str = " Signature    Version  FirstOffs       Size   Checksum\n";
>> +static const char *tbl_hdr_fmt = "0x%08X 0x%08X 0x%08X 0x%08X 0x%08X\n";
>> +#define tbl_hdr_fmt_size (5 * (2+8) + 4 + 1)
>> +static const char *rec_hdr_str = "Index  Offset ErrType Bank/CU          TimeStamp      Offs/Addr MemChl MCUMCID    RetiredPage\n";
>> +static const char *rec_hdr_fmt = "%5d 0x%05X %7s    0x%02X 0x%016llX 0x%012llX   0x%02X    0x%02X 0x%012llX\n";
>> +#define rec_hdr_fmt_size (5 + 1 + 7 + 1 + 7 + 1 + 7 + 1 + 18 + 1 + 14 + 1 + 6 + 1 + 7 + 1 + 14 + 1)
>> +
>> +static const char *record_err_type_str[AMDGPU_RAS_EEPROM_ERR_COUNT] = {
>> +       "ignore",
>> +       "re",
>> +       "ue",
>> +};
>> +
>> +static loff_t amdgpu_ras_debugfs_table_size(struct amdgpu_ras_eeprom_control *control)
>> +{
>> +       return strlen(tbl_hdr_str) + tbl_hdr_fmt_size +
>> +               strlen(rec_hdr_str) + rec_hdr_fmt_size * control->ras_num_recs;
>> +}
>> +
>> +void amdgpu_ras_debugfs_set_ret_size(struct amdgpu_ras_eeprom_control *control)
>> +{
>> +       struct amdgpu_ras *ras = container_of(control, struct amdgpu_ras,
>> +                                             eeprom_control);
>> +       struct dentry *de = ras->de_ras_eeprom_table;
>> +
>> +       if (de)
>> +               d_inode(de)->i_size = amdgpu_ras_debugfs_table_size(control);
>> +}
>> +
>> +static ssize_t amdgpu_ras_debugfs_table_read(struct file *f, char __user *buf,
>> +                                            size_t size, loff_t *pos)
>> +{
>> +       struct amdgpu_device *adev = (struct amdgpu_device *)file_inode(f)->i_private;
>> +       struct amdgpu_ras *ras = amdgpu_ras_get_context(adev);
>> +       struct amdgpu_ras_eeprom_control *control = &ras->eeprom_control;
>> +       const size_t orig_size = size;
>> +       int res = -EINVAL;
>> +       size_t data_len;
>> +
>> +       mutex_lock(&control->ras_tbl_mutex);
>> +
>> +       /* We want data_len - *pos > 0, which means there are
>> +        * bytes to be printed from data.
>> +        */
>> +       data_len = strlen(tbl_hdr_str);
>> +       if (*pos < data_len) {
>> +               data_len -= *pos;
>> +               data_len = min_t(size_t, data_len, size);
>> +               if (copy_to_user(buf, &tbl_hdr_str[*pos], data_len))
>> +                       goto Out;
>> +               buf += data_len;
>> +               size -= data_len;
>> +               *pos += data_len;
>> +       }
>> +
>> +       data_len = strlen(tbl_hdr_str) + tbl_hdr_fmt_size;
>> +       if (*pos < data_len && size > 0) {
>> +               u8 data[tbl_hdr_fmt_size + 1];
>> +               loff_t lpos;
>> +
>> +               snprintf(data, sizeof(data), tbl_hdr_fmt,
>> +                        control->tbl_hdr.header,
>> +                        control->tbl_hdr.version,
>> +                        control->tbl_hdr.first_rec_offset,
>> +                        control->tbl_hdr.tbl_size,
>> +                        control->tbl_hdr.checksum);
>> +
>> +               data_len -= *pos;
>> +               data_len = min_t(size_t, data_len, size);
>> +               lpos = *pos - strlen(tbl_hdr_str);
>> +               if (copy_to_user(buf, &data[lpos], data_len))
>> +                       goto Out;
>> +               buf += data_len;
>> +               size -= data_len;
>> +               *pos += data_len;
>> +       }
>> +
>> +       data_len = strlen(tbl_hdr_str) + tbl_hdr_fmt_size + strlen(rec_hdr_str);
>> +       if (*pos < data_len && size > 0) {
>> +               loff_t lpos;
>> +
>> +               data_len -= *pos;
>> +               data_len = min_t(size_t, data_len, size);
>> +               lpos = *pos - strlen(tbl_hdr_str) - tbl_hdr_fmt_size;
>> +               if (copy_to_user(buf, &rec_hdr_str[lpos], data_len))
>> +                       goto Out;
>> +               buf += data_len;
>> +               size -= data_len;
>> +               *pos += data_len;
>> +       }
>> +
>> +       data_len = amdgpu_ras_debugfs_table_size(control);
>> +       if (*pos < data_len && size > 0) {
>> +               u8 dare[RAS_TABLE_RECORD_SIZE];
>> +               u8 data[rec_hdr_fmt_size + 1];
>> +               /* Find the starting record index
>> +                */
>> +               int s = (*pos - strlen(tbl_hdr_str) - tbl_hdr_fmt_size -
>> +                        strlen(rec_hdr_str)) / rec_hdr_fmt_size;
>> +               int r = (*pos - strlen(tbl_hdr_str) - tbl_hdr_fmt_size -
>> +                        strlen(rec_hdr_str)) % rec_hdr_fmt_size;
>> +               struct eeprom_table_record record;
>> +
>> +               for ( ; size > 0 && s < control->ras_num_recs; s++) {
>> +                       u32 ai = RAS_RI_TO_AI(control, s);
>> +                       /* Read a single record
>> +                        */
>> +                       res = __amdgpu_ras_eeprom_read(control, dare, ai, 1);
>> +                       if (res)
>> +                               goto Out;
>> +                       __decode_table_record_from_buf(control, &record, dare);
>> +                       snprintf(data, sizeof(data), rec_hdr_fmt,
>> +                                s,
>> +                                RAS_INDEX_TO_OFFSET(control, ai),
>> +                                record_err_type_str[record.err_type],
>> +                                record.bank,
>> +                                record.ts,
>> +                                record.offset,
>> +                                record.mem_channel,
>> +                                record.mcumc_id,
>> +                                record.retired_page);
>> +
>> +                       data_len = min_t(size_t, rec_hdr_fmt_size - r, size);
>> +                       if (copy_to_user(buf, &data[r], data_len))
>> +                               return -EINVAL;
>> +                       buf += data_len;
>> +                       size -= data_len;
>> +                       *pos += data_len;
>> +                       r = 0;
>> +               }
>> +       }
>> +       res = 0;
>> +Out:
>> +       mutex_unlock(&control->ras_tbl_mutex);
>> +       return res < 0 ? res : orig_size - size;
>> +}
>> +
>> +static ssize_t
>> +amdgpu_ras_debugfs_eeprom_table_read(struct file *f, char __user *buf,
>> +                                    size_t size, loff_t *pos)
>> +{
>> +       struct amdgpu_device *adev = (struct amdgpu_device *)file_inode(f)->i_private;
>> +       struct amdgpu_ras *ras = amdgpu_ras_get_context(adev);
>> +       struct amdgpu_ras_eeprom_control *control = ras ? &ras->eeprom_control : NULL;
>> +       u8 data[81];
>> +       int res;
>> +
>> +       if (!size)
>> +               return size;
>> +
>> +       if (!ras || !control) {
>> +               res = snprintf(data, sizeof(data), "Not supported\n");
>> +               if (*pos >= res)
>> +                       return 0;
>> +
>> +               res -= *pos;
>> +               res = min_t(size_t, res, size);
>> +
>> +               if (copy_to_user(buf, &data[*pos], res))
>> +                       return -EINVAL;
>> +
>> +               *pos += res;
>> +
>> +               return res;
>> +       } else {
>> +               return amdgpu_ras_debugfs_table_read(f, buf, size, pos);
>> +       }
>> +}
>> +
>> +const struct file_operations amdgpu_ras_debugfs_eeprom_table_ops = {
>> +       .owner = THIS_MODULE,
>> +       .read = amdgpu_ras_debugfs_eeprom_table_read,
>> +       .write = NULL,
>> +       .llseek = default_llseek,
>> +};
>> +
>>  /**
>>   * __verify_ras_table_checksum -- verify the RAS EEPROM table checksum
>>   * @control: pointer to control structure
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h
>> index edb0195ea2eb8c..430e08ab3313a2 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h
>> @@ -29,9 +29,10 @@
>>  struct amdgpu_device;
>>
>>  enum amdgpu_ras_eeprom_err_type {
>> -       AMDGPU_RAS_EEPROM_ERR_PLACE_HOLDER,
>> +       AMDGPU_RAS_EEPROM_ERR_NA,
>>         AMDGPU_RAS_EEPROM_ERR_RECOVERABLE,
>> -       AMDGPU_RAS_EEPROM_ERR_NON_RECOVERABLE
>> +       AMDGPU_RAS_EEPROM_ERR_NON_RECOVERABLE,
>> +       AMDGPU_RAS_EEPROM_ERR_COUNT,
>>  };
>>
>>  struct amdgpu_ras_eeprom_table_header {
>> @@ -121,4 +122,9 @@ int amdgpu_ras_eeprom_append(struct amdgpu_ras_eeprom_control *control,
>>
>>  inline uint32_t amdgpu_ras_eeprom_max_record_count(void);
>>
>> +void amdgpu_ras_debugfs_set_ret_size(struct amdgpu_ras_eeprom_control *control);
>> +
>> +extern const struct file_operations amdgpu_ras_debugfs_eeprom_size_ops;
>> +extern const struct file_operations amdgpu_ras_debugfs_eeprom_table_ops;
>> +
>>  #endif // _AMDGPU_RAS_EEPROM_H
>> --
>> 2.32.0
>>

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 20/40] drm/amdgpu: EEPROM respects I2C quirks
  2021-06-11 17:17     ` Luben Tuikov
@ 2021-06-11 17:37       ` Luben Tuikov
  0 siblings, 0 replies; 74+ messages in thread
From: Luben Tuikov @ 2021-06-11 17:37 UTC (permalink / raw)
  To: Alex Deucher
  Cc: Andrey Grodzovsky, Lijo Lazar, amd-gfx list, Stanley Yang,
	Alexander Deucher, Jean Delvare, Hawking Zhang

On 2021-06-11 1:17 p.m., Luben Tuikov wrote:
> On 2021-06-11 1:01 p.m., Alex Deucher wrote:
>> On Tue, Jun 8, 2021 at 5:40 PM Luben Tuikov <luben.tuikov@amd.com> wrote:
>>> Consult the i2c_adapter.quirks table for
>>> the maximum read/write data length per bus
>>> transaction. Do not exceed this transaction
>>> limit.
>>>
>>> Cc: Jean Delvare <jdelvare@suse.de>
>>> Cc: Alexander Deucher <Alexander.Deucher@amd.com>
>>> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
>>> Cc: Lijo Lazar <Lijo.Lazar@amd.com>
>>> Cc: Stanley Yang <Stanley.Yang@amd.com>
>>> Cc: Hawking Zhang <Hawking.Zhang@amd.com>
>>> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
>>> ---
>>>  drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.c | 80 +++++++++++++++++-----
>>>  1 file changed, 64 insertions(+), 16 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.c
>>> index 7fdb5bd2fc8bc8..94aeda1c7f8ca0 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.c
>>> @@ -32,20 +32,9 @@
>>>
>>>  #define EEPROM_OFFSET_SIZE 2
>>>
>>> -/**
>>> - * amdgpu_eeprom_xfer -- Read/write from/to an I2C EEPROM device
>>> - * @i2c_adap: pointer to the I2C adapter to use
>>> - * @slave_addr: I2C address of the slave device
>>> - * @eeprom_addr: EEPROM address from which to read/write
>>> - * @eeprom_buf: pointer to data buffer to read into/write from
>>> - * @buf_size: the size of @eeprom_buf
>>> - * @read: True if reading from the EEPROM, false if writing
>>> - *
>>> - * Returns the number of bytes read/written; -errno on error.
>>> - */
>>> -int amdgpu_eeprom_xfer(struct i2c_adapter *i2c_adap,
>>> -                      u16 slave_addr, u16 eeprom_addr,
>>> -                      u8 *eeprom_buf, u16 buf_size, bool read)
>>> +static int __amdgpu_eeprom_xfer(struct i2c_adapter *i2c_adap,
>>> +                               u16 slave_addr, u16 eeprom_addr,
>>> +                               u8 *eeprom_buf, u16 buf_size, bool read)
>>>  {
>>>         u8 eeprom_offset_buf[EEPROM_OFFSET_SIZE];
>>>         struct i2c_msg msgs[] = {
>>> @@ -65,8 +54,8 @@ int amdgpu_eeprom_xfer(struct i2c_adapter *i2c_adap,
>>>         u16 len;
>>>
>>>         r = 0;
>>> -       for (len = 0; buf_size > 0;
>>> -            buf_size -= len, eeprom_addr += len, eeprom_buf += len) {
>>> +       for ( ; buf_size > 0;
>>> +             buf_size -= len, eeprom_addr += len, eeprom_buf += len) {
>>>                 /* Set the EEPROM address we want to write to/read from.
>>>                  */
>>>                 msgs[0].buf[0] = (eeprom_addr >> 8) & 0xff;
>>> @@ -120,3 +109,62 @@ int amdgpu_eeprom_xfer(struct i2c_adapter *i2c_adap,
>>>
>>>         return r < 0 ? r : eeprom_buf - p;
>>>  }
>>> +
>>> +/**
>>> + * amdgpu_eeprom_xfer -- Read/write from/to an I2C EEPROM device
>>> + * @i2c_adap: pointer to the I2C adapter to use
>>> + * @slave_addr: I2C address of the slave device
>>> + * @eeprom_addr: EEPROM address from which to read/write
>>> + * @eeprom_buf: pointer to data buffer to read into/write from
>>> + * @buf_size: the size of @eeprom_buf
>>> + * @read: True if reading from the EEPROM, false if writing
>>> + *
>>> + * Returns the number of bytes read/written; -errno on error.
>>> + */
>>> +int amdgpu_eeprom_xfer(struct i2c_adapter *i2c_adap,
>>> +                      u16 slave_addr, u16 eeprom_addr,
>>> +                      u8 *eeprom_buf, u16 buf_size, bool read)
>>> +{
>>> +       const struct i2c_adapter_quirks *quirks = i2c_adap->quirks;
>>> +       u16 limit;
>>> +
>>> +       if (!quirks)
>>> +               limit = 0;
>>> +       else if (read)
>>> +               limit = quirks->max_read_len;
>>> +       else
>>> +               limit = quirks->max_write_len;
>>> +
>>> +       if (limit == 0) {
>>> +               return __amdgpu_eeprom_xfer(i2c_adap, slave_addr, eeprom_addr,
>>> +                                           eeprom_buf, buf_size, read);
>>> +       } else if (limit <= EEPROM_OFFSET_SIZE) {
>>> +               dev_err_ratelimited(&i2c_adap->dev,
>>> +                                   "maddr:0x%04X size:0x%02X:quirk max_%s_len must be > %d",
>>> +                                   eeprom_addr, buf_size,
>>> +                                   read ? "read" : "write", EEPROM_OFFSET_SIZE);
>>> +               return -EINVAL;
>> I presume we handle this case properly at higher levels (i.e., split
>> up EEPROM updates into smaller transactions)?
> Absolutely we do.
> (We break it down twice: once per this limit and again per page size and page boundary. It'll work always. :-) )
>
> But this is different--this means that the user has set a limit less than 2, which means we can't even send a set-address phase to set the EEPROM memory address offset we want to read or write from, and thus the chattiness.
>
> I just noticed that it is less-than-or-equal, which means the smallest limit the user can set which would work is 3. But 2 would also work, then all transfers would be 2 bytes long. Does it matter? I guess I can change this from LTE to LT, to mean that a minimum transfer of 2 is the smallest we support. I've changed it to LT. :-)

Ooops, no!
It was correct the way I had it.
It has to be LTE due to the comment below, else the min(0, u16) is 0 and we'll not send anything. :-)
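
To make the arithmetic concrete, here is a minimal standalone sketch (not the driver code). The helper name eeprom_xfer_count() is hypothetical; it only counts how many sub-transfers the quirk-limited loop in amdgpu_eeprom_xfer() would issue, and it assumes the unlimited path (limit == 0) counts as a single call, ignoring the page splitting done further down.

```c
#include <assert.h>
#include <stdint.h>

#define EEPROM_OFFSET_SIZE 2	/* address bytes sent ahead of the data */

/* Hypothetical helper, for illustration only: return the number of
 * sub-transfers needed to move buf_size data bytes under a quirk
 * limit, or -1 if the limit leaves no room for any data byte.
 * limit == 0 means "no limit", as in the patch.
 */
static int eeprom_xfer_count(uint16_t limit, uint16_t buf_size)
{
	uint16_t ps;	/* partial size, named as in the patch */
	int n = 0;

	if (limit == 0)
		return buf_size ? 1 : 0;	/* assumption: one unrestricted call */
	if (limit <= EEPROM_OFFSET_SIZE)
		return -1;			/* the LTE check under discussion */

	/* The limit covers the address bytes too, so subtract them. */
	limit -= EEPROM_OFFSET_SIZE;
	assert(limit > 0);	/* guaranteed by the LTE check above */

	for ( ; buf_size > 0; buf_size -= ps) {
		ps = limit < buf_size ? limit : buf_size;
		n++;
	}
	return n;
}
```

With a limit of 2, the payload per transaction would be 2 - 2 = 0, so ps = min(0, buf_size) is 0 and the loop would never advance. Rejecting limit <= EEPROM_OFFSET_SIZE rules that out, which makes 3 the smallest usable limit (one data byte per transaction).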

Regards,
Luben

>
> Regards,
> Luben
>
>> Alex
>>
>>
>>> +       } else {
>>> +               u16 ps; /* Partial size */
>>> +               int res = 0, r;
>>> +
>>> +               /* The "limit" includes all data bytes sent/received,
>>> +                * which would include the EEPROM_OFFSET_SIZE bytes.
>>> +                * Account for them here.
>>> +                */
>>> +               limit -= EEPROM_OFFSET_SIZE;
>>> +               for ( ; buf_size > 0;
>>> +                     buf_size -= ps, eeprom_addr += ps, eeprom_buf += ps) {
>>> +                       ps = min(limit, buf_size);
>>> +
>>> +                       r = __amdgpu_eeprom_xfer(i2c_adap,
>>> +                                                slave_addr, eeprom_addr,
>>> +                                                eeprom_buf, ps, read);
>>> +                       if (r < 0)
>>> +                               return r;
>>> +                       res += r;
>>> +               }
>>> +
>>> +               return res;
>>> +       }
>>> +}
>>> --
>>> 2.32.0
>>>

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 38/40] drm/amdgpu: RAS EEPROM table is now in debugfs
  2021-06-11 17:30     ` Luben Tuikov
@ 2021-06-11 17:51       ` Alex Deucher
  2021-06-11 18:06         ` Luben Tuikov
  0 siblings, 1 reply; 74+ messages in thread
From: Alex Deucher @ 2021-06-11 17:51 UTC (permalink / raw)
  To: Luben Tuikov
  Cc: Andrey Grodzovsky, Xinhui Pan, amd-gfx list, Alexander Deucher,
	John Clements, Hawking Zhang

On Fri, Jun 11, 2021 at 1:30 PM Luben Tuikov <luben.tuikov@amd.com> wrote:
>
> On 2021-06-11 1:16 p.m., Alex Deucher wrote:
> > On Tue, Jun 8, 2021 at 5:41 PM Luben Tuikov <luben.tuikov@amd.com> wrote:
> >> Add "ras_eeprom_size" file in debugfs, which
> >> reports the maximum size allocated to the RAS
> >> table in EEPROM, as the number of bytes and the
> >> number of records it could store. For instance,
> >>
> >> $cat /sys/kernel/debug/dri/0/ras/ras_eeprom_size
> >> 262144 bytes or 10921 records
> >> $_
> >>
> >> Add "ras_eeprom_table" file in debugfs, which
> >> dumps the RAS table stored in EEPROM, in a
> >> formatted way. For instance,
> >>
> >> $cat ras_eeprom_table
> >>  Signature    Version  FirstOffs       Size   Checksum
> >> 0x414D4452 0x00010000 0x00000014 0x000000EC 0x000000DA
> >> Index  Offset ErrType Bank/CU          TimeStamp      Offs/Addr MemChl MCUMCID    RetiredPage
> >>     0 0x00014      ue    0x00 0x00000000607608DC 0x000000000000   0x00    0x00 0x000000000000
> >>     1 0x0002C      ue    0x00 0x00000000607608DC 0x000000001000   0x00    0x00 0x000000000001
> >>     2 0x00044      ue    0x00 0x00000000607608DC 0x000000002000   0x00    0x00 0x000000000002
> >>     3 0x0005C      ue    0x00 0x00000000607608DC 0x000000003000   0x00    0x00 0x000000000003
> >>     4 0x00074      ue    0x00 0x00000000607608DC 0x000000004000   0x00    0x00 0x000000000004
> >>     5 0x0008C      ue    0x00 0x00000000607608DC 0x000000005000   0x00    0x00 0x000000000005
> >>     6 0x000A4      ue    0x00 0x00000000607608DC 0x000000006000   0x00    0x00 0x000000000006
> >>     7 0x000BC      ue    0x00 0x00000000607608DC 0x000000007000   0x00    0x00 0x000000000007
> >>     8 0x000D4      ue    0x00 0x00000000607608DD 0x000000008000   0x00    0x00 0x000000000008
> >> $_
> >>
> >> Cc: Alexander Deucher <Alexander.Deucher@amd.com>
> >> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
> >> Cc: John Clements <john.clements@amd.com>
> >> Cc: Hawking Zhang <Hawking.Zhang@amd.com>
> >> Cc: Xinhui Pan <xinhui.pan@amd.com>
> >> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
> > Seems like a useful feature.  Just a few comments below.
> >
> > Alex
> >
> >
> >> ---
> >>  drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c       |  12 +-
> >>  drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h       |   1 +
> >>  .../gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c    | 241 +++++++++++++++++-
> >>  .../gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h    |  10 +-
> >>  4 files changed, 252 insertions(+), 12 deletions(-)
> >>
> >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> >> index 1424f2cc2076c1..d791a360a92366 100644
> >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> >> @@ -404,9 +404,9 @@ static ssize_t amdgpu_ras_debugfs_ctrl_write(struct file *f,
> >>                 /* umc ce/ue error injection for a bad page is not allowed */
> >>                 if ((data.head.block == AMDGPU_RAS_BLOCK__UMC) &&
> >>                     amdgpu_ras_check_bad_page(adev, data.inject.address)) {
> >> -                       dev_warn(adev->dev, "RAS WARN: 0x%llx has been marked "
> >> -                                       "as bad before error injection!\n",
> >> -                                       data.inject.address);
> >> +                       dev_warn(adev->dev, "RAS WARN: inject: 0x%llx has "
> >> +                                "already been marked as bad!\n",
> >> +                                data.inject.address);
> > This seems unrelated to this patch.
>
> It's just a cosmetic fix, to correctly align, as it seems that the previous alignment was arbitrary.
> Just pressing TAB in Emacs does wonders. :-)
>
> I was in this file and decided to fix this. It's just cosmetic. No functional change.
>
> >
> >>                         break;
> >>                 }
> >>
> >> @@ -1301,6 +1301,12 @@ static struct dentry *amdgpu_ras_debugfs_create_ctrl_node(struct amdgpu_device *
> >>                            &con->bad_page_cnt_threshold);
> >>         debugfs_create_x32("ras_hw_enabled", 0444, dir, &adev->ras_hw_enabled);
> >>         debugfs_create_x32("ras_enabled", 0444, dir, &adev->ras_enabled);
> >> +       debugfs_create_file("ras_eeprom_size", S_IRUGO, dir, adev,
> >> +                           &amdgpu_ras_debugfs_eeprom_size_ops);
> >> +       con->de_ras_eeprom_table = debugfs_create_file("ras_eeprom_table",
> >> +                                                      S_IRUGO, dir, adev,
> >> +                                                      &amdgpu_ras_debugfs_eeprom_table_ops);
> >> +       amdgpu_ras_debugfs_set_ret_size(&con->eeprom_control);
> >>
> >>         /*
> >>          * After one uncorrectable error happens, usually GPU recovery will
> >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h
> >> index 256cea5d34f2b6..283afd791db107 100644
> >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h
> >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h
> >> @@ -318,6 +318,7 @@ struct amdgpu_ras {
> >>         /* sysfs */
> >>         struct device_attribute features_attr;
> >>         struct bin_attribute badpages_attr;
> >> +       struct dentry *de_ras_eeprom_table;
> >>         /* block array */
> >>         struct ras_manager *objs;
> >>
> >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
> >> index dc4a845a32404c..677e379f5fb5e9 100644
> >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
> >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
> >> @@ -27,6 +27,8 @@
> >>  #include <linux/bits.h>
> >>  #include "atom.h"
> >>  #include "amdgpu_eeprom.h"
> >> +#include <linux/debugfs.h>
> >> +#include <linux/uaccess.h>
> >>
> >>  #define EEPROM_I2C_MADDR_VEGA20         0x0
> >>  #define EEPROM_I2C_MADDR_ARCTURUS       0x40000
> >> @@ -70,6 +72,13 @@
> >>  #define RAS_OFFSET_TO_INDEX(_C, _O) (((_O) - \
> >>                                       (_C)->ras_record_offset) / RAS_TABLE_RECORD_SIZE)
> >>
> >> +/* Given a 0-based relative record index, 0, 1, 2, ..., etc., off
> >> + * of "fri", return the absolute record index off of the end of
> >> + * the table header.
> >> + */
> >> +#define RAS_RI_TO_AI(_C, _I) (((_I) + (_C)->ras_fri) % \
> >> +                             (_C)->ras_max_record_count)
> >> +
> >>  #define RAS_NUM_RECS(_tbl_hdr)  (((_tbl_hdr)->tbl_size - \
> >>                                   RAS_TABLE_HEADER_SIZE) / RAS_TABLE_RECORD_SIZE)
> >>
> >> @@ -77,13 +86,10 @@
> >>
> >>  static bool __is_ras_eeprom_supported(struct amdgpu_device *adev)
> >>  {
> >> -       if ((adev->asic_type == CHIP_VEGA20) ||
> >> -           (adev->asic_type == CHIP_ARCTURUS) ||
> >> -           (adev->asic_type == CHIP_SIENNA_CICHLID) ||
> >> -           (adev->asic_type == CHIP_ALDEBARAN))
> >> -               return true;
> >> -
> >> -       return false;
> >> +       return  adev->asic_type == CHIP_VEGA20 ||
> >> +               adev->asic_type == CHIP_ARCTURUS ||
> >> +               adev->asic_type == CHIP_SIENNA_CICHLID ||
> >> +               adev->asic_type == CHIP_ALDEBARAN;
> > Unrelated whitespace change.
>
> It's more readable and succinct like this, no?
>
> Do you want me to revert these? I mean, they're pleasing to have and change no functionality, and since I was in this file...
>

Don't worry about respinning to break these out in this patch, but in
general it's better to keep formatting cleanups separate from
functional changes; makes it easier to review the functional changes.

Acked-by: Alex Deucher <alexander.deucher@amd.com>

Alex


> Regards,
> Luben
>
> >
> >>  }
> >>
> >>  static bool __get_eeprom_i2c_addr_arct(struct amdgpu_device *adev,
> >> @@ -258,6 +264,8 @@ int amdgpu_ras_eeprom_reset_table(struct amdgpu_ras_eeprom_control *control)
> >>         control->ras_num_recs = 0;
> >>         control->ras_fri = 0;
> >>
> >> +       amdgpu_ras_debugfs_set_ret_size(control);
> >> +
> >>         mutex_unlock(&control->ras_tbl_mutex);
> >>
> >>         return res;
> >> @@ -591,6 +599,8 @@ int amdgpu_ras_eeprom_append(struct amdgpu_ras_eeprom_control *control,
> >>         res = amdgpu_ras_eeprom_append_table(control, record, num);
> >>         if (!res)
> >>                 res = amdgpu_ras_eeprom_update_header(control);
> >> +       if (!res)
> >> +               amdgpu_ras_debugfs_set_ret_size(control);
> >>
> >>         mutex_unlock(&control->ras_tbl_mutex);
> >>         return res;
> >> @@ -734,6 +744,223 @@ inline uint32_t amdgpu_ras_eeprom_max_record_count(void)
> >>         return RAS_MAX_RECORD_COUNT;
> >>  }
> >>
> >> +static ssize_t
> >> +amdgpu_ras_debugfs_eeprom_size_read(struct file *f, char __user *buf,
> >> +                                   size_t size, loff_t *pos)
> >> +{
> >> +       struct amdgpu_device *adev = (struct amdgpu_device *)file_inode(f)->i_private;
> >> +       struct amdgpu_ras *ras = amdgpu_ras_get_context(adev);
> >> +       struct amdgpu_ras_eeprom_control *control = ras ? &ras->eeprom_control : NULL;
> >> +       u8 data[50];
> >> +       int res;
> >> +
> >> +       if (!size)
> >> +               return size;
> >> +
> >> +       if (!ras || !control) {
> >> +               res = snprintf(data, sizeof(data), "Not supported\n");
> >> +       } else {
> >> +               res = snprintf(data, sizeof(data), "%d bytes or %d records\n",
> >> +                              RAS_TBL_SIZE_BYTES, control->ras_max_record_count);
> >> +       }
> >> +
> >> +       if (*pos >= res)
> >> +               return 0;
> >> +
> >> +       res -= *pos;
> >> +       res = min_t(size_t, res, size);
> >> +
> >> +       if (copy_to_user(buf, &data[*pos], res))
> >> +               return -EINVAL;
> >> +
> >> +       *pos += res;
> >> +
> >> +       return res;
> >> +}
> >> +
> >> +const struct file_operations amdgpu_ras_debugfs_eeprom_size_ops = {
> >> +       .owner = THIS_MODULE,
> >> +       .read = amdgpu_ras_debugfs_eeprom_size_read,
> >> +       .write = NULL,
> >> +       .llseek = default_llseek,
> >> +};
> >> +
> >> +static const char *tbl_hdr_str = " Signature    Version  FirstOffs       Size   Checksum\n";
> >> +static const char *tbl_hdr_fmt = "0x%08X 0x%08X 0x%08X 0x%08X 0x%08X\n";
> >> +#define tbl_hdr_fmt_size (5 * (2+8) + 4 + 1)
> >> +static const char *rec_hdr_str = "Index  Offset ErrType Bank/CU          TimeStamp      Offs/Addr MemChl MCUMCID    RetiredPage\n";
> >> +static const char *rec_hdr_fmt = "%5d 0x%05X %7s    0x%02X 0x%016llX 0x%012llX   0x%02X    0x%02X 0x%012llX\n";
> >> +#define rec_hdr_fmt_size (5 + 1 + 7 + 1 + 7 + 1 + 7 + 1 + 18 + 1 + 14 + 1 + 6 + 1 + 7 + 1 + 14 + 1)
> >> +
> >> +static const char *record_err_type_str[AMDGPU_RAS_EEPROM_ERR_COUNT] = {
> >> +       "ignore",
> >> +       "re",
> >> +       "ue",
> >> +};
> >> +
> >> +static loff_t amdgpu_ras_debugfs_table_size(struct amdgpu_ras_eeprom_control *control)
> >> +{
> >> +       return strlen(tbl_hdr_str) + tbl_hdr_fmt_size +
> >> +               strlen(rec_hdr_str) + rec_hdr_fmt_size * control->ras_num_recs;
> >> +}
> >> +
> >> +void amdgpu_ras_debugfs_set_ret_size(struct amdgpu_ras_eeprom_control *control)
> >> +{
> >> +       struct amdgpu_ras *ras = container_of(control, struct amdgpu_ras,
> >> +                                             eeprom_control);
> >> +       struct dentry *de = ras->de_ras_eeprom_table;
> >> +
> >> +       if (de)
> >> +               d_inode(de)->i_size = amdgpu_ras_debugfs_table_size(control);
> >> +}
> >> +
> >> +static ssize_t amdgpu_ras_debugfs_table_read(struct file *f, char __user *buf,
> >> +                                            size_t size, loff_t *pos)
> >> +{
> >> +       struct amdgpu_device *adev = (struct amdgpu_device *)file_inode(f)->i_private;
> >> +       struct amdgpu_ras *ras = amdgpu_ras_get_context(adev);
> >> +       struct amdgpu_ras_eeprom_control *control = &ras->eeprom_control;
> >> +       const size_t orig_size = size;
> >> +       int res = -EINVAL;
> >> +       size_t data_len;
> >> +
> >> +       mutex_lock(&control->ras_tbl_mutex);
> >> +
> >> +       /* We want *pos - data_len > 0, which means there's
> >> +        * bytes to be printed from data.
> >> +        */
> >> +       data_len = strlen(tbl_hdr_str);
> >> +       if (*pos < data_len) {
> >> +               data_len -= *pos;
> >> +               data_len = min_t(size_t, data_len, size);
> >> +               if (copy_to_user(buf, &tbl_hdr_str[*pos], data_len))
> >> +                       goto Out;
> >> +               buf += data_len;
> >> +               size -= data_len;
> >> +               *pos += data_len;
> >> +       }
> >> +
> >> +       data_len = strlen(tbl_hdr_str) + tbl_hdr_fmt_size;
> >> +       if (*pos < data_len && size > 0) {
> >> +               u8 data[tbl_hdr_fmt_size + 1];
> >> +               loff_t lpos;
> >> +
> >> +               snprintf(data, sizeof(data), tbl_hdr_fmt,
> >> +                        control->tbl_hdr.header,
> >> +                        control->tbl_hdr.version,
> >> +                        control->tbl_hdr.first_rec_offset,
> >> +                        control->tbl_hdr.tbl_size,
> >> +                        control->tbl_hdr.checksum);
> >> +
> >> +               data_len -= *pos;
> >> +               data_len = min_t(size_t, data_len, size);
> >> +               lpos = *pos - strlen(tbl_hdr_str);
> >> +               if (copy_to_user(buf, &data[lpos], data_len))
> >> +                       goto Out;
> >> +               buf += data_len;
> >> +               size -= data_len;
> >> +               *pos += data_len;
> >> +       }
> >> +
> >> +       data_len = strlen(tbl_hdr_str) + tbl_hdr_fmt_size + strlen(rec_hdr_str);
> >> +       if (*pos < data_len && size > 0) {
> >> +               loff_t lpos;
> >> +
> >> +               data_len -= *pos;
> >> +               data_len = min_t(size_t, data_len, size);
> >> +               lpos = *pos - strlen(tbl_hdr_str) - tbl_hdr_fmt_size;
> >> +               if (copy_to_user(buf, &rec_hdr_str[lpos], data_len))
> >> +                       goto Out;
> >> +               buf += data_len;
> >> +               size -= data_len;
> >> +               *pos += data_len;
> >> +       }
> >> +
> >> +       data_len = amdgpu_ras_debugfs_table_size(control);
> >> +       if (*pos < data_len && size > 0) {
> >> +               u8 dare[RAS_TABLE_RECORD_SIZE];
> >> +               u8 data[rec_hdr_fmt_size + 1];
> >> +               /* Find the starting record index
> >> +                */
> >> +               int s = (*pos - strlen(tbl_hdr_str) - tbl_hdr_fmt_size -
> >> +                        strlen(rec_hdr_str)) / rec_hdr_fmt_size;
> >> +               int r = (*pos - strlen(tbl_hdr_str) - tbl_hdr_fmt_size -
> >> +                        strlen(rec_hdr_str)) % rec_hdr_fmt_size;
> >> +               struct eeprom_table_record record;
> >> +
> >> +               for ( ; size > 0 && s < control->ras_num_recs; s++) {
> >> +                       u32 ai = RAS_RI_TO_AI(control, s);
> >> +                       /* Read a single record
> >> +                        */
> >> +                       res = __amdgpu_ras_eeprom_read(control, dare, ai, 1);
> >> +                       if (res)
> >> +                               goto Out;
> >> +                       __decode_table_record_from_buf(control, &record, dare);
> >> +                       snprintf(data, sizeof(data), rec_hdr_fmt,
> >> +                                s,
> >> +                                RAS_INDEX_TO_OFFSET(control, ai),
> >> +                                record_err_type_str[record.err_type],
> >> +                                record.bank,
> >> +                                record.ts,
> >> +                                record.offset,
> >> +                                record.mem_channel,
> >> +                                record.mcumc_id,
> >> +                                record.retired_page);
> >> +
> >> +                       data_len = min_t(size_t, rec_hdr_fmt_size - r, size);
> >> +                       if (copy_to_user(buf, &data[r], data_len))
> >> +                               return -EINVAL;
> >> +                       buf += data_len;
> >> +                       size -= data_len;
> >> +                       *pos += data_len;
> >> +                       r = 0;
> >> +               }
> >> +       }
> >> +       res = 0;
> >> +Out:
> >> +       mutex_unlock(&control->ras_tbl_mutex);
> >> +       return res < 0 ? res : orig_size - size;
> >> +}
> >> +
> >> +static ssize_t
> >> +amdgpu_ras_debugfs_eeprom_table_read(struct file *f, char __user *buf,
> >> +                                    size_t size, loff_t *pos)
> >> +{
> >> +       struct amdgpu_device *adev = (struct amdgpu_device *)file_inode(f)->i_private;
> >> +       struct amdgpu_ras *ras = amdgpu_ras_get_context(adev);
> >> +       struct amdgpu_ras_eeprom_control *control = ras ? &ras->eeprom_control : NULL;
> >> +       u8 data[81];
> >> +       int res;
> >> +
> >> +       if (!size)
> >> +               return size;
> >> +
> >> +       if (!ras || !control) {
> >> +               res = snprintf(data, sizeof(data), "Not supported\n");
> >> +               if (*pos >= res)
> >> +                       return 0;
> >> +
> >> +               res -= *pos;
> >> +               res = min_t(size_t, res, size);
> >> +
> >> +               if (copy_to_user(buf, &data[*pos], res))
> >> +                       return -EINVAL;
> >> +
> >> +               *pos += res;
> >> +
> >> +               return res;
> >> +       } else {
> >> +               return amdgpu_ras_debugfs_table_read(f, buf, size, pos);
> >> +       }
> >> +}
> >> +
> >> +const struct file_operations amdgpu_ras_debugfs_eeprom_table_ops = {
> >> +       .owner = THIS_MODULE,
> >> +       .read = amdgpu_ras_debugfs_eeprom_table_read,
> >> +       .write = NULL,
> >> +       .llseek = default_llseek,
> >> +};
> >> +
> >>  /**
> >>   * __verify_ras_table_checksum -- verify the RAS EEPROM table checksum
> >>   * @control: pointer to control structure
> >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h
> >> index edb0195ea2eb8c..430e08ab3313a2 100644
> >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h
> >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h
> >> @@ -29,9 +29,10 @@
> >>  struct amdgpu_device;
> >>
> >>  enum amdgpu_ras_eeprom_err_type {
> >> -       AMDGPU_RAS_EEPROM_ERR_PLACE_HOLDER,
> >> +       AMDGPU_RAS_EEPROM_ERR_NA,
> >>         AMDGPU_RAS_EEPROM_ERR_RECOVERABLE,
> >> -       AMDGPU_RAS_EEPROM_ERR_NON_RECOVERABLE
> >> +       AMDGPU_RAS_EEPROM_ERR_NON_RECOVERABLE,
> >> +       AMDGPU_RAS_EEPROM_ERR_COUNT,
> >>  };
> >>
> >>  struct amdgpu_ras_eeprom_table_header {
> >> @@ -121,4 +122,9 @@ int amdgpu_ras_eeprom_append(struct amdgpu_ras_eeprom_control *control,
> >>
> >>  inline uint32_t amdgpu_ras_eeprom_max_record_count(void);
> >>
> >> +void amdgpu_ras_debugfs_set_ret_size(struct amdgpu_ras_eeprom_control *control);
> >> +
> >> +extern const struct file_operations amdgpu_ras_debugfs_eeprom_size_ops;
> >> +extern const struct file_operations amdgpu_ras_debugfs_eeprom_table_ops;
> >> +
> >>  #endif // _AMDGPU_RAS_EEPROM_H
> >> --
> >> 2.32.0
> >>
>
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
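
The record loop in amdgpu_ras_debugfs_table_read(), quoted above, turns a file position into a starting record index (s) and a byte offset inside that record's formatted line (r). The following is a minimal userspace sketch of that arithmetic only, not the driver code. The names locate_record, struct rec_pos, TBL_HDR_FMT_SIZE and REC_HDR_FMT_SIZE are hypothetical; the strings and widths are copied from tbl_hdr_str, rec_hdr_str, tbl_hdr_fmt_size and rec_hdr_fmt_size in the patch.

```c
#include <stddef.h>
#include <string.h>

/* Header strings and fixed line widths, copied from the quoted patch. */
static const char tbl_hdr_str[] =
	" Signature    Version  FirstOffs       Size   Checksum\n";
static const char rec_hdr_str[] =
	"Index  Offset ErrType Bank/CU          TimeStamp      Offs/Addr MemChl MCUMCID    RetiredPage\n";

#define TBL_HDR_FMT_SIZE (5 * (2 + 8) + 4 + 1)
#define REC_HDR_FMT_SIZE (5 + 1 + 7 + 1 + 7 + 1 + 7 + 1 + 18 + 1 + 14 + 1 + 6 + 1 + 7 + 1 + 14 + 1)

/* Hypothetical helper type: where a file position falls in the record area. */
struct rec_pos {
	size_t index;	/* 0-based record index relative to the first record */
	size_t offset;	/* byte offset inside that record's formatted line */
};

/*
 * Map a file position to a record index and an in-line offset.
 * Returns 0 on success, or -1 if pos still lies within the three
 * header segments that precede the records.
 */
static int locate_record(size_t pos, struct rec_pos *out)
{
	size_t hdr_len = strlen(tbl_hdr_str) + TBL_HDR_FMT_SIZE +
			 strlen(rec_hdr_str);

	if (pos < hdr_len)
		return -1;

	out->index = (pos - hdr_len) / REC_HDR_FMT_SIZE;
	out->offset = (pos - hdr_len) % REC_HDR_FMT_SIZE;
	return 0;
}
```

Because every formatted record line has the same width, a read that resumes at an arbitrary position needs only one division and one remainder to find where to continue, which is what allows the read handler to serve short reads without keeping any state between calls.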


* Re: [PATCH 38/40] drm/amdgpu: RAS EEPROM table is now in debugfs
  2021-06-11 17:51       ` Alex Deucher
@ 2021-06-11 18:06         ` Luben Tuikov
  0 siblings, 0 replies; 74+ messages in thread
From: Luben Tuikov @ 2021-06-11 18:06 UTC (permalink / raw)
  To: Alex Deucher
  Cc: Andrey Grodzovsky, Xinhui Pan, amd-gfx list, Alexander Deucher,
	John Clements, Hawking Zhang

On 2021-06-11 1:51 p.m., Alex Deucher wrote:
> On Fri, Jun 11, 2021 at 1:30 PM Luben Tuikov <luben.tuikov@amd.com> wrote:
>> On 2021-06-11 1:16 p.m., Alex Deucher wrote:
>>> On Tue, Jun 8, 2021 at 5:41 PM Luben Tuikov <luben.tuikov@amd.com> wrote:
>>>> Add "ras_eeprom_size" file in debugfs, which
>>>> reports the maximum size allocated to the RAS
>>>> table in EEPROM, as the number of bytes and the
>>>> number of records it could store. For instance,
>>>>
>>>> $cat /sys/kernel/debug/dri/0/ras/ras_eeprom_size
>>>> 262144 bytes or 10921 records
>>>> $_
>>>>
>>>> Add "ras_eeprom_table" file in debugfs, which
>>>> dumps the RAS table stored in EEPROM, in a formatted
>>>> way. For instance,
>>>>
>>>> $cat ras_eeprom_table
>>>>  Signature    Version  FirstOffs       Size   Checksum
>>>> 0x414D4452 0x00010000 0x00000014 0x000000EC 0x000000DA
>>>> Index  Offset ErrType Bank/CU          TimeStamp      Offs/Addr MemChl MCUMCID    RetiredPage
>>>>     0 0x00014      ue    0x00 0x00000000607608DC 0x000000000000   0x00    0x00 0x000000000000
>>>>     1 0x0002C      ue    0x00 0x00000000607608DC 0x000000001000   0x00    0x00 0x000000000001
>>>>     2 0x00044      ue    0x00 0x00000000607608DC 0x000000002000   0x00    0x00 0x000000000002
>>>>     3 0x0005C      ue    0x00 0x00000000607608DC 0x000000003000   0x00    0x00 0x000000000003
>>>>     4 0x00074      ue    0x00 0x00000000607608DC 0x000000004000   0x00    0x00 0x000000000004
>>>>     5 0x0008C      ue    0x00 0x00000000607608DC 0x000000005000   0x00    0x00 0x000000000005
>>>>     6 0x000A4      ue    0x00 0x00000000607608DC 0x000000006000   0x00    0x00 0x000000000006
>>>>     7 0x000BC      ue    0x00 0x00000000607608DC 0x000000007000   0x00    0x00 0x000000000007
>>>>     8 0x000D4      ue    0x00 0x00000000607608DD 0x000000008000   0x00    0x00 0x000000000008
>>>> $_
>>>>
>>>> Cc: Alexander Deucher <Alexander.Deucher@amd.com>
>>>> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
>>>> Cc: John Clements <john.clements@amd.com>
>>>> Cc: Hawking Zhang <Hawking.Zhang@amd.com>
>>>> Cc: Xinhui Pan <xinhui.pan@amd.com>
>>>> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
>>> Seems like a useful feature.  Just a few comments below.
>>>
>>> Alex
>>>
>>>
>>>> ---
>>>>  drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c       |  12 +-
>>>>  drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h       |   1 +
>>>>  .../gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c    | 241 +++++++++++++++++-
>>>>  .../gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h    |  10 +-
>>>>  4 files changed, 252 insertions(+), 12 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
>>>> index 1424f2cc2076c1..d791a360a92366 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
>>>> @@ -404,9 +404,9 @@ static ssize_t amdgpu_ras_debugfs_ctrl_write(struct file *f,
>>>>                 /* umc ce/ue error injection for a bad page is not allowed */
>>>>                 if ((data.head.block == AMDGPU_RAS_BLOCK__UMC) &&
>>>>                     amdgpu_ras_check_bad_page(adev, data.inject.address)) {
>>>> -                       dev_warn(adev->dev, "RAS WARN: 0x%llx has been marked "
>>>> -                                       "as bad before error injection!\n",
>>>> -                                       data.inject.address);
>>>> +                       dev_warn(adev->dev, "RAS WARN: inject: 0x%llx has "
>>>> +                                "already been marked as bad!\n",
>>>> +                                data.inject.address);
>>> This seems unrelated to this patch.
>> It's just a cosmetic fix to correct the alignment, as the previous alignment seems to have been arbitrary.
>> Just pressing TAB in Emacs does wonders. :-)
>>
>> I was in this file and decided to fix this. It's just cosmetic. No functional change.
>>
>>>>                         break;
>>>>                 }
>>>>
>>>> @@ -1301,6 +1301,12 @@ static struct dentry *amdgpu_ras_debugfs_create_ctrl_node(struct amdgpu_device *
>>>>                            &con->bad_page_cnt_threshold);
>>>>         debugfs_create_x32("ras_hw_enabled", 0444, dir, &adev->ras_hw_enabled);
>>>>         debugfs_create_x32("ras_enabled", 0444, dir, &adev->ras_enabled);
>>>> +       debugfs_create_file("ras_eeprom_size", S_IRUGO, dir, adev,
>>>> +                           &amdgpu_ras_debugfs_eeprom_size_ops);
>>>> +       con->de_ras_eeprom_table = debugfs_create_file("ras_eeprom_table",
>>>> +                                                      S_IRUGO, dir, adev,
>>>> +                                                      &amdgpu_ras_debugfs_eeprom_table_ops);
>>>> +       amdgpu_ras_debugfs_set_ret_size(&con->eeprom_control);
>>>>
>>>>         /*
>>>>          * After one uncorrectable error happens, usually GPU recovery will
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h
>>>> index 256cea5d34f2b6..283afd791db107 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h
>>>> @@ -318,6 +318,7 @@ struct amdgpu_ras {
>>>>         /* sysfs */
>>>>         struct device_attribute features_attr;
>>>>         struct bin_attribute badpages_attr;
>>>> +       struct dentry *de_ras_eeprom_table;
>>>>         /* block array */
>>>>         struct ras_manager *objs;
>>>>
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
>>>> index dc4a845a32404c..677e379f5fb5e9 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
>>>> @@ -27,6 +27,8 @@
>>>>  #include <linux/bits.h>
>>>>  #include "atom.h"
>>>>  #include "amdgpu_eeprom.h"
>>>> +#include <linux/debugfs.h>
>>>> +#include <linux/uaccess.h>
>>>>
>>>>  #define EEPROM_I2C_MADDR_VEGA20         0x0
>>>>  #define EEPROM_I2C_MADDR_ARCTURUS       0x40000
>>>> @@ -70,6 +72,13 @@
>>>>  #define RAS_OFFSET_TO_INDEX(_C, _O) (((_O) - \
>>>>                                       (_C)->ras_record_offset) / RAS_TABLE_RECORD_SIZE)
>>>>
>>>> +/* Given a 0-based relative record index, 0, 1, 2, ..., etc., off
>>>> + * of "fri", return the absolute record index off of the end of
>>>> + * the table header.
>>>> + */
>>>> +#define RAS_RI_TO_AI(_C, _I) (((_I) + (_C)->ras_fri) % \
>>>> +                             (_C)->ras_max_record_count)
>>>> +
>>>>  #define RAS_NUM_RECS(_tbl_hdr)  (((_tbl_hdr)->tbl_size - \
>>>>                                   RAS_TABLE_HEADER_SIZE) / RAS_TABLE_RECORD_SIZE)
>>>>
>>>> @@ -77,13 +86,10 @@
>>>>
>>>>  static bool __is_ras_eeprom_supported(struct amdgpu_device *adev)
>>>>  {
>>>> -       if ((adev->asic_type == CHIP_VEGA20) ||
>>>> -           (adev->asic_type == CHIP_ARCTURUS) ||
>>>> -           (adev->asic_type == CHIP_SIENNA_CICHLID) ||
>>>> -           (adev->asic_type == CHIP_ALDEBARAN))
>>>> -               return true;
>>>> -
>>>> -       return false;
>>>> +       return  adev->asic_type == CHIP_VEGA20 ||
>>>> +               adev->asic_type == CHIP_ARCTURUS ||
>>>> +               adev->asic_type == CHIP_SIENNA_CICHLID ||
>>>> +               adev->asic_type == CHIP_ALDEBARAN;
>>> Unrelated whitespace change.
>> It's more readable and succinct like this, no?
>>
>> Do you want me to revert these? I mean, they're pleasing to have and change no functionality, and since I was in this file...
>>
> Don't worry about respinning to break these out in this patch, but in
> general it's better to keep formatting cleanups separate from
> functional changes; makes it easier to review the functional changes.
>
> Acked-by: Alex Deucher <alexander.deucher@amd.com>

Okay, great--thanks!

I'll repost the whole set with the squashed patch and the new abort-fixing one for long transfers as soon as the dust settles. (FWIW, it's now available on GitLab. :-) )

Regards,
Luben

>
> Alex
>
>
>> Regards,
>> Luben
>>
>>>>  }
>>>>
>>>>  static bool __get_eeprom_i2c_addr_arct(struct amdgpu_device *adev,
>>>> @@ -258,6 +264,8 @@ int amdgpu_ras_eeprom_reset_table(struct amdgpu_ras_eeprom_control *control)
>>>>         control->ras_num_recs = 0;
>>>>         control->ras_fri = 0;
>>>>
>>>> +       amdgpu_ras_debugfs_set_ret_size(control);
>>>> +
>>>>         mutex_unlock(&control->ras_tbl_mutex);
>>>>
>>>>         return res;
>>>> @@ -591,6 +599,8 @@ int amdgpu_ras_eeprom_append(struct amdgpu_ras_eeprom_control *control,
>>>>         res = amdgpu_ras_eeprom_append_table(control, record, num);
>>>>         if (!res)
>>>>                 res = amdgpu_ras_eeprom_update_header(control);
>>>> +       if (!res)
>>>> +               amdgpu_ras_debugfs_set_ret_size(control);
>>>>
>>>>         mutex_unlock(&control->ras_tbl_mutex);
>>>>         return res;
>>>> @@ -734,6 +744,223 @@ inline uint32_t amdgpu_ras_eeprom_max_record_count(void)
>>>>         return RAS_MAX_RECORD_COUNT;
>>>>  }
>>>>
>>>> +static ssize_t
>>>> +amdgpu_ras_debugfs_eeprom_size_read(struct file *f, char __user *buf,
>>>> +                                   size_t size, loff_t *pos)
>>>> +{
>>>> +       struct amdgpu_device *adev = (struct amdgpu_device *)file_inode(f)->i_private;
>>>> +       struct amdgpu_ras *ras = amdgpu_ras_get_context(adev);
>>>> +       struct amdgpu_ras_eeprom_control *control = ras ? &ras->eeprom_control : NULL;
>>>> +       u8 data[50];
>>>> +       int res;
>>>> +
>>>> +       if (!size)
>>>> +               return size;
>>>> +
>>>> +       if (!ras || !control) {
>>>> +               res = snprintf(data, sizeof(data), "Not supported\n");
>>>> +       } else {
>>>> +               res = snprintf(data, sizeof(data), "%d bytes or %d records\n",
>>>> +                              RAS_TBL_SIZE_BYTES, control->ras_max_record_count);
>>>> +       }
>>>> +
>>>> +       if (*pos >= res)
>>>> +               return 0;
>>>> +
>>>> +       res -= *pos;
>>>> +       res = min_t(size_t, res, size);
>>>> +
>>>> +       if (copy_to_user(buf, &data[*pos], res))
>>>> +               return -EINVAL;
>>>> +
>>>> +       *pos += res;
>>>> +
>>>> +       return res;
>>>> +}
>>>> +
>>>> +const struct file_operations amdgpu_ras_debugfs_eeprom_size_ops = {
>>>> +       .owner = THIS_MODULE,
>>>> +       .read = amdgpu_ras_debugfs_eeprom_size_read,
>>>> +       .write = NULL,
>>>> +       .llseek = default_llseek,
>>>> +};
>>>> +
>>>> +static const char *tbl_hdr_str = " Signature    Version  FirstOffs       Size   Checksum\n";
>>>> +static const char *tbl_hdr_fmt = "0x%08X 0x%08X 0x%08X 0x%08X 0x%08X\n";
>>>> +#define tbl_hdr_fmt_size (5 * (2+8) + 4 + 1)
>>>> +static const char *rec_hdr_str = "Index  Offset ErrType Bank/CU          TimeStamp      Offs/Addr MemChl MCUMCID    RetiredPage\n";
>>>> +static const char *rec_hdr_fmt = "%5d 0x%05X %7s    0x%02X 0x%016llX 0x%012llX   0x%02X    0x%02X 0x%012llX\n";
>>>> +#define rec_hdr_fmt_size (5 + 1 + 7 + 1 + 7 + 1 + 7 + 1 + 18 + 1 + 14 + 1 + 6 + 1 + 7 + 1 + 14 + 1)
>>>> +
>>>> +static const char *record_err_type_str[AMDGPU_RAS_EEPROM_ERR_COUNT] = {
>>>> +       "ignore",
>>>> +       "re",
>>>> +       "ue",
>>>> +};
>>>> +
>>>> +static loff_t amdgpu_ras_debugfs_table_size(struct amdgpu_ras_eeprom_control *control)
>>>> +{
>>>> +       return strlen(tbl_hdr_str) + tbl_hdr_fmt_size +
>>>> +               strlen(rec_hdr_str) + rec_hdr_fmt_size * control->ras_num_recs;
>>>> +}
>>>> +
>>>> +void amdgpu_ras_debugfs_set_ret_size(struct amdgpu_ras_eeprom_control *control)
>>>> +{
>>>> +       struct amdgpu_ras *ras = container_of(control, struct amdgpu_ras,
>>>> +                                             eeprom_control);
>>>> +       struct dentry *de = ras->de_ras_eeprom_table;
>>>> +
>>>> +       if (de)
>>>> +               d_inode(de)->i_size = amdgpu_ras_debugfs_table_size(control);
>>>> +}
>>>> +
>>>> +static ssize_t amdgpu_ras_debugfs_table_read(struct file *f, char __user *buf,
>>>> +                                            size_t size, loff_t *pos)
>>>> +{
>>>> +       struct amdgpu_device *adev = (struct amdgpu_device *)file_inode(f)->i_private;
>>>> +       struct amdgpu_ras *ras = amdgpu_ras_get_context(adev);
>>>> +       struct amdgpu_ras_eeprom_control *control = &ras->eeprom_control;
>>>> +       const size_t orig_size = size;
>>>> +       int res = -EINVAL;
>>>> +       size_t data_len;
>>>> +
>>>> +       mutex_lock(&control->ras_tbl_mutex);
>>>> +
>>>> +       /* We want *pos - data_len > 0, which means there's
>>>> +        * bytes to be printed from data.
>>>> +        */
>>>> +       data_len = strlen(tbl_hdr_str);
>>>> +       if (*pos < data_len) {
>>>> +               data_len -= *pos;
>>>> +               data_len = min_t(size_t, data_len, size);
>>>> +               if (copy_to_user(buf, &tbl_hdr_str[*pos], data_len))
>>>> +                       goto Out;
>>>> +               buf += data_len;
>>>> +               size -= data_len;
>>>> +               *pos += data_len;
>>>> +       }
>>>> +
>>>> +       data_len = strlen(tbl_hdr_str) + tbl_hdr_fmt_size;
>>>> +       if (*pos < data_len && size > 0) {
>>>> +               u8 data[tbl_hdr_fmt_size + 1];
>>>> +               loff_t lpos;
>>>> +
>>>> +               snprintf(data, sizeof(data), tbl_hdr_fmt,
>>>> +                        control->tbl_hdr.header,
>>>> +                        control->tbl_hdr.version,
>>>> +                        control->tbl_hdr.first_rec_offset,
>>>> +                        control->tbl_hdr.tbl_size,
>>>> +                        control->tbl_hdr.checksum);
>>>> +
>>>> +               data_len -= *pos;
>>>> +               data_len = min_t(size_t, data_len, size);
>>>> +               lpos = *pos - strlen(tbl_hdr_str);
>>>> +               if (copy_to_user(buf, &data[lpos], data_len))
>>>> +                       goto Out;
>>>> +               buf += data_len;
>>>> +               size -= data_len;
>>>> +               *pos += data_len;
>>>> +       }
>>>> +
>>>> +       data_len = strlen(tbl_hdr_str) + tbl_hdr_fmt_size + strlen(rec_hdr_str);
>>>> +       if (*pos < data_len && size > 0) {
>>>> +               loff_t lpos;
>>>> +
>>>> +               data_len -= *pos;
>>>> +               data_len = min_t(size_t, data_len, size);
>>>> +               lpos = *pos - strlen(tbl_hdr_str) - tbl_hdr_fmt_size;
>>>> +               if (copy_to_user(buf, &rec_hdr_str[lpos], data_len))
>>>> +                       goto Out;
>>>> +               buf += data_len;
>>>> +               size -= data_len;
>>>> +               *pos += data_len;
>>>> +       }
>>>> +
>>>> +       data_len = amdgpu_ras_debugfs_table_size(control);
>>>> +       if (*pos < data_len && size > 0) {
>>>> +               u8 dare[RAS_TABLE_RECORD_SIZE];
>>>> +               u8 data[rec_hdr_fmt_size + 1];
>>>> +               /* Find the starting record index
>>>> +                */
>>>> +               int s = (*pos - strlen(tbl_hdr_str) - tbl_hdr_fmt_size -
>>>> +                        strlen(rec_hdr_str)) / rec_hdr_fmt_size;
>>>> +               int r = (*pos - strlen(tbl_hdr_str) - tbl_hdr_fmt_size -
>>>> +                        strlen(rec_hdr_str)) % rec_hdr_fmt_size;
>>>> +               struct eeprom_table_record record;
>>>> +
>>>> +               for ( ; size > 0 && s < control->ras_num_recs; s++) {
>>>> +                       u32 ai = RAS_RI_TO_AI(control, s);
>>>> +                       /* Read a single record
>>>> +                        */
>>>> +                       res = __amdgpu_ras_eeprom_read(control, dare, ai, 1);
>>>> +                       if (res)
>>>> +                               goto Out;
>>>> +                       __decode_table_record_from_buf(control, &record, dare);
>>>> +                       snprintf(data, sizeof(data), rec_hdr_fmt,
>>>> +                                s,
>>>> +                                RAS_INDEX_TO_OFFSET(control, ai),
>>>> +                                record_err_type_str[record.err_type],
>>>> +                                record.bank,
>>>> +                                record.ts,
>>>> +                                record.offset,
>>>> +                                record.mem_channel,
>>>> +                                record.mcumc_id,
>>>> +                                record.retired_page);
>>>> +
>>>> +                       data_len = min_t(size_t, rec_hdr_fmt_size - r, size);
>>>> +                       if (copy_to_user(buf, &data[r], data_len))
>>>> +                               return -EINVAL;
>>>> +                       buf += data_len;
>>>> +                       size -= data_len;
>>>> +                       *pos += data_len;
>>>> +                       r = 0;
>>>> +               }
>>>> +       }
>>>> +       res = 0;
>>>> +Out:
>>>> +       mutex_unlock(&control->ras_tbl_mutex);
>>>> +       return res < 0 ? res : orig_size - size;
>>>> +}
>>>> +
>>>> +static ssize_t
>>>> +amdgpu_ras_debugfs_eeprom_table_read(struct file *f, char __user *buf,
>>>> +                                    size_t size, loff_t *pos)
>>>> +{
>>>> +       struct amdgpu_device *adev = (struct amdgpu_device *)file_inode(f)->i_private;
>>>> +       struct amdgpu_ras *ras = amdgpu_ras_get_context(adev);
>>>> +       struct amdgpu_ras_eeprom_control *control = ras ? &ras->eeprom_control : NULL;
>>>> +       u8 data[81];
>>>> +       int res;
>>>> +
>>>> +       if (!size)
>>>> +               return size;
>>>> +
>>>> +       if (!ras || !control) {
>>>> +               res = snprintf(data, sizeof(data), "Not supported\n");
>>>> +               if (*pos >= res)
>>>> +                       return 0;
>>>> +
>>>> +               res -= *pos;
>>>> +               res = min_t(size_t, res, size);
>>>> +
>>>> +               if (copy_to_user(buf, &data[*pos], res))
>>>> +                       return -EINVAL;
>>>> +
>>>> +               *pos += res;
>>>> +
>>>> +               return res;
>>>> +       } else {
>>>> +               return amdgpu_ras_debugfs_table_read(f, buf, size, pos);
>>>> +       }
>>>> +}
>>>> +
>>>> +const struct file_operations amdgpu_ras_debugfs_eeprom_table_ops = {
>>>> +       .owner = THIS_MODULE,
>>>> +       .read = amdgpu_ras_debugfs_eeprom_table_read,
>>>> +       .write = NULL,
>>>> +       .llseek = default_llseek,
>>>> +};
>>>> +
>>>>  /**
>>>>   * __verify_ras_table_checksum -- verify the RAS EEPROM table checksum
>>>>   * @control: pointer to control structure
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h
>>>> index edb0195ea2eb8c..430e08ab3313a2 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h
>>>> @@ -29,9 +29,10 @@
>>>>  struct amdgpu_device;
>>>>
>>>>  enum amdgpu_ras_eeprom_err_type {
>>>> -       AMDGPU_RAS_EEPROM_ERR_PLACE_HOLDER,
>>>> +       AMDGPU_RAS_EEPROM_ERR_NA,
>>>>         AMDGPU_RAS_EEPROM_ERR_RECOVERABLE,
>>>> -       AMDGPU_RAS_EEPROM_ERR_NON_RECOVERABLE
>>>> +       AMDGPU_RAS_EEPROM_ERR_NON_RECOVERABLE,
>>>> +       AMDGPU_RAS_EEPROM_ERR_COUNT,
>>>>  };
>>>>
>>>>  struct amdgpu_ras_eeprom_table_header {
>>>> @@ -121,4 +122,9 @@ int amdgpu_ras_eeprom_append(struct amdgpu_ras_eeprom_control *control,
>>>>
>>>>  inline uint32_t amdgpu_ras_eeprom_max_record_count(void);
>>>>
>>>> +void amdgpu_ras_debugfs_set_ret_size(struct amdgpu_ras_eeprom_control *control);
>>>> +
>>>> +extern const struct file_operations amdgpu_ras_debugfs_eeprom_size_ops;
>>>> +extern const struct file_operations amdgpu_ras_debugfs_eeprom_table_ops;
>>>> +
>>>>  #endif // _AMDGPU_RAS_EEPROM_H
>>>> --
>>>> 2.32.0
>>>>

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
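
The RAS_RI_TO_AI macro in the quoted patch converts a 0-based record index, counted from the first record index (ras_fri), into an absolute index that wraps at ras_max_record_count. Below is a minimal standalone sketch of that modular mapping, written as a function rather than a macro. The name ri_to_ai is hypothetical, and the numbers in the note that follows are illustrative values, not taken from real hardware.

```c
#include <stdint.h>

/*
 * Map a relative record index (0 = first record) to an absolute slot
 * in a circular table of max_count slots whose first record sits at
 * slot fri. max_count must be non-zero.
 */
static uint32_t ri_to_ai(uint32_t ri, uint32_t fri, uint32_t max_count)
{
	return (ri + fri) % max_count;
}
```

For example, with max_count = 10921 (the record count shown in the ras_eeprom_size example) and an assumed fri = 10919, relative indices 0, 1 and 2 map to absolute slots 10919, 10920 and 0, so a dump that walks relative indices in order follows the table across the wrap-around point.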


end of thread, other threads:[~2021-06-11 18:06 UTC | newest]

Thread overview: 74+ messages
2021-06-08 21:39 [PATCH 00/40] I2C fixes Luben Tuikov
2021-06-08 21:39 ` [PATCH 01/40] drm/amdgpu: add a mutex for the smu11 i2c bus (v2) Luben Tuikov
2021-06-08 21:39 ` [PATCH 02/40] drm/amdgpu/pm: rework i2c xfers on sienna cichlid (v3) Luben Tuikov
2021-06-08 21:39 ` [PATCH 03/40] drm/amdgpu/pm: rework i2c xfers on arcturus (v3) Luben Tuikov
2021-06-08 21:39 ` [PATCH 04/40] drm/amdgpu/pm: add smu i2c implementation for navi1x (v3) Luben Tuikov
2021-06-08 21:39 ` [PATCH 05/40] drm/amdgpu: add new helper for handling EEPROM i2c transfers Luben Tuikov
2021-06-08 21:39 ` [PATCH 06/40] drm/amdgpu/ras: switch ras eeprom handling to use generic helper Luben Tuikov
2021-06-08 21:39 ` [PATCH 07/40] drm/amdgpu/ras: switch fru eeprom handling to use generic helper (v2) Luben Tuikov
2021-06-08 21:39 ` [PATCH 08/40] drm/amdgpu: i2c subsystem uses 7 bit addresses Luben Tuikov
2021-06-08 21:39 ` [PATCH 09/40] drm/amdgpu: add I2C_CLASS_HWMON to SMU i2c buses Luben Tuikov
2021-06-08 21:39 ` [PATCH 10/40] drm/amdgpu: rework smu11 i2c for generic operation Luben Tuikov
2021-06-08 21:39 ` [PATCH 11/40] drm/amdgpu: only set restart on first cmd of the smu i2c transaction Luben Tuikov
2021-06-08 21:39 ` [PATCH 12/40] drm/amdgpu: Remember to wait 10ms for write buffer flush v2 Luben Tuikov
2021-06-08 21:39 ` [PATCH 13/40] dmr/amdgpu: Add RESTART handling also to smu_v11_0_i2c (VG20) Luben Tuikov
2021-06-10 20:18   ` Alex Deucher
2021-06-08 21:39 ` [PATCH 14/40] drm/amdgpu: Drop i > 0 restriction for issuing RESTART Luben Tuikov
2021-06-10 20:21   ` Alex Deucher
2021-06-08 21:39 ` [PATCH 15/40] drm/amdgpu: Send STOP for the last byte of msg only Luben Tuikov
2021-06-10 20:22   ` Alex Deucher
2021-06-08 21:39 ` [PATCH 16/40] drm/amd/pm: SMU I2C: Return number of messages processed Luben Tuikov
2021-06-10 20:25   ` Alex Deucher
2021-06-08 21:39 ` [PATCH 17/40] drm/amdgpu/pm: ADD I2C quirk adapter table Luben Tuikov
2021-06-10 20:26   ` Alex Deucher
2021-06-08 21:39 ` [PATCH 18/40] drm/amdgpu: Fix Vega20 I2C to be agnostic (v2) Luben Tuikov
2021-06-10 20:43   ` Alex Deucher
2021-06-08 21:39 ` [PATCH 19/40] drm/amdgpu: Fixes to the AMDGPU EEPROM driver Luben Tuikov
2021-06-10 20:53   ` Alex Deucher
2021-06-08 21:39 ` [PATCH 20/40] drm/amdgpu: EEPROM respects I2C quirks Luben Tuikov
2021-06-11 17:01   ` Alex Deucher
2021-06-11 17:17     ` Luben Tuikov
2021-06-11 17:37       ` Luben Tuikov
2021-06-08 21:39 ` [PATCH 21/40] drm/amdgpu: I2C EEPROM full memory addressing Luben Tuikov
2021-06-10 20:57   ` Alex Deucher
2021-06-08 21:39 ` [PATCH 22/40] drm/amdgpu: RAS and FRU now use 19-bit I2C address Luben Tuikov
2021-06-10 20:59   ` Alex Deucher
2021-06-08 21:39 ` [PATCH 23/40] drm/amdgpu: Fix wrap-around bugs in RAS Luben Tuikov
2021-06-10 21:00   ` Alex Deucher
2021-06-08 21:39 ` [PATCH 24/40] drm/amdgpu: I2C class is HWMON Luben Tuikov
2021-06-10 21:02   ` Alex Deucher
2021-06-08 21:39 ` [PATCH 25/40] drm/amdgpu: RAS: EEPROM --> RAS Luben Tuikov
2021-06-10 21:03   ` Alex Deucher
2021-06-08 21:39 ` [PATCH 26/40] drm/amdgpu: Rename misspelled function Luben Tuikov
2021-06-10 21:04   ` Alex Deucher
2021-06-08 21:39 ` [PATCH 27/40] drm/amdgpu: RAS xfer to read/write Luben Tuikov
2021-06-10 21:05   ` Alex Deucher
2021-06-08 21:39 ` [PATCH 28/40] drm/amdgpu: EEPROM: add explicit read and write Luben Tuikov
2021-06-10 21:06   ` Alex Deucher
2021-06-08 21:39 ` [PATCH 29/40] drm/amd/pm: Extend the I2C quirk table Luben Tuikov
2021-06-10 21:07   ` Alex Deucher
2021-06-08 21:39 ` [PATCH 30/40] drm/amd/pm: Simplify managed I2C transfer functions Luben Tuikov
2021-06-10 21:08   ` Alex Deucher
2021-06-08 21:39 ` [PATCH 31/40] drm/amdgpu: Fix width of I2C address Luben Tuikov
2021-06-10 21:09   ` Alex Deucher
2021-06-08 21:39 ` [PATCH 32/40] drm/amdgpu: Return result fix in RAS Luben Tuikov
2021-06-10 21:11   ` Alex Deucher
2021-06-08 21:39 ` [PATCH 33/40] drm/amd/pm: Fix a bug in i2c_xfer Luben Tuikov
2021-06-10 21:12   ` Alex Deucher
2021-06-10 22:26     ` Luben Tuikov
2021-06-08 21:39 ` [PATCH 34/40] drm/amdgpu: Fix amdgpu_ras_eeprom_init() Luben Tuikov
2021-06-10 21:12   ` Alex Deucher
2021-06-08 21:39 ` [PATCH 35/40] drm/amdgpu: Simplify RAS EEPROM checksum calculations Luben Tuikov
2021-06-11 17:07   ` Alex Deucher
2021-06-08 21:39 ` [PATCH 36/40] drm/amdgpu: Use explicit cardinality for clarity Luben Tuikov
2021-06-10 21:17   ` Alex Deucher
2021-06-08 21:39 ` [PATCH 37/40] drm/amdgpu: Optimizations to EEPROM RAS table I/O Luben Tuikov
2021-06-08 21:39 ` [PATCH 38/40] drm/amdgpu: RAS EEPROM table is now in debugfs Luben Tuikov
2021-06-11 17:16   ` Alex Deucher
2021-06-11 17:30     ` Luben Tuikov
2021-06-11 17:51       ` Alex Deucher
2021-06-11 18:06         ` Luben Tuikov
2021-06-08 21:39 ` [PATCH 39/40] drm/amdgpu: Fix koops when accessing RAS EEPROM Luben Tuikov
2021-06-10 21:23   ` Alex Deucher
2021-06-08 21:39 ` [PATCH 40/40] drm/amdgpu: Use a single loop Luben Tuikov
2021-06-10 21:25   ` Alex Deucher
