* [PATCH net-next 0/4] ice: lower CPU usage with GNSS
@ 2023-04-01 17:26 ` Michal Schmidt
  0 siblings, 0 replies; 20+ messages in thread
From: Michal Schmidt @ 2023-04-01 17:26 UTC (permalink / raw)
  To: intel-wired-lan
  Cc: netdev, Jesse Brandeburg, Tony Nguyen, Michal Michalik,
	Arkadiusz Kubalewski, Karol Kolacinski, Petr Oros

This series lowers the CPU usage of the ice driver when using its
provided /dev/gnss*.

Intel engineers, in addition to reviewing the patches for correctness,
please also consider the doubts I express in the descriptions of patches
1 and 2. Better solutions may be possible.

Michal Schmidt (4):
  ice: lower CPU usage of the GNSS read thread
  ice: sleep, don't busy-wait, for sq_cmd_timeout
  ice: remove unused buffer copy code in ice_sq_send_cmd_retry()
  ice: sleep, don't busy-wait, in the SQ send retry loop

 drivers/net/ethernet/intel/ice/ice_common.c   | 29 +++++--------
 drivers/net/ethernet/intel/ice/ice_controlq.c |  8 ++--
 drivers/net/ethernet/intel/ice/ice_controlq.h |  2 +-
 drivers/net/ethernet/intel/ice/ice_gnss.c     | 42 +++++++++----------
 drivers/net/ethernet/intel/ice/ice_gnss.h     |  3 +-
 5 files changed, 35 insertions(+), 49 deletions(-)

-- 
2.39.2


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH net-next 1/4] ice: lower CPU usage of the GNSS read thread
  2023-04-01 17:26 ` [Intel-wired-lan] " Michal Schmidt
@ 2023-04-01 17:26   ` Michal Schmidt
  -1 siblings, 0 replies; 20+ messages in thread
From: Michal Schmidt @ 2023-04-01 17:26 UTC (permalink / raw)
  To: intel-wired-lan
  Cc: netdev, Jesse Brandeburg, Tony Nguyen, Michal Michalik,
	Arkadiusz Kubalewski, Karol Kolacinski, Petr Oros

The ice-gnss-<dev_name> kernel thread, which reads data from the u-blox
GNSS module, keeps a CPU core almost 100% busy. The main reason is that
it busy-waits for data to become available.

A simple improvement would be to replace the "mdelay(10);" in
ice_gnss_read() with sleeping. A better fix is to not do any waiting
directly in the function and just requeue this delayed work as needed.
The advantage is that canceling the work from ice_gnss_exit() becomes
immediate, rather than taking up to ~2.5 seconds (ICE_MAX_UBX_READ_TRIES
* 10 ms).

This lowers the CPU usage of the ice-gnss-<dev_name> thread on my system
from ~90 % to ~8 %.

I am not sure if the larger 0.1 s pause after inserting data into the
gnss subsystem is really necessary, but I'm keeping that as it was.

Of course, ideally the driver would not have to poll at all, but I don't
know if the E810 can watch for GNSS data availability over the i2c bus
by itself and notify the driver.
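
The change boils down to the self-requeuing pattern sketched below. This
is only an illustration, not the driver code itself; data_is_ready() and
read_and_insert_data() are hypothetical stand-ins for the real I2C reads
and the gnss_insert_raw() call:

    static void ice_gnss_read_sketch(struct kthread_work *work)
    {
            struct gnss_serial *gnss = container_of(work, struct gnss_serial,
                                                     read_work.work);
            /* Default: poll for data again soon (10 ms). */
            unsigned long delay = ICE_GNSS_POLL_DATA_DELAY_TIME;

            if (data_is_ready(gnss)) {              /* hypothetical helper */
                    read_and_insert_data(gnss);     /* hypothetical helper */
                    /* Data was handled; wait the longer 0.1 s interval. */
                    delay = ICE_GNSS_TIMER_DELAY_TIME;
            }

            /* Never wait inside the work function; requeue instead, so
             * canceling the work on driver exit does not have to wait.
             */
            kthread_queue_delayed_work(gnss->kworker, &gnss->read_work, delay);
    }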

Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
---
 drivers/net/ethernet/intel/ice/ice_gnss.c | 42 ++++++++++-------------
 drivers/net/ethernet/intel/ice/ice_gnss.h |  3 +-
 2 files changed, 20 insertions(+), 25 deletions(-)

diff --git a/drivers/net/ethernet/intel/ice/ice_gnss.c b/drivers/net/ethernet/intel/ice/ice_gnss.c
index 8dec748bb53a..2ea8a2b11bcd 100644
--- a/drivers/net/ethernet/intel/ice/ice_gnss.c
+++ b/drivers/net/ethernet/intel/ice/ice_gnss.c
@@ -117,6 +117,7 @@ static void ice_gnss_read(struct kthread_work *work)
 {
 	struct gnss_serial *gnss = container_of(work, struct gnss_serial,
 						read_work.work);
+	unsigned long delay = ICE_GNSS_POLL_DATA_DELAY_TIME;
 	unsigned int i, bytes_read, data_len, count;
 	struct ice_aqc_link_topo_addr link_topo;
 	struct ice_pf *pf;
@@ -136,11 +137,6 @@ static void ice_gnss_read(struct kthread_work *work)
 		return;
 
 	hw = &pf->hw;
-	buf = (char *)get_zeroed_page(GFP_KERNEL);
-	if (!buf) {
-		err = -ENOMEM;
-		goto exit;
-	}
 
 	memset(&link_topo, 0, sizeof(struct ice_aqc_link_topo_addr));
 	link_topo.topo_params.index = ICE_E810T_GNSS_I2C_BUS;
@@ -151,25 +147,24 @@ static void ice_gnss_read(struct kthread_work *work)
 	i2c_params = ICE_GNSS_UBX_DATA_LEN_WIDTH |
 		     ICE_AQC_I2C_USE_REPEATED_START;
 
-	/* Read data length in a loop, when it's not 0 the data is ready */
-	for (i = 0; i < ICE_MAX_UBX_READ_TRIES; i++) {
-		err = ice_aq_read_i2c(hw, link_topo, ICE_GNSS_UBX_I2C_BUS_ADDR,
-				      cpu_to_le16(ICE_GNSS_UBX_DATA_LEN_H),
-				      i2c_params, (u8 *)&data_len_b, NULL);
-		if (err)
-			goto exit_buf;
+	err = ice_aq_read_i2c(hw, link_topo, ICE_GNSS_UBX_I2C_BUS_ADDR,
+			      cpu_to_le16(ICE_GNSS_UBX_DATA_LEN_H),
+			      i2c_params, (u8 *)&data_len_b, NULL);
+	if (err)
+		goto requeue;
 
-		data_len = be16_to_cpu(data_len_b);
-		if (data_len != 0 && data_len != U16_MAX)
-			break;
+	data_len = be16_to_cpu(data_len_b);
+	if (data_len == 0 || data_len == U16_MAX)
+		goto requeue;
 
-		mdelay(10);
-	}
+	/* The u-blox has data_len bytes for us to read */
 
 	data_len = min_t(typeof(data_len), data_len, PAGE_SIZE);
-	if (!data_len) {
+
+	buf = (char *)get_zeroed_page(GFP_KERNEL);
+	if (!buf) {
 		err = -ENOMEM;
-		goto exit_buf;
+		goto requeue;
 	}
 
 	/* Read received data */
@@ -183,7 +178,7 @@ static void ice_gnss_read(struct kthread_work *work)
 				      cpu_to_le16(ICE_GNSS_UBX_EMPTY_DATA),
 				      bytes_read, &buf[i], NULL);
 		if (err)
-			goto exit_buf;
+			goto free_buf;
 	}
 
 	count = gnss_insert_raw(pf->gnss_dev, buf, i);
@@ -191,10 +186,11 @@ static void ice_gnss_read(struct kthread_work *work)
 		dev_warn(ice_pf_to_dev(pf),
 			 "gnss_insert_raw ret=%d size=%d\n",
 			 count, i);
-exit_buf:
+	delay = ICE_GNSS_TIMER_DELAY_TIME;
+free_buf:
 	free_page((unsigned long)buf);
-	kthread_queue_delayed_work(gnss->kworker, &gnss->read_work,
-				   ICE_GNSS_TIMER_DELAY_TIME);
+requeue:
+	kthread_queue_delayed_work(gnss->kworker, &gnss->read_work, delay);
 exit:
 	if (err)
 		dev_dbg(ice_pf_to_dev(pf), "GNSS failed to read err=%d\n", err);
diff --git a/drivers/net/ethernet/intel/ice/ice_gnss.h b/drivers/net/ethernet/intel/ice/ice_gnss.h
index 4d49e5b0b4b8..640df7411373 100644
--- a/drivers/net/ethernet/intel/ice/ice_gnss.h
+++ b/drivers/net/ethernet/intel/ice/ice_gnss.h
@@ -5,6 +5,7 @@
 #define _ICE_GNSS_H_
 
 #define ICE_E810T_GNSS_I2C_BUS		0x2
+#define ICE_GNSS_POLL_DATA_DELAY_TIME	(HZ / 100) /* poll every 10 ms */
 #define ICE_GNSS_TIMER_DELAY_TIME	(HZ / 10) /* 0.1 second per message */
 #define ICE_GNSS_TTY_WRITE_BUF		250
 #define ICE_MAX_I2C_DATA_SIZE		FIELD_MAX(ICE_AQC_I2C_DATA_SIZE_M)
@@ -20,8 +21,6 @@
  * passed as I2C addr parameter.
  */
 #define ICE_GNSS_UBX_WRITE_BYTES	(ICE_MAX_I2C_WRITE_BYTES + 1)
-#define ICE_MAX_UBX_READ_TRIES		255
-#define ICE_MAX_UBX_ACK_READ_TRIES	4095
 
 struct gnss_write_buf {
 	struct list_head queue;
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH net-next 2/4] ice: sleep, don't busy-wait, for sq_cmd_timeout
  2023-04-01 17:26 ` [Intel-wired-lan] " Michal Schmidt
@ 2023-04-01 17:26   ` Michal Schmidt
  -1 siblings, 0 replies; 20+ messages in thread
From: Michal Schmidt @ 2023-04-01 17:26 UTC (permalink / raw)
  To: intel-wired-lan
  Cc: netdev, Jesse Brandeburg, Tony Nguyen, Michal Michalik,
	Arkadiusz Kubalewski, Karol Kolacinski, Petr Oros

The driver polls for ice_sq_done() with a 100 µs period for up to 1 s
and it uses udelay to do that.

Let's use usleep_range instead. We know sleeping is allowed here,
because we're holding a mutex (cq->sq_lock). To preserve the total
max waiting time, measure cq->sq_cmd_timeout in jiffies.

The sq_cmd_timeout field is also referenced in ice_release_res(), but
there the polling period is 1 ms (i.e. 10 times longer). Since the
timeout was expressed as a number of loop iterations, the total timeout
in that function is 10 s. I do not know whether this is intentional;
this patch keeps it.

The patch lowers the CPU usage of the ice-gnss-<dev_name> kernel thread
on my system from ~8 % to less than 1 %.
I saw a report of high CPU usage with ptp4l where the busy-waiting in
ice_sq_send_cmd dominated the profile. The patch should help with that.
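
For clarity, the old and new waiting primitives side by side (a sketch,
not a literal excerpt; the constants are the ones this patch uses):

    /* Old: udelay() busy-spins the CPU for the whole 100 us; it is the
     * only option in atomic context, but wasteful in a path that is
     * allowed to sleep.
     */
    udelay(ICE_CTL_Q_SQ_CMD_USEC);

    /* New: usleep_range() puts the task to sleep on an hrtimer. Safe
     * here because cq->sq_lock is a mutex, so we are in process context
     * and may schedule; the upper bound gives the timer code room to
     * coalesce wakeups.
     */
    usleep_range(ICE_CTL_Q_SQ_CMD_USEC, ICE_CTL_Q_SQ_CMD_USEC * 3 / 2);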

Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
---
 drivers/net/ethernet/intel/ice/ice_common.c   | 14 +++++++-------
 drivers/net/ethernet/intel/ice/ice_controlq.c |  9 +++++----
 drivers/net/ethernet/intel/ice/ice_controlq.h |  2 +-
 3 files changed, 13 insertions(+), 12 deletions(-)

diff --git a/drivers/net/ethernet/intel/ice/ice_common.c b/drivers/net/ethernet/intel/ice/ice_common.c
index c2fda4fa4188..14cffe49fa8c 100644
--- a/drivers/net/ethernet/intel/ice/ice_common.c
+++ b/drivers/net/ethernet/intel/ice/ice_common.c
@@ -1992,19 +1992,19 @@ ice_acquire_res(struct ice_hw *hw, enum ice_aq_res_ids res,
  */
 void ice_release_res(struct ice_hw *hw, enum ice_aq_res_ids res)
 {
-	u32 total_delay = 0;
+	unsigned long timeout;
 	int status;
 
-	status = ice_aq_release_res(hw, res, 0, NULL);
-
 	/* there are some rare cases when trying to release the resource
 	 * results in an admin queue timeout, so handle them correctly
 	 */
-	while ((status == -EIO) && (total_delay < hw->adminq.sq_cmd_timeout)) {
-		mdelay(1);
+	timeout = jiffies + 10 * hw->adminq.sq_cmd_timeout;
+	do {
 		status = ice_aq_release_res(hw, res, 0, NULL);
-		total_delay++;
-	}
+		if (status != -EIO)
+			break;
+		usleep_range(1000, 2000);
+	} while (time_before(jiffies, timeout));
 }
 
 /**
diff --git a/drivers/net/ethernet/intel/ice/ice_controlq.c b/drivers/net/ethernet/intel/ice/ice_controlq.c
index 6bcfee295991..10125e8aa555 100644
--- a/drivers/net/ethernet/intel/ice/ice_controlq.c
+++ b/drivers/net/ethernet/intel/ice/ice_controlq.c
@@ -967,7 +967,7 @@ ice_sq_send_cmd(struct ice_hw *hw, struct ice_ctl_q_info *cq,
 	struct ice_aq_desc *desc_on_ring;
 	bool cmd_completed = false;
 	struct ice_sq_cd *details;
-	u32 total_delay = 0;
+	unsigned long timeout;
 	int status = 0;
 	u16 retval = 0;
 	u32 val = 0;
@@ -1060,13 +1060,14 @@ ice_sq_send_cmd(struct ice_hw *hw, struct ice_ctl_q_info *cq,
 		cq->sq.next_to_use = 0;
 	wr32(hw, cq->sq.tail, cq->sq.next_to_use);
 
+	timeout = jiffies + cq->sq_cmd_timeout;
 	do {
 		if (ice_sq_done(hw, cq))
 			break;
 
-		udelay(ICE_CTL_Q_SQ_CMD_USEC);
-		total_delay++;
-	} while (total_delay < cq->sq_cmd_timeout);
+		usleep_range(ICE_CTL_Q_SQ_CMD_USEC,
+			     ICE_CTL_Q_SQ_CMD_USEC * 3 / 2);
+	} while (time_before(jiffies, timeout));
 
 	/* if ready, copy the desc back to temp */
 	if (ice_sq_done(hw, cq)) {
diff --git a/drivers/net/ethernet/intel/ice/ice_controlq.h b/drivers/net/ethernet/intel/ice/ice_controlq.h
index c07e9cc9fc6e..f2d3b115ae0b 100644
--- a/drivers/net/ethernet/intel/ice/ice_controlq.h
+++ b/drivers/net/ethernet/intel/ice/ice_controlq.h
@@ -34,7 +34,7 @@ enum ice_ctl_q {
 };
 
 /* Control Queue timeout settings - max delay 1s */
-#define ICE_CTL_Q_SQ_CMD_TIMEOUT	10000 /* Count 10000 times */
+#define ICE_CTL_Q_SQ_CMD_TIMEOUT	HZ    /* Wait max 1s */
 #define ICE_CTL_Q_SQ_CMD_USEC		100   /* Check every 100usec */
 #define ICE_CTL_Q_ADMIN_INIT_TIMEOUT	10    /* Count 10 times */
 #define ICE_CTL_Q_ADMIN_INIT_MSEC	100   /* Check every 100msec */
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH net-next 3/4] ice: remove unused buffer copy code in ice_sq_send_cmd_retry()
  2023-04-01 17:26 ` [Intel-wired-lan] " Michal Schmidt
@ 2023-04-01 17:26   ` Michal Schmidt
  -1 siblings, 0 replies; 20+ messages in thread
From: Michal Schmidt @ 2023-04-01 17:26 UTC (permalink / raw)
  To: intel-wired-lan
  Cc: netdev, Jesse Brandeburg, Tony Nguyen, Michal Michalik,
	Arkadiusz Kubalewski, Karol Kolacinski, Petr Oros

The 'buf_cpy'-related code in ice_sq_send_cmd_retry() looks broken:
'buf' is never copied into 'buf_cpy'.

The reason this does not cause problems is that all commands for which
'is_cmd_for_retry' is true go with a NULL buf.

Let's remove 'buf_cpy'. Add a WARN_ON in case the assumption no longer
holds in the future.

Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
---
 drivers/net/ethernet/intel/ice/ice_common.c | 13 ++-----------
 1 file changed, 2 insertions(+), 11 deletions(-)

diff --git a/drivers/net/ethernet/intel/ice/ice_common.c b/drivers/net/ethernet/intel/ice/ice_common.c
index 14cffe49fa8c..539b756f227c 100644
--- a/drivers/net/ethernet/intel/ice/ice_common.c
+++ b/drivers/net/ethernet/intel/ice/ice_common.c
@@ -1619,7 +1619,6 @@ ice_sq_send_cmd_retry(struct ice_hw *hw, struct ice_ctl_q_info *cq,
 {
 	struct ice_aq_desc desc_cpy;
 	bool is_cmd_for_retry;
-	u8 *buf_cpy = NULL;
 	u8 idx = 0;
 	u16 opcode;
 	int status;
@@ -1629,11 +1628,8 @@ ice_sq_send_cmd_retry(struct ice_hw *hw, struct ice_ctl_q_info *cq,
 	memset(&desc_cpy, 0, sizeof(desc_cpy));
 
 	if (is_cmd_for_retry) {
-		if (buf) {
-			buf_cpy = kzalloc(buf_size, GFP_KERNEL);
-			if (!buf_cpy)
-				return -ENOMEM;
-		}
+		/* All retryable cmds are direct, without buf. */
+		WARN_ON(buf);
 
 		memcpy(&desc_cpy, desc, sizeof(desc_cpy));
 	}
@@ -1645,17 +1641,12 @@ ice_sq_send_cmd_retry(struct ice_hw *hw, struct ice_ctl_q_info *cq,
 		    hw->adminq.sq_last_status != ICE_AQ_RC_EBUSY)
 			break;
 
-		if (buf_cpy)
-			memcpy(buf, buf_cpy, buf_size);
-
 		memcpy(desc, &desc_cpy, sizeof(desc_cpy));
 
 		mdelay(ICE_SQ_SEND_DELAY_TIME_MS);
 
 	} while (++idx < ICE_SQ_SEND_MAX_EXECUTE);
 
-	kfree(buf_cpy);
-
 	return status;
 }
 
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH net-next 4/4] ice: sleep, don't busy-wait, in the SQ send retry loop
  2023-04-01 17:26 ` [Intel-wired-lan] " Michal Schmidt
@ 2023-04-01 17:26   ` Michal Schmidt
  -1 siblings, 0 replies; 20+ messages in thread
From: Michal Schmidt @ 2023-04-01 17:26 UTC (permalink / raw)
  To: intel-wired-lan
  Cc: netdev, Jesse Brandeburg, Tony Nguyen, Michal Michalik,
	Arkadiusz Kubalewski, Karol Kolacinski, Petr Oros

10 ms is a lot of time to spend busy-waiting. Sleeping is clearly
allowed here, because we have just returned from ice_sq_send_cmd(),
which takes a mutex.

On kernels with HZ=100, this msleep may be twice as long, but I don't
think it matters.
I did not actually observe any retries happening here.
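
For the HZ=100 case, the arithmetic is roughly this (a sketch, not
driver code; ICE_SQ_SEND_DELAY_TIME_MS is the 10 ms mentioned above):

    /* With HZ=100 one jiffy is 10 ms, msleep() rounds the delay up to
     * whole jiffies, and the wakeup can land one tick later, so the
     * actual sleep is roughly 10-20 ms. That is acceptable for a rare
     * retry path; the old mdelay() spent the same time busy-spinning
     * the CPU.
     */
    msleep(ICE_SQ_SEND_DELAY_TIME_MS);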

Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
---
 drivers/net/ethernet/intel/ice/ice_common.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/ice/ice_common.c b/drivers/net/ethernet/intel/ice/ice_common.c
index 539b756f227c..438367322bcd 100644
--- a/drivers/net/ethernet/intel/ice/ice_common.c
+++ b/drivers/net/ethernet/intel/ice/ice_common.c
@@ -1643,7 +1643,7 @@ ice_sq_send_cmd_retry(struct ice_hw *hw, struct ice_ctl_q_info *cq,
 
 		memcpy(desc, &desc_cpy, sizeof(desc_cpy));
 
-		mdelay(ICE_SQ_SEND_DELAY_TIME_MS);
+		msleep(ICE_SQ_SEND_DELAY_TIME_MS);
 
 	} while (++idx < ICE_SQ_SEND_MAX_EXECUTE);
 
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH net-next 1/4] ice: lower CPU usage of the GNSS read thread
  2023-04-01 17:26   ` [Intel-wired-lan] " Michal Schmidt
@ 2023-04-01 18:29     ` Andrew Lunn
  -1 siblings, 0 replies; 20+ messages in thread
From: Andrew Lunn @ 2023-04-01 18:29 UTC (permalink / raw)
  To: Michal Schmidt
  Cc: intel-wired-lan, netdev, Jesse Brandeburg, Tony Nguyen,
	Michal Michalik, Arkadiusz Kubalewski, Karol Kolacinski,
	Petr Oros

On Sat, Apr 01, 2023 at 07:26:56PM +0200, Michal Schmidt wrote:
> The ice-gnss-<dev_name> kernel thread, which reads data from the u-blox
> GNSS module, keep a CPU core almost 100% busy. The main reason is that
> it busy-waits for data to become available.

Hi Michal

Please could you change the patch subject? Maybe something like "Do
not busy wait in read". That gives a better idea of what the patch does.

    Andrew

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH net-next 2/4] ice: sleep, don't busy-wait, for sq_cmd_timeout
  2023-04-01 17:26   ` [Intel-wired-lan] " Michal Schmidt
@ 2023-04-02 11:18     ` Simon Horman
  -1 siblings, 0 replies; 20+ messages in thread
From: Simon Horman @ 2023-04-02 11:18 UTC (permalink / raw)
  To: Michal Schmidt
  Cc: intel-wired-lan, netdev, Jesse Brandeburg, Tony Nguyen,
	Michal Michalik, Arkadiusz Kubalewski, Karol Kolacinski,
	Petr Oros

On Sat, Apr 01, 2023 at 07:26:57PM +0200, Michal Schmidt wrote:
> The driver polls for ice_sq_done() with a 100 µs period for up to 1 s
> and it uses udelay to do that.
> 
> Let's use usleep_range instead. We know sleeping is allowed here,
> because we're holding a mutex (cq->sq_lock). To preserve the total
> max waiting time, measure cq->sq_cmd_timeout in jiffies.
> 
> The sq_cmd_timeout is referenced also in ice_release_res(), but there
> the polling period is 1 ms (i.e. 10 times longer). Since the timeout
> was expressed in terms of the number of loops, the total timeout in this
> function is 10 s. I do not know if this is intentional. This patch keeps
> it.
> 
> The patch lowers the CPU usage of the ice-gnss-<dev_name> kernel thread
> on my system from ~8 % to less than 1 %.
> I saw a report of high CPU usage with ptp4l where the busy-waiting in
> ice_sq_send_cmd dominated the profile. The patch should help with that.
> 
> Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
> ---
>  drivers/net/ethernet/intel/ice/ice_common.c   | 14 +++++++-------
>  drivers/net/ethernet/intel/ice/ice_controlq.c |  9 +++++----
>  drivers/net/ethernet/intel/ice/ice_controlq.h |  2 +-
>  3 files changed, 13 insertions(+), 12 deletions(-)
> 
> diff --git a/drivers/net/ethernet/intel/ice/ice_common.c b/drivers/net/ethernet/intel/ice/ice_common.c
> index c2fda4fa4188..14cffe49fa8c 100644
> --- a/drivers/net/ethernet/intel/ice/ice_common.c
> +++ b/drivers/net/ethernet/intel/ice/ice_common.c
> @@ -1992,19 +1992,19 @@ ice_acquire_res(struct ice_hw *hw, enum ice_aq_res_ids res,
>   */
>  void ice_release_res(struct ice_hw *hw, enum ice_aq_res_ids res)
>  {
> -	u32 total_delay = 0;
> +	unsigned long timeout;
>  	int status;
>  
> -	status = ice_aq_release_res(hw, res, 0, NULL);
> -
>  	/* there are some rare cases when trying to release the resource
>  	 * results in an admin queue timeout, so handle them correctly
>  	 */
> -	while ((status == -EIO) && (total_delay < hw->adminq.sq_cmd_timeout)) {
> -		mdelay(1);
> +	timeout = jiffies + 10 * hw->adminq.sq_cmd_timeout;

Not needed for this series. But it occurs to me that a clean-up would be to
use ICE_CTL_Q_SQ_CMD_TIMEOUT directly and remove the sq_cmd_timeout field,
as it seems to be only set to that constant.

> +	do {
>  		status = ice_aq_release_res(hw, res, 0, NULL);
> -		total_delay++;
> -	}
> +		if (status != -EIO)
> +			break;
> +		usleep_range(1000, 2000);
> +	} while (time_before(jiffies, timeout));
>  }
>  
>  /**
> diff --git a/drivers/net/ethernet/intel/ice/ice_controlq.c b/drivers/net/ethernet/intel/ice/ice_controlq.c
> index 6bcfee295991..10125e8aa555 100644
> --- a/drivers/net/ethernet/intel/ice/ice_controlq.c
> +++ b/drivers/net/ethernet/intel/ice/ice_controlq.c
> @@ -967,7 +967,7 @@ ice_sq_send_cmd(struct ice_hw *hw, struct ice_ctl_q_info *cq,
>  	struct ice_aq_desc *desc_on_ring;
>  	bool cmd_completed = false;
>  	struct ice_sq_cd *details;
> -	u32 total_delay = 0;
> +	unsigned long timeout;
>  	int status = 0;
>  	u16 retval = 0;
>  	u32 val = 0;
> @@ -1060,13 +1060,14 @@ ice_sq_send_cmd(struct ice_hw *hw, struct ice_ctl_q_info *cq,
>  		cq->sq.next_to_use = 0;
>  	wr32(hw, cq->sq.tail, cq->sq.next_to_use);
>  
> +	timeout = jiffies + cq->sq_cmd_timeout;
>  	do {
>  		if (ice_sq_done(hw, cq))
>  			break;
>  
> -		udelay(ICE_CTL_Q_SQ_CMD_USEC);
> -		total_delay++;
> -	} while (total_delay < cq->sq_cmd_timeout);
> +		usleep_range(ICE_CTL_Q_SQ_CMD_USEC,
> +			     ICE_CTL_Q_SQ_CMD_USEC * 3 / 2);
> +	} while (time_before(jiffies, timeout));
>  
>  	/* if ready, copy the desc back to temp */
>  	if (ice_sq_done(hw, cq)) {
> diff --git a/drivers/net/ethernet/intel/ice/ice_controlq.h b/drivers/net/ethernet/intel/ice/ice_controlq.h
> index c07e9cc9fc6e..f2d3b115ae0b 100644
> --- a/drivers/net/ethernet/intel/ice/ice_controlq.h
> +++ b/drivers/net/ethernet/intel/ice/ice_controlq.h
> @@ -34,7 +34,7 @@ enum ice_ctl_q {
>  };
>  
>  /* Control Queue timeout settings - max delay 1s */
> -#define ICE_CTL_Q_SQ_CMD_TIMEOUT	10000 /* Count 10000 times */
> +#define ICE_CTL_Q_SQ_CMD_TIMEOUT	HZ    /* Wait max 1s */
>  #define ICE_CTL_Q_SQ_CMD_USEC		100   /* Check every 100usec */
>  #define ICE_CTL_Q_ADMIN_INIT_TIMEOUT	10    /* Count 10 times */
>  #define ICE_CTL_Q_ADMIN_INIT_MSEC	100   /* Check every 100msec */
> -- 
> 2.39.2
> 

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH net-next 1/4] ice: lower CPU usage of the GNSS read thread
@ 2023-04-03 13:36       ` Michal Schmidt
  0 siblings, 0 replies; 20+ messages in thread
From: Michal Schmidt @ 2023-04-03 13:36 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: intel-wired-lan, netdev, Jesse Brandeburg, Tony Nguyen,
	Michal Michalik, Arkadiusz Kubalewski, Karol Kolacinski,
	Petr Oros

On Sat, Apr 1, 2023 at 8:31 PM Andrew Lunn <andrew@lunn.ch> wrote:
>
> On Sat, Apr 01, 2023 at 07:26:56PM +0200, Michal Schmidt wrote:
> > The ice-gnss-<dev_name> kernel thread, which reads data from the u-blox
> > GNSS module, keep a CPU core almost 100% busy. The main reason is that
> > it busy-waits for data to become available.
>
> Hi Michal
>
> Please could you change the patch subject. Maybe something like "Do
> not busy wait in read" That gives a better idea what the patch does.

And I thought I was doing so well with the subjects :)
OK, I will change it.
Before resending, I would like to get a comment from Intel about that
special 0.1 s interval. If it turns out it is not necessary, I would
simplify the patch further.

Thanks!
Michal


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH net-next 2/4] ice: sleep, don't busy-wait, for sq_cmd_timeout
@ 2023-04-03 13:42       ` Michal Schmidt
  0 siblings, 0 replies; 20+ messages in thread
From: Michal Schmidt @ 2023-04-03 13:42 UTC (permalink / raw)
  To: Simon Horman
  Cc: intel-wired-lan, netdev, Jesse Brandeburg, Tony Nguyen,
	Michal Michalik, Arkadiusz Kubalewski, Karol Kolacinski,
	Petr Oros

On Sun, Apr 2, 2023 at 1:18 PM Simon Horman <simon.horman@corigine.com> wrote:
> On Sat, Apr 01, 2023 at 07:26:57PM +0200, Michal Schmidt wrote:
> > The driver polls for ice_sq_done() with a 100 µs period for up to 1 s
> > and it uses udelay to do that.
> >
> > Let's use usleep_range instead. We know sleeping is allowed here,
> > because we're holding a mutex (cq->sq_lock). To preserve the total
> > max waiting time, measure cq->sq_cmd_timeout in jiffies.
> >
> > The sq_cmd_timeout is referenced also in ice_release_res(), but there
> > the polling period is 1 ms (i.e. 10 times longer). Since the timeout
> > was expressed in terms of the number of loops, the total timeout in this
> > function is 10 s. I do not know if this is intentional. This patch keeps
> > it.
> >
> > The patch lowers the CPU usage of the ice-gnss-<dev_name> kernel thread
> > on my system from ~8 % to less than 1 %.
> > I saw a report of high CPU usage with ptp4l where the busy-waiting in
> > ice_sq_send_cmd dominated the profile. The patch should help with that.
> >
> > Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
> > ---
> >  drivers/net/ethernet/intel/ice/ice_common.c   | 14 +++++++-------
> >  drivers/net/ethernet/intel/ice/ice_controlq.c |  9 +++++----
> >  drivers/net/ethernet/intel/ice/ice_controlq.h |  2 +-
> >  3 files changed, 13 insertions(+), 12 deletions(-)
> >
> > diff --git a/drivers/net/ethernet/intel/ice/ice_common.c b/drivers/net/ethernet/intel/ice/ice_common.c
> > index c2fda4fa4188..14cffe49fa8c 100644
> > --- a/drivers/net/ethernet/intel/ice/ice_common.c
> > +++ b/drivers/net/ethernet/intel/ice/ice_common.c
> > @@ -1992,19 +1992,19 @@ ice_acquire_res(struct ice_hw *hw, enum ice_aq_res_ids res,
> >   */
> >  void ice_release_res(struct ice_hw *hw, enum ice_aq_res_ids res)
> >  {
> > -     u32 total_delay = 0;
> > +     unsigned long timeout;
> >       int status;
> >
> > -     status = ice_aq_release_res(hw, res, 0, NULL);
> > -
> >       /* there are some rare cases when trying to release the resource
> >        * results in an admin queue timeout, so handle them correctly
> >        */
> > -     while ((status == -EIO) && (total_delay < hw->adminq.sq_cmd_timeout)) {
> > -             mdelay(1);
> > +     timeout = jiffies + 10 * hw->adminq.sq_cmd_timeout;
>
> Not needed for this series. But it occurs to me that a clean-up would be to
> use ICE_CTL_Q_SQ_CMD_TIMEOUT directly and remove the sq_cmd_timeout field,
> as it seems to be only set to that constant.

Simon,
You are right. I can do that in v2.
BTW, i40e and iavf are similar to ice here.
Thanks,
Michal


^ permalink raw reply	[flat|nested] 20+ messages in thread

* RE: [PATCH net-next 1/4] ice: lower CPU usage of the GNSS read thread
  2023-04-01 17:26   ` [Intel-wired-lan] " Michal Schmidt
@ 2023-04-04  9:25     ` Kolacinski, Karol
  -1 siblings, 0 replies; 20+ messages in thread
From: Kolacinski, Karol @ 2023-04-04  9:25 UTC (permalink / raw)
  To: mschmidt, intel-wired-lan
  Cc: netdev, Brandeburg, Jesse, Nguyen, Anthony L, Michalik, Michal,
	Kubalewski, Arkadiusz, poros

On Sat, Apr 01, 2023 at 07:26:56PM +0200, Michal Schmidt wrote:
> A simple improvement would be to replace the "mdelay(10);" in
> ice_gnss_read() with sleeping. A better fix is to not do any waiting directly in the function and just requeue this delayed work as needed.
> The advantage is that canceling the work from ice_gnss_exit() becomes immediate, rather than taking up to ~2.5 seconds (ICE_MAX_UBX_READ_TRIES
> * 10 ms).
> 
> This lowers the CPU usage of the ice-gnss-<dev_name> thread on my system from ~90 % to ~8 %.
> 
> I am not sure if the larger 0.1 s pause after inserting data into the gnss subsystem is really necessary, but I'm keeping that as it was.

Hi Michal,

We were planning to upstream a 20 ms sleep instead of the 10 ms delay,
but your solution looks better.
To align with our code, ICE_GNSS_POLL_DATA_DELAY_TIME could be increased
to 20 ms.

Thanks,
Karol

^ permalink raw reply	[flat|nested] 20+ messages in thread
