All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/2] virtio-net: a fix and some updates for virtio dim
@ 2024-03-21 11:45 Heng Qi
  2024-03-21 11:45 ` [PATCH 1/2] virtio-net: fix possible dim status unrecoverable Heng Qi
                   ` (2 more replies)
  0 siblings, 3 replies; 24+ messages in thread
From: Heng Qi @ 2024-03-21 11:45 UTC (permalink / raw)
  To: netdev, virtualization, Jason Wang, Michael S. Tsirkin,
	Jakub Kicinski, Paolo Abeni, Eric Dumazet, David S. Miller,
	Xuan Zhuo

Patch 1 fixes an existing bug. Belongs to the net branch.
Patch 2 attempts to modify the sending of dim cmds to an asynchronous way.

Heng Qi (2):
  virtio-net: fix possible dim status unrecoverable
  virtio-net: reduce the CPU consumption of dim worker

 drivers/net/virtio_net.c | 273 ++++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 246 insertions(+), 27 deletions(-)

-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH 1/2] virtio-net: fix possible dim status unrecoverable
  2024-03-21 11:45 [PATCH 0/2] virtio-net: a fix and some updates for virtio dim Heng Qi
@ 2024-03-21 11:45 ` Heng Qi
  2024-03-22  5:17   ` Jason Wang
  2024-03-21 11:45 ` [PATCH 2/2] virtio-net: reduce the CPU consumption of dim worker Heng Qi
  2024-03-21 12:25 ` [PATCH 0/2] virtio-net: a fix and some updates for virtio dim Jiri Pirko
  2 siblings, 1 reply; 24+ messages in thread
From: Heng Qi @ 2024-03-21 11:45 UTC (permalink / raw)
  To: netdev, virtualization, Jason Wang, Michael S. Tsirkin,
	Jakub Kicinski, Paolo Abeni, Eric Dumazet, David S. Miller,
	Xuan Zhuo

When the dim worker is scheduled, if it fails to acquire the lock,
dim may not be able to return to the working state later.

For example, the following single queue scenario:
  1. The dim worker of rxq0 is scheduled, and the dim status is
     changed to DIM_APPLY_NEW_PROFILE;
  2. The ethtool command is holding rtnl lock;
  3. Since the rtnl lock is already held, virtnet_rx_dim_work fails
     to acquire the lock and exits;

Then, even if net_dim is invoked again, it cannot work because the
state is not restored to DIM_START_MEASURE.

Fixes: 6208799553a8 ("virtio-net: support rx netdim")
Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
---
 drivers/net/virtio_net.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index c22d111..0ebe322 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -3563,8 +3563,10 @@ static void virtnet_rx_dim_work(struct work_struct *work)
 	struct dim_cq_moder update_moder;
 	int i, qnum, err;
 
-	if (!rtnl_trylock())
+	if (!rtnl_trylock()) {
+		schedule_work(&dim->work);
 		return;
+	}
 
 	/* Each rxq's work is queued by "net_dim()->schedule_work()"
 	 * in response to NAPI traffic changes. Note that dim->profile_ix
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 2/2] virtio-net: reduce the CPU consumption of dim worker
  2024-03-21 11:45 [PATCH 0/2] virtio-net: a fix and some updates for virtio dim Heng Qi
  2024-03-21 11:45 ` [PATCH 1/2] virtio-net: fix possible dim status unrecoverable Heng Qi
@ 2024-03-21 11:45 ` Heng Qi
  2024-03-22  2:03   ` kernel test robot
                     ` (2 more replies)
  2024-03-21 12:25 ` [PATCH 0/2] virtio-net: a fix and some updates for virtio dim Jiri Pirko
  2 siblings, 3 replies; 24+ messages in thread
From: Heng Qi @ 2024-03-21 11:45 UTC (permalink / raw)
  To: netdev, virtualization, Jason Wang, Michael S. Tsirkin,
	Jakub Kicinski, Paolo Abeni, Eric Dumazet, David S. Miller,
	Xuan Zhuo

Currently, ctrlq processes commands in a synchronous manner,
which increases the delay of dim commands when configuring
multi-queue VMs, which in turn causes the CPU utilization to
increase and interferes with the performance of dim.

Therefore we asynchronously process ctlq's dim commands.

Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
---
 drivers/net/virtio_net.c | 269 ++++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 243 insertions(+), 26 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 0ebe322..460fc9e 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -138,6 +138,13 @@ struct virtnet_interrupt_coalesce {
 	u32 max_usecs;
 };
 
+struct virtnet_coal_node {
+	struct virtio_net_ctrl_hdr hdr;
+	virtio_net_ctrl_ack status;
+	struct virtio_net_ctrl_coal_vq coal_vqs;
+	struct list_head list;
+};
+
 /* The dma information of pages allocated at a time. */
 struct virtnet_rq_dma {
 	dma_addr_t addr;
@@ -300,6 +307,9 @@ struct virtnet_info {
 	/* Work struct for delayed refilling if we run low on memory. */
 	struct delayed_work refill;
 
+	/* Work struct for delayed acquisition of cvq processing results. */
+	struct delayed_work get_cvq;
+
 	/* Is delayed refill enabled? */
 	bool refill_enabled;
 
@@ -332,6 +342,10 @@ struct virtnet_info {
 	bool rx_dim_enabled;
 
 	/* Interrupt coalescing settings */
+	int cvq_cmd_nums;
+	int batch_dim_nums;
+	int dim_loop_index;
+	struct list_head coal_list;
 	struct virtnet_interrupt_coalesce intr_coal_tx;
 	struct virtnet_interrupt_coalesce intr_coal_rx;
 
@@ -2522,6 +2536,64 @@ static int virtnet_tx_resize(struct virtnet_info *vi,
 	return err;
 }
 
+static void virtnet_process_dim_cmd(struct virtnet_info *vi, void *res)
+{
+	struct virtnet_coal_node *coal_node;
+	u16 queue;
+
+	vi->cvq_cmd_nums--;
+
+	coal_node = (struct virtnet_coal_node *)res;
+	list_add(&coal_node->list, &vi->coal_list);
+
+	queue = le16_to_cpu(coal_node->coal_vqs.vqn) / 2;
+	vi->rq[queue].dim.state = DIM_START_MEASURE;
+}
+
+/**
+ * virtnet_cvq_response - get the response for filled ctrlq requests
+ * @poll: keep polling ctrlq when a NULL buffer is obtained.
+ * @dim_oneshot: process a dim cmd then exit, excluding user commands.
+ *
+ * Note that user commands must be processed synchronously
+ *  (poll = true, dim_oneshot = false).
+ */
+static void virtnet_cvq_response(struct virtnet_info *vi,
+				 bool poll,
+				 bool dim_oneshot)
+{
+	unsigned tmp;
+	void *res;
+
+	while (true) {
+		res = virtqueue_get_buf(vi->cvq, &tmp);
+		if (virtqueue_is_broken(vi->cvq)) {
+			dev_warn(&vi->dev->dev, "Control vq is broken.\n");
+			return;
+		}
+
+		if (!res) {
+			if (!poll)
+				return;
+
+			cond_resched();
+			cpu_relax();
+			continue;
+		}
+
+		/* this does not occur inside the process of waiting dim */
+		if (res == ((void *)vi))
+			return;
+
+		virtnet_process_dim_cmd(vi, res);
+		/* When it is a user command, we must wait until the
+		 * processing result is processed synchronously.
+		 */
+		if (dim_oneshot)
+			return;
+	}
+}
+
 /*
  * Send command via the control virtqueue and check status.  Commands
  * supported by the hypervisor, as indicated by feature bits, should
@@ -2531,7 +2603,7 @@ static bool virtnet_send_command(struct virtnet_info *vi, u8 class, u8 cmd,
 				 struct scatterlist *out)
 {
 	struct scatterlist *sgs[4], hdr, stat;
-	unsigned out_num = 0, tmp;
+	unsigned out_num = 0;
 	int ret;
 
 	/* Caller should know better */
@@ -2552,6 +2624,13 @@ static bool virtnet_send_command(struct virtnet_info *vi, u8 class, u8 cmd,
 	sgs[out_num] = &stat;
 
 	BUG_ON(out_num + 1 > ARRAY_SIZE(sgs));
+
+	/* The additional task (dim) consumes the descriptor asynchronously,
+	 * so we must ensure that there is a location for us.
+	 */
+	if (vi->cvq->num_free <= 3)
+		virtnet_cvq_response(vi, true, true);
+
 	ret = virtqueue_add_sgs(vi->cvq, sgs, out_num, 1, vi, GFP_ATOMIC);
 	if (ret < 0) {
 		dev_warn(&vi->vdev->dev,
@@ -2565,11 +2644,7 @@ static bool virtnet_send_command(struct virtnet_info *vi, u8 class, u8 cmd,
 	/* Spin for a response, the kick causes an ioport write, trapping
 	 * into the hypervisor, so the request should be handled immediately.
 	 */
-	while (!virtqueue_get_buf(vi->cvq, &tmp) &&
-	       !virtqueue_is_broken(vi->cvq)) {
-		cond_resched();
-		cpu_relax();
-	}
+	virtnet_cvq_response(vi, true, false);
 
 	return vi->ctrl->status == VIRTIO_NET_OK;
 }
@@ -2721,6 +2796,7 @@ static int virtnet_close(struct net_device *dev)
 		cancel_work_sync(&vi->rq[i].dim.work);
 	}
 
+	cancel_delayed_work_sync(&vi->get_cvq);
 	return 0;
 }
 
@@ -3553,48 +3629,148 @@ static int virtnet_send_notf_coal_vq_cmds(struct virtnet_info *vi,
 	return 0;
 }
 
+static bool virtnet_add_dim_command(struct virtnet_info *vi,
+				    struct virtnet_coal_node *ctrl)
+{
+	struct scatterlist *sgs[4], hdr, stat, out;
+	unsigned out_num = 0;
+	int ret;
+
+	BUG_ON(!virtio_has_feature(vi->vdev, VIRTIO_NET_F_CTRL_VQ));
+
+	ctrl->hdr.class = VIRTIO_NET_CTRL_NOTF_COAL;
+	ctrl->hdr.cmd = VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET;
+
+	sg_init_one(&hdr, &ctrl->hdr, sizeof(ctrl->hdr));
+	sgs[out_num++] = &hdr;
+
+	sg_init_one(&out, &ctrl->coal_vqs, sizeof(ctrl->coal_vqs));
+	sgs[out_num++] = &out;
+
+	ctrl->status = VIRTIO_NET_OK;
+	sg_init_one(&stat, &ctrl->status, sizeof(ctrl->status));
+	sgs[out_num] = &stat;
+
+	BUG_ON(out_num + 1 > ARRAY_SIZE(sgs));
+	ret = virtqueue_add_sgs(vi->cvq, sgs, out_num, 1, ctrl, GFP_ATOMIC);
+	if (ret < 0) {
+		dev_warn(&vi->vdev->dev, "Failed to add sgs for command vq: %d\n.", ret);
+		return false;
+	}
+
+	virtqueue_kick(vi->cvq);
+
+	vi->cvq_cmd_nums++;
+
+	return true;
+}
+
+static void virtnet_get_cvq_work(struct work_struct *work)
+{
+	struct virtnet_info *vi =
+		container_of(work, struct virtnet_info, get_cvq.work);
+
+	if (!rtnl_trylock()) {
+		schedule_delayed_work(&vi->get_cvq, 1);
+		return;
+	}
+
+	if (!vi->cvq_cmd_nums)
+		goto ret;
+
+	virtnet_cvq_response(vi, false, false);
+
+	if (vi->cvq_cmd_nums)
+		schedule_delayed_work(&vi->get_cvq, 1);
+
+ret:
+	rtnl_unlock();
+}
+
+static int virtnet_config_dim(struct virtnet_info *vi, struct receive_queue *rq,
+			      struct dim *dim)
+{
+	struct virtnet_coal_node *avail_coal;
+	struct dim_cq_moder update_moder;
+	int qnum = rq - vi->rq;
+
+	update_moder = net_dim_get_rx_moderation(dim->mode, dim->profile_ix);
+	if (update_moder.usec != rq->intr_coal.max_usecs ||
+	    update_moder.pkts != rq->intr_coal.max_packets) {
+		avail_coal = list_first_entry(&vi->coal_list,
+					      struct virtnet_coal_node, list);
+		avail_coal->coal_vqs.vqn = cpu_to_le16(rxq2vq(qnum));
+		avail_coal->coal_vqs.coal.max_usecs = cpu_to_le32(update_moder.usec);
+		avail_coal->coal_vqs.coal.max_packets = cpu_to_le32(update_moder.pkts);
+		list_del(&avail_coal->list);
+		if (!virtnet_add_dim_command(vi, avail_coal))
+			return -EINVAL;
+
+		rq->intr_coal.max_usecs = update_moder.usec;
+		rq->intr_coal.max_packets = update_moder.pkts;
+	} else if (dim->state == DIM_APPLY_NEW_PROFILE) {
+		dim->state = DIM_START_MEASURE;
+	}
+
+	return 0;
+}
+
 static void virtnet_rx_dim_work(struct work_struct *work)
 {
 	struct dim *dim = container_of(work, struct dim, work);
-	struct receive_queue *rq = container_of(dim,
+	struct receive_queue *rq, *rq_ = container_of(dim,
 			struct receive_queue, dim);
-	struct virtnet_info *vi = rq->vq->vdev->priv;
-	struct net_device *dev = vi->dev;
-	struct dim_cq_moder update_moder;
-	int i, qnum, err;
+	struct virtnet_info *vi = rq_->vq->vdev->priv;
+	int i = 0, err;
 
 	if (!rtnl_trylock()) {
 		schedule_work(&dim->work);
 		return;
 	}
 
+	if (list_empty(&vi->coal_list) || vi->cvq->num_free <= 3)
+		virtnet_cvq_response(vi, true, true);
+
+	/* The request scheduling the worker must be processed first
+	 * to avoid not having enough descs for ctrlq, causing the
+	 * request to fail, and the parameters of the queue will never
+	 * be updated again in the future.
+	 */
+	err = virtnet_config_dim(vi, rq_, dim);
+	if (err)
+		goto ret;
+
 	/* Each rxq's work is queued by "net_dim()->schedule_work()"
 	 * in response to NAPI traffic changes. Note that dim->profile_ix
 	 * for each rxq is updated prior to the queuing action.
 	 * So we only need to traverse and update profiles for all rxqs
 	 * in the work which is holding rtnl_lock.
 	 */
-	for (i = 0; i < vi->curr_queue_pairs; i++) {
+	for (i = vi->dim_loop_index; i < vi->curr_queue_pairs; i++) {
 		rq = &vi->rq[i];
 		dim = &rq->dim;
-		qnum = rq - vi->rq;
 
-		if (!rq->dim_enabled)
+		if (list_empty(&vi->coal_list) || vi->cvq->num_free <= 3)
+			break;
+
+		if (!rq->dim_enabled || rq == rq_)
 			continue;
 
-		update_moder = net_dim_get_rx_moderation(dim->mode, dim->profile_ix);
-		if (update_moder.usec != rq->intr_coal.max_usecs ||
-		    update_moder.pkts != rq->intr_coal.max_packets) {
-			err = virtnet_send_rx_ctrl_coal_vq_cmd(vi, qnum,
-							       update_moder.usec,
-							       update_moder.pkts);
-			if (err)
-				pr_debug("%s: Failed to send dim parameters on rxq%d\n",
-					 dev->name, qnum);
-			dim->state = DIM_START_MEASURE;
-		}
+		err = virtnet_config_dim(vi, rq, dim);
+		if (err)
+			goto ret;
+
 	}
 
+	if (vi->cvq_cmd_nums)
+		schedule_delayed_work(&vi->get_cvq, 1);
+
+ret:
+	if (i == vi->curr_queue_pairs)
+		vi->dim_loop_index = 0;
+	else
+		vi->dim_loop_index = i;
+
 	rtnl_unlock();
 }
 
@@ -4439,6 +4615,7 @@ static int virtnet_alloc_queues(struct virtnet_info *vi)
 		goto err_rq;
 
 	INIT_DELAYED_WORK(&vi->refill, refill_work);
+	INIT_DELAYED_WORK(&vi->get_cvq, virtnet_get_cvq_work);
 	for (i = 0; i < vi->max_queue_pairs; i++) {
 		vi->rq[i].pages = NULL;
 		netif_napi_add_weight(vi->dev, &vi->rq[i].napi, virtnet_poll,
@@ -4623,6 +4800,35 @@ static void virtnet_set_big_packets(struct virtnet_info *vi, const int mtu)
 	}
 }
 
+static void virtnet_del_coal_list(struct virtnet_info *vi)
+{
+	struct virtnet_coal_node *coal_node, *tmp;
+
+	list_for_each_entry_safe(coal_node, tmp,  &vi->coal_list, list) {
+		list_del(&coal_node->list);
+		kfree(coal_node);
+	}
+}
+
+static int virtnet_init_coal_list(struct virtnet_info *vi)
+{
+	struct virtnet_coal_node *coal_node;
+	int i;
+
+	vi->batch_dim_nums = min((unsigned int)vi->max_queue_pairs,
+				 virtqueue_get_vring_size(vi->cvq) / 3);
+	for (i = 0; i < vi->batch_dim_nums; i++) {
+		coal_node = kmalloc(sizeof(*coal_node), GFP_KERNEL);
+		if (!coal_node) {
+			virtnet_del_coal_list(vi);
+			return -ENOMEM;
+		}
+		list_add(&coal_node->list, &vi->coal_list);
+	}
+
+	return 0;
+}
+
 static int virtnet_probe(struct virtio_device *vdev)
 {
 	int i, err = -ENOMEM;
@@ -4816,11 +5022,20 @@ static int virtnet_probe(struct virtio_device *vdev)
 			vi->intr_coal_tx.max_packets = 0;
 	}
 
+	INIT_LIST_HEAD(&vi->coal_list);
 	if (virtio_has_feature(vi->vdev, VIRTIO_NET_F_VQ_NOTF_COAL)) {
+		vi->cvq_cmd_nums = 0;
+		vi->dim_loop_index = 0;
+
+		if (virtnet_init_coal_list(vi))
+			goto free;
+
 		/* The reason is the same as VIRTIO_NET_F_NOTF_COAL. */
-		for (i = 0; i < vi->max_queue_pairs; i++)
+		for (i = 0; i < vi->max_queue_pairs; i++) {
+			vi->rq[i].packets_in_napi = 0;
 			if (vi->sq[i].napi.weight)
 				vi->sq[i].intr_coal.max_packets = 1;
+		}
 	}
 
 #ifdef CONFIG_SYSFS
@@ -4949,6 +5164,8 @@ static void virtnet_remove(struct virtio_device *vdev)
 
 	net_failover_destroy(vi->failover);
 
+	virtnet_del_coal_list(vi);
+
 	remove_vq_common(vi);
 
 	free_netdev(vi->dev);
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH 0/2] virtio-net: a fix and some updates for virtio dim
  2024-03-21 11:45 [PATCH 0/2] virtio-net: a fix and some updates for virtio dim Heng Qi
  2024-03-21 11:45 ` [PATCH 1/2] virtio-net: fix possible dim status unrecoverable Heng Qi
  2024-03-21 11:45 ` [PATCH 2/2] virtio-net: reduce the CPU consumption of dim worker Heng Qi
@ 2024-03-21 12:25 ` Jiri Pirko
  2024-03-25  2:23   ` Heng Qi
  2 siblings, 1 reply; 24+ messages in thread
From: Jiri Pirko @ 2024-03-21 12:25 UTC (permalink / raw)
  To: Heng Qi
  Cc: netdev, virtualization, Jason Wang, Michael S. Tsirkin,
	Jakub Kicinski, Paolo Abeni, Eric Dumazet, David S. Miller,
	Xuan Zhuo

Thu, Mar 21, 2024 at 12:45:55PM CET, hengqi@linux.alibaba.com wrote:
>Patch 1 fixes an existing bug. Belongs to the net branch.

Send separately with "net" indication in the patch brackets:
https://www.kernel.org/doc/html/next/process/maintainer-netdev.html?highlight=network#tl-dr


>Patch 2 attempts to modify the sending of dim cmds to an asynchronous way.
Not a bugfix, then send separately with "net-next" indication. Net-next
is currently closed, send it next week.


>
>Heng Qi (2):
>  virtio-net: fix possible dim status unrecoverable
>  virtio-net: reduce the CPU consumption of dim worker

The name of the driver is "virtio_net".



pw-bot: cr


>
> drivers/net/virtio_net.c | 273 ++++++++++++++++++++++++++++++++++++++++++-----
> 1 file changed, 246 insertions(+), 27 deletions(-)
>
>-- 
>1.8.3.1
>
>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/2] virtio-net: reduce the CPU consumption of dim worker
  2024-03-21 11:45 ` [PATCH 2/2] virtio-net: reduce the CPU consumption of dim worker Heng Qi
@ 2024-03-22  2:03   ` kernel test robot
  2024-03-22  5:19   ` Jason Wang
  2024-03-22  6:50   ` Dan Carpenter
  2 siblings, 0 replies; 24+ messages in thread
From: kernel test robot @ 2024-03-22  2:03 UTC (permalink / raw)
  To: Heng Qi, netdev, virtualization, Jason Wang, Michael S. Tsirkin,
	Jakub Kicinski, Paolo Abeni, Eric Dumazet, David S. Miller,
	Xuan Zhuo
  Cc: oe-kbuild-all

Hi Heng,

kernel test robot noticed the following build warnings:

[auto build test WARNING on linus/master]
[also build test WARNING on next-20240321]
[cannot apply to v6.8]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Heng-Qi/virtio-net-fix-possible-dim-status-unrecoverable/20240321-194759
base:   linus/master
patch link:    https://lore.kernel.org/r/1711021557-58116-3-git-send-email-hengqi%40linux.alibaba.com
patch subject: [PATCH 2/2] virtio-net: reduce the CPU consumption of dim worker
config: x86_64-defconfig (https://download.01.org/0day-ci/archive/20240322/202403220916.cSUxehuW-lkp@intel.com/config)
compiler: gcc-11 (Debian 11.3.0-12) 11.3.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20240322/202403220916.cSUxehuW-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202403220916.cSUxehuW-lkp@intel.com/

All warnings (new ones prefixed by >>):

>> drivers/net/virtio_net.c:2564: warning: Function parameter or struct member 'vi' not described in 'virtnet_cvq_response'


vim +2564 drivers/net/virtio_net.c

  2552	
  2553	/**
  2554	 * virtnet_cvq_response - get the response for filled ctrlq requests
  2555	 * @poll: keep polling ctrlq when a NULL buffer is obtained.
  2556	 * @dim_oneshot: process a dim cmd then exit, excluding user commands.
  2557	 *
  2558	 * Note that user commands must be processed synchronously
  2559	 *  (poll = true, dim_oneshot = false).
  2560	 */
  2561	static void virtnet_cvq_response(struct virtnet_info *vi,
  2562					 bool poll,
  2563					 bool dim_oneshot)
> 2564	{
  2565		unsigned tmp;
  2566		void *res;
  2567	
  2568		while (true) {
  2569			res = virtqueue_get_buf(vi->cvq, &tmp);
  2570			if (virtqueue_is_broken(vi->cvq)) {
  2571				dev_warn(&vi->dev->dev, "Control vq is broken.\n");
  2572				return;
  2573			}
  2574	
  2575			if (!res) {
  2576				if (!poll)
  2577					return;
  2578	
  2579				cond_resched();
  2580				cpu_relax();
  2581				continue;
  2582			}
  2583	
  2584			/* this does not occur inside the process of waiting dim */
  2585			if (res == ((void *)vi))
  2586				return;
  2587	
  2588			virtnet_process_dim_cmd(vi, res);
  2589			/* When it is a user command, we must wait until the
  2590			 * processing result is processed synchronously.
  2591			 */
  2592			if (dim_oneshot)
  2593				return;
  2594		}
  2595	}
  2596	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 1/2] virtio-net: fix possible dim status unrecoverable
  2024-03-21 11:45 ` [PATCH 1/2] virtio-net: fix possible dim status unrecoverable Heng Qi
@ 2024-03-22  5:17   ` Jason Wang
  2024-03-25  2:11     ` Heng Qi
  0 siblings, 1 reply; 24+ messages in thread
From: Jason Wang @ 2024-03-22  5:17 UTC (permalink / raw)
  To: Heng Qi
  Cc: netdev, virtualization, Michael S. Tsirkin, Jakub Kicinski,
	Paolo Abeni, Eric Dumazet, David S. Miller, Xuan Zhuo

On Thu, Mar 21, 2024 at 7:46 PM Heng Qi <hengqi@linux.alibaba.com> wrote:
>
> When the dim worker is scheduled, if it fails to acquire the lock,
> dim may not be able to return to the working state later.
>
> For example, the following single queue scenario:
>   1. The dim worker of rxq0 is scheduled, and the dim status is
>      changed to DIM_APPLY_NEW_PROFILE;
>   2. The ethtool command is holding rtnl lock;
>   3. Since the rtnl lock is already held, virtnet_rx_dim_work fails
>      to acquire the lock and exits;
>
> Then, even if net_dim is invoked again, it cannot work because the
> state is not restored to DIM_START_MEASURE.
>
> Fixes: 6208799553a8 ("virtio-net: support rx netdim")
> Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
> ---
>  drivers/net/virtio_net.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index c22d111..0ebe322 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -3563,8 +3563,10 @@ static void virtnet_rx_dim_work(struct work_struct *work)
>         struct dim_cq_moder update_moder;
>         int i, qnum, err;
>
> -       if (!rtnl_trylock())
> +       if (!rtnl_trylock()) {
> +               schedule_work(&dim->work);
>                 return;
> +       }

Patch looks fine but I wonder if a delayed schedule is better.

Thanks

>
>         /* Each rxq's work is queued by "net_dim()->schedule_work()"
>          * in response to NAPI traffic changes. Note that dim->profile_ix
> --
> 1.8.3.1
>


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/2] virtio-net: reduce the CPU consumption of dim worker
  2024-03-21 11:45 ` [PATCH 2/2] virtio-net: reduce the CPU consumption of dim worker Heng Qi
  2024-03-22  2:03   ` kernel test robot
@ 2024-03-22  5:19   ` Jason Wang
  2024-03-25  2:21     ` Heng Qi
  2024-03-22  6:50   ` Dan Carpenter
  2 siblings, 1 reply; 24+ messages in thread
From: Jason Wang @ 2024-03-22  5:19 UTC (permalink / raw)
  To: Heng Qi
  Cc: netdev, virtualization, Michael S. Tsirkin, Jakub Kicinski,
	Paolo Abeni, Eric Dumazet, David S. Miller, Xuan Zhuo

On Thu, Mar 21, 2024 at 7:46 PM Heng Qi <hengqi@linux.alibaba.com> wrote:
>
> Currently, ctrlq processes commands in a synchronous manner,
> which increases the delay of dim commands when configuring
> multi-queue VMs, which in turn causes the CPU utilization to
> increase and interferes with the performance of dim.
>
> Therefore we asynchronously process ctlq's dim commands.
>
> Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>

I may miss some previous discussions.

But at least the changelog needs to explain why you don't use interrupt.

Thanks


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/2] virtio-net: reduce the CPU consumption of dim worker
  2024-03-21 11:45 ` [PATCH 2/2] virtio-net: reduce the CPU consumption of dim worker Heng Qi
  2024-03-22  2:03   ` kernel test robot
  2024-03-22  5:19   ` Jason Wang
@ 2024-03-22  6:50   ` Dan Carpenter
  2 siblings, 0 replies; 24+ messages in thread
From: Dan Carpenter @ 2024-03-22  6:50 UTC (permalink / raw)
  To: oe-kbuild, Heng Qi, netdev, virtualization, Jason Wang,
	Michael S. Tsirkin, Jakub Kicinski, Paolo Abeni, Eric Dumazet,
	David S. Miller, Xuan Zhuo
  Cc: lkp, oe-kbuild-all

Hi Heng,

kernel test robot noticed the following build warnings:

https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Heng-Qi/virtio-net-fix-possible-dim-status-unrecoverable/20240321-194759
base:   linus/master
patch link:    https://lore.kernel.org/r/1711021557-58116-3-git-send-email-hengqi%40linux.alibaba.com
patch subject: [PATCH 2/2] virtio-net: reduce the CPU consumption of dim worker
config: i386-randconfig-141-20240322 (https://download.01.org/0day-ci/archive/20240322/202403221133.J66oueZh-lkp@intel.com/config)
compiler: gcc-12 (Debian 12.2.0-14) 12.2.0

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
| Closes: https://lore.kernel.org/r/202403221133.J66oueZh-lkp@intel.com/

New smatch warnings:
drivers/net/virtio_net.c:5031 virtnet_probe() warn: missing error code 'err'

vim +/err +5031 drivers/net/virtio_net.c

986a4f4d452dec Jason Wang            2012-12-07  5006  	/* Allocate/initialize the rx/tx queues, and invoke find_vqs */
3f9c10b0d478a3 Amit Shah             2011-12-22  5007  	err = init_vqs(vi);
d2a7ddda9ffb1c Michael S. Tsirkin    2009-06-12  5008  	if (err)
d7dfc5cf56b0e3 Toshiaki Makita       2018-01-17  5009  		goto free;
296f96fcfc160e Rusty Russell         2007-10-22  5010  
3014a0d54820d2 Heng Qi               2023-10-08  5011  	if (virtio_has_feature(vi->vdev, VIRTIO_NET_F_NOTF_COAL)) {
3014a0d54820d2 Heng Qi               2023-10-08  5012  		vi->intr_coal_rx.max_usecs = 0;
3014a0d54820d2 Heng Qi               2023-10-08  5013  		vi->intr_coal_tx.max_usecs = 0;
3014a0d54820d2 Heng Qi               2023-10-08  5014  		vi->intr_coal_rx.max_packets = 0;
3014a0d54820d2 Heng Qi               2023-10-08  5015  
3014a0d54820d2 Heng Qi               2023-10-08  5016  		/* Keep the default values of the coalescing parameters
3014a0d54820d2 Heng Qi               2023-10-08  5017  		 * aligned with the default napi_tx state.
3014a0d54820d2 Heng Qi               2023-10-08  5018  		 */
3014a0d54820d2 Heng Qi               2023-10-08  5019  		if (vi->sq[0].napi.weight)
3014a0d54820d2 Heng Qi               2023-10-08  5020  			vi->intr_coal_tx.max_packets = 1;
3014a0d54820d2 Heng Qi               2023-10-08  5021  		else
3014a0d54820d2 Heng Qi               2023-10-08  5022  			vi->intr_coal_tx.max_packets = 0;
3014a0d54820d2 Heng Qi               2023-10-08  5023  	}
3014a0d54820d2 Heng Qi               2023-10-08  5024  
d8cd72f1622753 Heng Qi               2024-03-21  5025  	INIT_LIST_HEAD(&vi->coal_list);
3014a0d54820d2 Heng Qi               2023-10-08  5026  	if (virtio_has_feature(vi->vdev, VIRTIO_NET_F_VQ_NOTF_COAL)) {
d8cd72f1622753 Heng Qi               2024-03-21  5027  		vi->cvq_cmd_nums = 0;
d8cd72f1622753 Heng Qi               2024-03-21  5028  		vi->dim_loop_index = 0;
d8cd72f1622753 Heng Qi               2024-03-21  5029  
d8cd72f1622753 Heng Qi               2024-03-21  5030  		if (virtnet_init_coal_list(vi))
d8cd72f1622753 Heng Qi               2024-03-21 @5031  			goto free;

This should probably set the error code.

	err = virtnet_init_coal_list(vi);
	if (err)
		goto free;

d8cd72f1622753 Heng Qi               2024-03-21  5032  
3014a0d54820d2 Heng Qi               2023-10-08  5033  		/* The reason is the same as VIRTIO_NET_F_NOTF_COAL. */
d8cd72f1622753 Heng Qi               2024-03-21  5034  		for (i = 0; i < vi->max_queue_pairs; i++) {
d8cd72f1622753 Heng Qi               2024-03-21  5035  			vi->rq[i].packets_in_napi = 0;
3014a0d54820d2 Heng Qi               2023-10-08  5036  			if (vi->sq[i].napi.weight)
3014a0d54820d2 Heng Qi               2023-10-08  5037  				vi->sq[i].intr_coal.max_packets = 1;
3014a0d54820d2 Heng Qi               2023-10-08  5038  		}
d8cd72f1622753 Heng Qi               2024-03-21  5039  	}

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 1/2] virtio-net: fix possible dim status unrecoverable
  2024-03-22  5:17   ` Jason Wang
@ 2024-03-25  2:11     ` Heng Qi
  2024-03-25  6:29       ` Jason Wang
  0 siblings, 1 reply; 24+ messages in thread
From: Heng Qi @ 2024-03-25  2:11 UTC (permalink / raw)
  To: Jason Wang
  Cc: netdev, virtualization, Michael S. Tsirkin, Jakub Kicinski,
	Paolo Abeni, Eric Dumazet, David S. Miller, Xuan Zhuo



在 2024/3/22 下午1:17, Jason Wang 写道:
> On Thu, Mar 21, 2024 at 7:46 PM Heng Qi <hengqi@linux.alibaba.com> wrote:
>> When the dim worker is scheduled, if it fails to acquire the lock,
>> dim may not be able to return to the working state later.
>>
>> For example, the following single queue scenario:
>>    1. The dim worker of rxq0 is scheduled, and the dim status is
>>       changed to DIM_APPLY_NEW_PROFILE;
>>    2. The ethtool command is holding rtnl lock;
>>    3. Since the rtnl lock is already held, virtnet_rx_dim_work fails
>>       to acquire the lock and exits;
>>
>> Then, even if net_dim is invoked again, it cannot work because the
>> state is not restored to DIM_START_MEASURE.
>>
>> Fixes: 6208799553a8 ("virtio-net: support rx netdim")
>> Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
>> ---
>>   drivers/net/virtio_net.c | 4 +++-
>>   1 file changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
>> index c22d111..0ebe322 100644
>> --- a/drivers/net/virtio_net.c
>> +++ b/drivers/net/virtio_net.c
>> @@ -3563,8 +3563,10 @@ static void virtnet_rx_dim_work(struct work_struct *work)
>>          struct dim_cq_moder update_moder;
>>          int i, qnum, err;
>>
>> -       if (!rtnl_trylock())
>> +       if (!rtnl_trylock()) {
>> +               schedule_work(&dim->work);
>>                  return;
>> +       }
> Patch looks fine but I wonder if a delayed schedule is better.

The work in net_dim() core layer uses non-delayed-work, and the two 
cannot be mixed.

Thanks,
Heng

>
> Thanks
>
>>          /* Each rxq's work is queued by "net_dim()->schedule_work()"
>>           * in response to NAPI traffic changes. Note that dim->profile_ix
>> --
>> 1.8.3.1
>>


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/2] virtio-net: reduce the CPU consumption of dim worker
  2024-03-22  5:19   ` Jason Wang
@ 2024-03-25  2:21     ` Heng Qi
  2024-03-25  5:57       ` Jason Wang
  0 siblings, 1 reply; 24+ messages in thread
From: Heng Qi @ 2024-03-25  2:21 UTC (permalink / raw)
  To: Jason Wang
  Cc: netdev, virtualization, Michael S. Tsirkin, Jakub Kicinski,
	Paolo Abeni, Eric Dumazet, David S. Miller, Xuan Zhuo



在 2024/3/22 下午1:19, Jason Wang 写道:
> On Thu, Mar 21, 2024 at 7:46 PM Heng Qi <hengqi@linux.alibaba.com> wrote:
>> Currently, ctrlq processes commands in a synchronous manner,
>> which increases the delay of dim commands when configuring
>> multi-queue VMs, which in turn causes the CPU utilization to
>> increase and interferes with the performance of dim.
>>
>> Therefore we asynchronously process ctlq's dim commands.
>>
>> Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
> I may miss some previous discussions.
>
> But at least the changelog needs to explain why you don't use interrupt.

Will add, but reply here first.

When upgrading the driver's ctrlq to use interrupt, problems may occur 
with some existing devices.
For example, when existing devices are replaced with new drivers, they 
may not work.
Or, if the guest OS supported by the new device is replaced by an old 
downstream OS product, it will not be usable.

Although, ctrlq has the same capabilities as IOq in the virtio spec, 
this does have historical baggage.

Thanks,
Heng

>
> Thanks


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 0/2] virtio-net: a fix and some updates for virtio dim
  2024-03-21 12:25 ` [PATCH 0/2] virtio-net: a fix and some updates for virtio dim Jiri Pirko
@ 2024-03-25  2:23   ` Heng Qi
  0 siblings, 0 replies; 24+ messages in thread
From: Heng Qi @ 2024-03-25  2:23 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: netdev, virtualization, Jason Wang, Michael S. Tsirkin,
	Jakub Kicinski, Paolo Abeni, Eric Dumazet, David S. Miller,
	Xuan Zhuo



On 2024/3/21 8:25 PM, Jiri Pirko wrote:
> Thu, Mar 21, 2024 at 12:45:55PM CET, hengqi@linux.alibaba.com wrote:
>> Patch 1 fixes an existing bug. Belongs to the net branch.
> Send separately with "net" indication in the patch brackets:
> https://www.kernel.org/doc/html/next/process/maintainer-netdev.html?highlight=network#tl-dr
>
>
>> Patch 2 attempts to modify the sending of dim cmds to an asynchronous way.
> Not a bugfix, then send separately with "net-next" indication. Net-next
> is currently closed, send it next week.

Will do.

>
>
>> Heng Qi (2):
>>   virtio-net: fix possible dim status unrecoverable
>>   virtio-net: reduce the CPU consumption of dim worker
> The name of the driver is "virtio_net".

'git log --grep="^virtio-net:" --oneline' shows a large number of
commits using the "virtio-net" name.

>
>
>
> pw-bot: cr
>
>
>> drivers/net/virtio_net.c | 273 ++++++++++++++++++++++++++++++++++++++++++-----
>> 1 file changed, 246 insertions(+), 27 deletions(-)
>>
>> -- 
>> 1.8.3.1
>>
>>



* Re: [PATCH 2/2] virtio-net: reduce the CPU consumption of dim worker
  2024-03-25  2:21     ` Heng Qi
@ 2024-03-25  5:57       ` Jason Wang
  2024-03-25  7:17         ` Heng Qi
  0 siblings, 1 reply; 24+ messages in thread
From: Jason Wang @ 2024-03-25  5:57 UTC (permalink / raw)
  To: Heng Qi
  Cc: netdev, virtualization, Michael S. Tsirkin, Jakub Kicinski,
	Paolo Abeni, Eric Dumazet, David S. Miller, Xuan Zhuo

On Mon, Mar 25, 2024 at 10:21 AM Heng Qi <hengqi@linux.alibaba.com> wrote:
>
>
>
> On 2024/3/22 1:19 PM, Jason Wang wrote:
> > On Thu, Mar 21, 2024 at 7:46 PM Heng Qi <hengqi@linux.alibaba.com> wrote:
> >> Currently, ctrlq processes commands in a synchronous manner,
> >> which increases the delay of dim commands when configuring
> >> multi-queue VMs, which in turn causes the CPU utilization to
> >> increase and interferes with the performance of dim.
> >>
> >> Therefore we asynchronously process ctlq's dim commands.
> >>
> >> Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
> > I may miss some previous discussions.
> >
> > But at least the changelog needs to explain why you don't use interrupt.
>
> Will add, but reply here first.
>
> When upgrading the driver's ctrlq to use interrupt, problems may occur
> with some existing devices.
> For example, when existing devices are replaced with new drivers, they
> may not work.
> Or, if the guest OS supported by the new device is replaced by an old
> downstream OS product, it will not be usable.
>
> Although, ctrlq has the same capabilities as IOq in the virtio spec,
> this does have historical baggage.

I don't think the upstream Linux drivers need to work around buggy
devices. Nor is it a good excuse to block configuring interrupts.

And I remember you told us your device doesn't have such an issue.

Thanks

>
> Thanks,
> Heng
>
> >
> > Thanks
>



* Re: [PATCH 1/2] virtio-net: fix possible dim status unrecoverable
  2024-03-25  2:11     ` Heng Qi
@ 2024-03-25  6:29       ` Jason Wang
  2024-03-25  6:57         ` Heng Qi
  0 siblings, 1 reply; 24+ messages in thread
From: Jason Wang @ 2024-03-25  6:29 UTC (permalink / raw)
  To: Heng Qi
  Cc: netdev, virtualization, Michael S. Tsirkin, Jakub Kicinski,
	Paolo Abeni, Eric Dumazet, David S. Miller, Xuan Zhuo

On Mon, Mar 25, 2024 at 10:11 AM Heng Qi <hengqi@linux.alibaba.com> wrote:
>
>
>
> On 2024/3/22 1:17 PM, Jason Wang wrote:
> > On Thu, Mar 21, 2024 at 7:46 PM Heng Qi <hengqi@linux.alibaba.com> wrote:
> >> When the dim worker is scheduled, if it fails to acquire the lock,
> >> dim may not be able to return to the working state later.
> >>
> >> For example, the following single queue scenario:
> >>    1. The dim worker of rxq0 is scheduled, and the dim status is
> >>       changed to DIM_APPLY_NEW_PROFILE;
> >>    2. The ethtool command is holding rtnl lock;
> >>    3. Since the rtnl lock is already held, virtnet_rx_dim_work fails
> >>       to acquire the lock and exits;
> >>
> >> Then, even if net_dim is invoked again, it cannot work because the
> >> state is not restored to DIM_START_MEASURE.
> >>
> >> Fixes: 6208799553a8 ("virtio-net: support rx netdim")
> >> Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
> >> ---
> >>   drivers/net/virtio_net.c | 4 +++-
> >>   1 file changed, 3 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> >> index c22d111..0ebe322 100644
> >> --- a/drivers/net/virtio_net.c
> >> +++ b/drivers/net/virtio_net.c
> >> @@ -3563,8 +3563,10 @@ static void virtnet_rx_dim_work(struct work_struct *work)
> >>          struct dim_cq_moder update_moder;
> >>          int i, qnum, err;
> >>
> >> -       if (!rtnl_trylock())
> >> +       if (!rtnl_trylock()) {
> >> +               schedule_work(&dim->work);
> >>                  return;
> >> +       }
> > Patch looks fine but I wonder if a delayed schedule is better.
>
> The work in net_dim() core layer uses non-delayed-work, and the two
> cannot be mixed.

Well, I think we need first to figure out if delayed work is better here.

Switching to use delayed work for dim seems not hard anyhow.

Thanks

>
> Thanks,
> Heng
>
> >
> > Thanks
> >
> >>          /* Each rxq's work is queued by "net_dim()->schedule_work()"
> >>           * in response to NAPI traffic changes. Note that dim->profile_ix
> >> --
> >> 1.8.3.1
> >>
>



* Re: [PATCH 1/2] virtio-net: fix possible dim status unrecoverable
  2024-03-25  6:29       ` Jason Wang
@ 2024-03-25  6:57         ` Heng Qi
  2024-03-25  7:06           ` Jason Wang
  0 siblings, 1 reply; 24+ messages in thread
From: Heng Qi @ 2024-03-25  6:57 UTC (permalink / raw)
  To: Jason Wang
  Cc: netdev, virtualization, Michael S. Tsirkin, Jakub Kicinski,
	Paolo Abeni, Eric Dumazet, David S. Miller, Xuan Zhuo



On 2024/3/25 2:29 PM, Jason Wang wrote:
> On Mon, Mar 25, 2024 at 10:11 AM Heng Qi <hengqi@linux.alibaba.com> wrote:
>>
>>
>> On 2024/3/22 1:17 PM, Jason Wang wrote:
>>> On Thu, Mar 21, 2024 at 7:46 PM Heng Qi <hengqi@linux.alibaba.com> wrote:
>>>> When the dim worker is scheduled, if it fails to acquire the lock,
>>>> dim may not be able to return to the working state later.
>>>>
>>>> For example, the following single queue scenario:
>>>>     1. The dim worker of rxq0 is scheduled, and the dim status is
>>>>        changed to DIM_APPLY_NEW_PROFILE;
>>>>     2. The ethtool command is holding rtnl lock;
>>>>     3. Since the rtnl lock is already held, virtnet_rx_dim_work fails
>>>>        to acquire the lock and exits;
>>>>
>>>> Then, even if net_dim is invoked again, it cannot work because the
>>>> state is not restored to DIM_START_MEASURE.
>>>>
>>>> Fixes: 6208799553a8 ("virtio-net: support rx netdim")
>>>> Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
>>>> ---
>>>>    drivers/net/virtio_net.c | 4 +++-
>>>>    1 file changed, 3 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
>>>> index c22d111..0ebe322 100644
>>>> --- a/drivers/net/virtio_net.c
>>>> +++ b/drivers/net/virtio_net.c
>>>> @@ -3563,8 +3563,10 @@ static void virtnet_rx_dim_work(struct work_struct *work)
>>>>           struct dim_cq_moder update_moder;
>>>>           int i, qnum, err;
>>>>
>>>> -       if (!rtnl_trylock())
>>>> +       if (!rtnl_trylock()) {
>>>> +               schedule_work(&dim->work);
>>>>                   return;
>>>> +       }
>>> Patch looks fine but I wonder if a delayed schedule is better.
>> The work in net_dim() core layer uses non-delayed-work, and the two
>> cannot be mixed.
> Well, I think we need first to figure out if delayed work is better here.

I tested a VM with 16 NICs and 128 queues per NIC (2k queues in total).
With dim enabled on all queues there are many opportunities for rtnl
lock contention, yet this patch introduces no visible hotspots and dim
performance remains stable. So there doesn't seem to be a strong
motivation right now.

Thanks,
Heng

>
> Switching to use delayed work for dim seems not hard anyhow.
>
> Thanks
>
>> Thanks,
>> Heng
>>
>>> Thanks
>>>
>>>>           /* Each rxq's work is queued by "net_dim()->schedule_work()"
>>>>            * in response to NAPI traffic changes. Note that dim->profile_ix
>>>> --
>>>> 1.8.3.1
>>>>



* Re: [PATCH 1/2] virtio-net: fix possible dim status unrecoverable
  2024-03-25  6:57         ` Heng Qi
@ 2024-03-25  7:06           ` Jason Wang
  0 siblings, 0 replies; 24+ messages in thread
From: Jason Wang @ 2024-03-25  7:06 UTC (permalink / raw)
  To: Heng Qi
  Cc: netdev, virtualization, Michael S. Tsirkin, Jakub Kicinski,
	Paolo Abeni, Eric Dumazet, David S. Miller, Xuan Zhuo

On Mon, Mar 25, 2024 at 2:58 PM Heng Qi <hengqi@linux.alibaba.com> wrote:
>
>
>
> On 2024/3/25 2:29 PM, Jason Wang wrote:
> > On Mon, Mar 25, 2024 at 10:11 AM Heng Qi <hengqi@linux.alibaba.com> wrote:
> >>
> >>
> >> On 2024/3/22 1:17 PM, Jason Wang wrote:
> >>> On Thu, Mar 21, 2024 at 7:46 PM Heng Qi <hengqi@linux.alibaba.com> wrote:
> >>>> When the dim worker is scheduled, if it fails to acquire the lock,
> >>>> dim may not be able to return to the working state later.
> >>>>
> >>>> For example, the following single queue scenario:
> >>>>     1. The dim worker of rxq0 is scheduled, and the dim status is
> >>>>        changed to DIM_APPLY_NEW_PROFILE;
> >>>>     2. The ethtool command is holding rtnl lock;
> >>>>     3. Since the rtnl lock is already held, virtnet_rx_dim_work fails
> >>>>        to acquire the lock and exits;
> >>>>
> >>>> Then, even if net_dim is invoked again, it cannot work because the
> >>>> state is not restored to DIM_START_MEASURE.
> >>>>
> >>>> Fixes: 6208799553a8 ("virtio-net: support rx netdim")
> >>>> Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
> >>>> ---
> >>>>    drivers/net/virtio_net.c | 4 +++-
> >>>>    1 file changed, 3 insertions(+), 1 deletion(-)
> >>>>
> >>>> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> >>>> index c22d111..0ebe322 100644
> >>>> --- a/drivers/net/virtio_net.c
> >>>> +++ b/drivers/net/virtio_net.c
> >>>> @@ -3563,8 +3563,10 @@ static void virtnet_rx_dim_work(struct work_struct *work)
> >>>>           struct dim_cq_moder update_moder;
> >>>>           int i, qnum, err;
> >>>>
> >>>> -       if (!rtnl_trylock())
> >>>> +       if (!rtnl_trylock()) {
> >>>> +               schedule_work(&dim->work);
> >>>>                   return;
> >>>> +       }
> >>> Patch looks fine but I wonder if a delayed schedule is better.
> >> The work in net_dim() core layer uses non-delayed-work, and the two
> >> cannot be mixed.
> > Well, I think we need first to figure out if delayed work is better here.
>
> I tested a VM with 16 NICs, 128 queues per NIC (2kq total). With dim
> enabled on all queues,
> there are many opportunities for contention for rtnl lock, and this
> patch introduces no visible hotspots.
> The dim performance is also stable. So I think there doesn't seem to be
> a strong motivation right now.

That's fine, let's add them to the changelog.

Acked-by: Jason Wang <jasowang@redhat.com>

Thanks

>
> Thanks,
> Heng
>
> >
> > Switching to use delayed work for dim seems not hard anyhow.
> >
> > Thanks
> >
> >> Thanks,
> >> Heng
> >>
> >>> Thanks
> >>>
> >>>>           /* Each rxq's work is queued by "net_dim()->schedule_work()"
> >>>>            * in response to NAPI traffic changes. Note that dim->profile_ix
> >>>> --
> >>>> 1.8.3.1
> >>>>
>



* Re: [PATCH 2/2] virtio-net: reduce the CPU consumption of dim worker
  2024-03-25  5:57       ` Jason Wang
@ 2024-03-25  7:17         ` Heng Qi
  2024-03-25  7:56           ` Jason Wang
  0 siblings, 1 reply; 24+ messages in thread
From: Heng Qi @ 2024-03-25  7:17 UTC (permalink / raw)
  To: Jason Wang
  Cc: netdev, virtualization, Michael S. Tsirkin, Jakub Kicinski,
	Paolo Abeni, Eric Dumazet, David S. Miller, Xuan Zhuo



On 2024/3/25 1:57 PM, Jason Wang wrote:
> On Mon, Mar 25, 2024 at 10:21 AM Heng Qi <hengqi@linux.alibaba.com> wrote:
>>
>>
>> On 2024/3/22 1:19 PM, Jason Wang wrote:
>>> On Thu, Mar 21, 2024 at 7:46 PM Heng Qi <hengqi@linux.alibaba.com> wrote:
>>>> Currently, ctrlq processes commands in a synchronous manner,
>>>> which increases the delay of dim commands when configuring
>>>> multi-queue VMs, which in turn causes the CPU utilization to
>>>> increase and interferes with the performance of dim.
>>>>
>>>> Therefore we asynchronously process ctlq's dim commands.
>>>>
>>>> Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
>>> I may miss some previous discussions.
>>>
>>> But at least the changelog needs to explain why you don't use interrupt.
>> Will add, but reply here first.
>>
>> When upgrading the driver's ctrlq to use interrupt, problems may occur
>> with some existing devices.
>> For example, when existing devices are replaced with new drivers, they
>> may not work.
>> Or, if the guest OS supported by the new device is replaced by an old
>> downstream OS product, it will not be usable.
>>
>> Although, ctrlq has the same capabilities as IOq in the virtio spec,
>> this does have historical baggage.
> I don't think the upstream Linux drivers need to workaround buggy
> devices. Or it is a good excuse to block configure interrupts.

Of course I agree. Our DPU devices support ctrlq interrupts natively,
as long as the guest OS enables the irq for ctrlq.

If other products have no problem with this, I would prefer to use the
irq to solve this problem, which is the most fundamental solution.

>
> And I remember you told us your device doesn't have such an issue.

YES.

Thanks,
Heng

>
> Thanks
>
>> Thanks,
>> Heng
>>
>>> Thanks



* Re: [PATCH 2/2] virtio-net: reduce the CPU consumption of dim worker
  2024-03-25  7:17         ` Heng Qi
@ 2024-03-25  7:56           ` Jason Wang
  2024-03-25  8:22             ` Heng Qi
  0 siblings, 1 reply; 24+ messages in thread
From: Jason Wang @ 2024-03-25  7:56 UTC (permalink / raw)
  To: Heng Qi
  Cc: netdev, virtualization, Michael S. Tsirkin, Jakub Kicinski,
	Paolo Abeni, Eric Dumazet, David S. Miller, Xuan Zhuo

On Mon, Mar 25, 2024 at 3:18 PM Heng Qi <hengqi@linux.alibaba.com> wrote:
>
>
>
> On 2024/3/25 1:57 PM, Jason Wang wrote:
> > On Mon, Mar 25, 2024 at 10:21 AM Heng Qi <hengqi@linux.alibaba.com> wrote:
> >>
> >>
> >> On 2024/3/22 1:19 PM, Jason Wang wrote:
> >>> On Thu, Mar 21, 2024 at 7:46 PM Heng Qi <hengqi@linux.alibaba.com> wrote:
> >>>> Currently, ctrlq processes commands in a synchronous manner,
> >>>> which increases the delay of dim commands when configuring
> >>>> multi-queue VMs, which in turn causes the CPU utilization to
> >>>> increase and interferes with the performance of dim.
> >>>>
> >>>> Therefore we asynchronously process ctlq's dim commands.
> >>>>
> >>>> Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
> >>> I may miss some previous discussions.
> >>>
> >>> But at least the changelog needs to explain why you don't use interrupt.
> >> Will add, but reply here first.
> >>
> >> When upgrading the driver's ctrlq to use interrupt, problems may occur
> >> with some existing devices.
> >> For example, when existing devices are replaced with new drivers, they
> >> may not work.
> >> Or, if the guest OS supported by the new device is replaced by an old
> >> downstream OS product, it will not be usable.
> >>
> >> Although, ctrlq has the same capabilities as IOq in the virtio spec,
> >> this does have historical baggage.
> > I don't think the upstream Linux drivers need to workaround buggy
> > devices. Or it is a good excuse to block configure interrupts.
>
> Of course I agree. Our DPU devices support ctrlq irq natively, as long
> as the guest os opens irq to ctrlq.
>
> If other products have no problem with this, I would prefer to use irq
> to solve this problem, which is the most essential solution.

Let's do that.

Thanks

>
> >
> > And I remember you told us your device doesn't have such an issue.
>
> YES.
>
> Thanks,
> Heng
>
> >
> > Thanks
> >
> >> Thanks,
> >> Heng
> >>
> >>> Thanks
>



* Re: [PATCH 2/2] virtio-net: reduce the CPU consumption of dim worker
  2024-03-25  7:56           ` Jason Wang
@ 2024-03-25  8:22             ` Heng Qi
  2024-03-25  8:42               ` Jason Wang
  0 siblings, 1 reply; 24+ messages in thread
From: Heng Qi @ 2024-03-25  8:22 UTC (permalink / raw)
  To: Jason Wang
  Cc: netdev, virtualization, Michael S. Tsirkin, Jakub Kicinski,
	Paolo Abeni, Eric Dumazet, David S. Miller, Xuan Zhuo



On 2024/3/25 3:56 PM, Jason Wang wrote:
> On Mon, Mar 25, 2024 at 3:18 PM Heng Qi <hengqi@linux.alibaba.com> wrote:
>>
>>
>> On 2024/3/25 1:57 PM, Jason Wang wrote:
>>> On Mon, Mar 25, 2024 at 10:21 AM Heng Qi <hengqi@linux.alibaba.com> wrote:
>>>>
>>>> On 2024/3/22 1:19 PM, Jason Wang wrote:
>>>>> On Thu, Mar 21, 2024 at 7:46 PM Heng Qi <hengqi@linux.alibaba.com> wrote:
>>>>>> Currently, ctrlq processes commands in a synchronous manner,
>>>>>> which increases the delay of dim commands when configuring
>>>>>> multi-queue VMs, which in turn causes the CPU utilization to
>>>>>> increase and interferes with the performance of dim.
>>>>>>
>>>>>> Therefore we asynchronously process ctlq's dim commands.
>>>>>>
>>>>>> Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
>>>>> I may miss some previous discussions.
>>>>>
>>>>> But at least the changelog needs to explain why you don't use interrupt.
>>>> Will add, but reply here first.
>>>>
>>>> When upgrading the driver's ctrlq to use interrupt, problems may occur
>>>> with some existing devices.
>>>> For example, when existing devices are replaced with new drivers, they
>>>> may not work.
>>>> Or, if the guest OS supported by the new device is replaced by an old
>>>> downstream OS product, it will not be usable.
>>>>
>>>> Although, ctrlq has the same capabilities as IOq in the virtio spec,
>>>> this does have historical baggage.
>>> I don't think the upstream Linux drivers need to workaround buggy
>>> devices. Or it is a good excuse to block configure interrupts.
>> Of course I agree. Our DPU devices support ctrlq irq natively, as long
>> as the guest os opens irq to ctrlq.
>>
>> If other products have no problem with this, I would prefer to use irq
>> to solve this problem, which is the most essential solution.
> Let's do that.

Ok, will do.

Do you have a link to the patch where you previously switched the
control queue to interrupt notifications?
I think a new patch could be built on top of it, but I can't seem to find it.

Thanks,
Heng

>
> Thanks
>
>>> And I remember you told us your device doesn't have such an issue.
>> YES.
>>
>> Thanks,
>> Heng
>>
>>> Thanks
>>>
>>>> Thanks,
>>>> Heng
>>>>
>>>>> Thanks



* Re: [PATCH 2/2] virtio-net: reduce the CPU consumption of dim worker
  2024-03-25  8:22             ` Heng Qi
@ 2024-03-25  8:42               ` Jason Wang
  2024-03-26  2:46                 ` Heng Qi
  0 siblings, 1 reply; 24+ messages in thread
From: Jason Wang @ 2024-03-25  8:42 UTC (permalink / raw)
  To: Heng Qi
  Cc: netdev, virtualization, Michael S. Tsirkin, Jakub Kicinski,
	Paolo Abeni, Eric Dumazet, David S. Miller, Xuan Zhuo

On Mon, Mar 25, 2024 at 4:22 PM Heng Qi <hengqi@linux.alibaba.com> wrote:
>
>
>
> On 2024/3/25 3:56 PM, Jason Wang wrote:
> > On Mon, Mar 25, 2024 at 3:18 PM Heng Qi <hengqi@linux.alibaba.com> wrote:
> >>
> >>
> >> On 2024/3/25 1:57 PM, Jason Wang wrote:
> >>> On Mon, Mar 25, 2024 at 10:21 AM Heng Qi <hengqi@linux.alibaba.com> wrote:
> >>>>
> >>>> On 2024/3/22 1:19 PM, Jason Wang wrote:
> >>>>> On Thu, Mar 21, 2024 at 7:46 PM Heng Qi <hengqi@linux.alibaba.com> wrote:
> >>>>>> Currently, ctrlq processes commands in a synchronous manner,
> >>>>>> which increases the delay of dim commands when configuring
> >>>>>> multi-queue VMs, which in turn causes the CPU utilization to
> >>>>>> increase and interferes with the performance of dim.
> >>>>>>
> >>>>>> Therefore we asynchronously process ctlq's dim commands.
> >>>>>>
> >>>>>> Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
> >>>>> I may miss some previous discussions.
> >>>>>
> >>>>> But at least the changelog needs to explain why you don't use interrupt.
> >>>> Will add, but reply here first.
> >>>>
> >>>> When upgrading the driver's ctrlq to use interrupt, problems may occur
> >>>> with some existing devices.
> >>>> For example, when existing devices are replaced with new drivers, they
> >>>> may not work.
> >>>> Or, if the guest OS supported by the new device is replaced by an old
> >>>> downstream OS product, it will not be usable.
> >>>>
> >>>> Although, ctrlq has the same capabilities as IOq in the virtio spec,
> >>>> this does have historical baggage.
> >>> I don't think the upstream Linux drivers need to workaround buggy
> >>> devices. Or it is a good excuse to block configure interrupts.
> >> Of course I agree. Our DPU devices support ctrlq irq natively, as long
> >> as the guest os opens irq to ctrlq.
> >>
> >> If other products have no problem with this, I would prefer to use irq
> >> to solve this problem, which is the most essential solution.
> > Let's do that.
>
> Ok, will do.
>
> Do you have the link to the patch where you previously modified the
> control queue for interrupt notifications.
> I think a new patch could be made on top of it, but I can't seem to find it.

Something like this?

https://lore.kernel.org/lkml/6026e801-6fda-fee9-a69b-d06a80368621@redhat.com/t/

Note that

1) some patches have been merged already
2) we probably need to drop the timeout logic as it's another topic
3) we need to address the other comments

Thanks


>
> Thanks,
> Heng
>
> >
> > Thanks
> >
> >>> And I remember you told us your device doesn't have such an issue.
> >> YES.
> >>
> >> Thanks,
> >> Heng
> >>
> >>> Thanks
> >>>
> >>>> Thanks,
> >>>> Heng
> >>>>
> >>>>> Thanks
>



* Re: [PATCH 2/2] virtio-net: reduce the CPU consumption of dim worker
  2024-03-25  8:42               ` Jason Wang
@ 2024-03-26  2:46                 ` Heng Qi
  2024-03-26  4:08                   ` Jason Wang
  0 siblings, 1 reply; 24+ messages in thread
From: Heng Qi @ 2024-03-26  2:46 UTC (permalink / raw)
  To: Jason Wang
  Cc: netdev, virtualization, Michael S. Tsirkin, Jakub Kicinski,
	Paolo Abeni, Eric Dumazet, David S. Miller, Xuan Zhuo



On 2024/3/25 4:42 PM, Jason Wang wrote:
> On Mon, Mar 25, 2024 at 4:22 PM Heng Qi <hengqi@linux.alibaba.com> wrote:
>>
>>
>> On 2024/3/25 3:56 PM, Jason Wang wrote:
>>> On Mon, Mar 25, 2024 at 3:18 PM Heng Qi <hengqi@linux.alibaba.com> wrote:
>>>>
>>>> On 2024/3/25 1:57 PM, Jason Wang wrote:
>>>>> On Mon, Mar 25, 2024 at 10:21 AM Heng Qi <hengqi@linux.alibaba.com> wrote:
>>>>>> On 2024/3/22 1:19 PM, Jason Wang wrote:
>>>>>>> On Thu, Mar 21, 2024 at 7:46 PM Heng Qi <hengqi@linux.alibaba.com> wrote:
>>>>>>>> Currently, ctrlq processes commands in a synchronous manner,
>>>>>>>> which increases the delay of dim commands when configuring
>>>>>>>> multi-queue VMs, which in turn causes the CPU utilization to
>>>>>>>> increase and interferes with the performance of dim.
>>>>>>>>
>>>>>>>> Therefore we asynchronously process ctlq's dim commands.
>>>>>>>>
>>>>>>>> Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
>>>>>>> I may miss some previous discussions.
>>>>>>>
>>>>>>> But at least the changelog needs to explain why you don't use interrupt.
>>>>>> Will add, but reply here first.
>>>>>>
>>>>>> When upgrading the driver's ctrlq to use interrupt, problems may occur
>>>>>> with some existing devices.
>>>>>> For example, when existing devices are replaced with new drivers, they
>>>>>> may not work.
>>>>>> Or, if the guest OS supported by the new device is replaced by an old
>>>>>> downstream OS product, it will not be usable.
>>>>>>
>>>>>> Although, ctrlq has the same capabilities as IOq in the virtio spec,
>>>>>> this does have historical baggage.
>>>>> I don't think the upstream Linux drivers need to workaround buggy
>>>>> devices. Or it is a good excuse to block configure interrupts.
>>>> Of course I agree. Our DPU devices support ctrlq irq natively, as long
>>>> as the guest os opens irq to ctrlq.
>>>>
>>>> If other products have no problem with this, I would prefer to use irq
>>>> to solve this problem, which is the most essential solution.
>>> Let's do that.
>> Ok, will do.
>>
>> Do you have the link to the patch where you previously modified the
>> control queue for interrupt notifications.
>> I think a new patch could be made on top of it, but I can't seem to find it.
> Something like this?

YES. Thanks Jason.

>
> https://lore.kernel.org/lkml/6026e801-6fda-fee9-a69b-d06a80368621@redhat.com/t/
>
> Note that
>
> 1) some patch has been merged
> 2) we probably need to drop the timeout logic as it's another topic
> 3) need to address other comments

I did a quick read of the previous 5 versions of your patch set:
[1] 
https://lore.kernel.org/lkml/6026e801-6fda-fee9-a69b-d06a80368621@redhat.com/t/
[2] https://lore.kernel.org/all/20221226074908.8154-1-jasowang@redhat.com/
[3] https://lore.kernel.org/all/20230413064027.13267-1-jasowang@redhat.com/
[4] https://lore.kernel.org/all/20230524081842.3060-1-jasowang@redhat.com/
[5] https://lore.kernel.org/all/20230720083839.481487-1-jasowang@redhat.com/

Regarding adding an interrupt to ctrlq, there are a few points on which
there is no agreement, which I summarize below.

1. Require additional interrupt vector resource
https://lore.kernel.org/all/20230516165043-mutt-send-email-mst@kernel.org/
2. Adding the interrupt for ctrlq may break some devices
https://lore.kernel.org/all/f9e75ce5-e6df-d1be-201b-7d0f18c1b6e7@redhat.com/
3. RTNL breaks surprise removal
https://lore.kernel.org/all/20230720170001-mutt-send-email-mst@kernel.org/

Regarding the above, there seems to be no conclusion yet.
If these problems still exist, I think this patch is good enough and we 
can merge it first.

For the third point, Daniel seems to be solving it now [6], but a
spinlock is used, which I think conflicts with adding interrupts to
ctrlq.

[6] https://lore.kernel.org/all/20240325214912.323749-1-danielj@nvidia.com/


Thanks,
Heng

>
> Thanks
>
>
>> Thanks,
>> Heng
>>
>>> Thanks
>>>
>>>>> And I remember you told us your device doesn't have such an issue.
>>>> YES.
>>>>
>>>> Thanks,
>>>> Heng
>>>>
>>>>> Thanks
>>>>>
>>>>>> Thanks,
>>>>>> Heng
>>>>>>
>>>>>>> Thanks



* Re: [PATCH 2/2] virtio-net: reduce the CPU consumption of dim worker
  2024-03-26  2:46                 ` Heng Qi
@ 2024-03-26  4:08                   ` Jason Wang
  2024-03-26  5:57                     ` Heng Qi
  0 siblings, 1 reply; 24+ messages in thread
From: Jason Wang @ 2024-03-26  4:08 UTC (permalink / raw)
  To: Heng Qi
  Cc: netdev, virtualization, Michael S. Tsirkin, Jakub Kicinski,
	Paolo Abeni, Eric Dumazet, David S. Miller, Xuan Zhuo

On Tue, Mar 26, 2024 at 10:46 AM Heng Qi <hengqi@linux.alibaba.com> wrote:
>
>
>
> On 2024/3/25 4:42 PM, Jason Wang wrote:
> > On Mon, Mar 25, 2024 at 4:22 PM Heng Qi <hengqi@linux.alibaba.com> wrote:
> >>
> >>
> >> On 2024/3/25 3:56 PM, Jason Wang wrote:
> >>> On Mon, Mar 25, 2024 at 3:18 PM Heng Qi <hengqi@linux.alibaba.com> wrote:
> >>>>
> >>>> On 2024/3/25 1:57 PM, Jason Wang wrote:
> >>>>> On Mon, Mar 25, 2024 at 10:21 AM Heng Qi <hengqi@linux.alibaba.com> wrote:
> >>>>>> On 2024/3/22 1:19 PM, Jason Wang wrote:
> >>>>>>> On Thu, Mar 21, 2024 at 7:46 PM Heng Qi <hengqi@linux.alibaba.com> wrote:
> >>>>>>>> Currently, ctrlq processes commands in a synchronous manner,
> >>>>>>>> which increases the delay of dim commands when configuring
> >>>>>>>> multi-queue VMs, which in turn causes the CPU utilization to
> >>>>>>>> increase and interferes with the performance of dim.
> >>>>>>>>
> >>>>>>>> Therefore we asynchronously process ctlq's dim commands.
> >>>>>>>>
> >>>>>>>> Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
> >>>>>>> I may miss some previous discussions.
> >>>>>>>
> >>>>>>> But at least the changelog needs to explain why you don't use interrupt.
> >>>>>> Will add, but reply here first.
> >>>>>>
> >>>>>> When upgrading the driver's ctrlq to use interrupt, problems may occur
> >>>>>> with some existing devices.
> >>>>>> For example, when existing devices are replaced with new drivers, they
> >>>>>> may not work.
> >>>>>> Or, if the guest OS supported by the new device is replaced by an old
> >>>>>> downstream OS product, it will not be usable.
> >>>>>>
> >>>>>> Although, ctrlq has the same capabilities as IOq in the virtio spec,
> >>>>>> this does have historical baggage.
> >>>>> I don't think the upstream Linux drivers need to workaround buggy
> >>>>> devices. Or it is a good excuse to block configure interrupts.
> >>>> Of course I agree. Our DPU devices support ctrlq irq natively, as long
> >>>> as the guest os opens irq to ctrlq.
> >>>>
> >>>> If other products have no problem with this, I would prefer to use irq
> >>>> to solve this problem, which is the most essential solution.
> >>> Let's do that.
> >> Ok, will do.
> >>
> >> Do you have the link to the patch where you previously modified the
> >> control queue for interrupt notifications.
> >> I think a new patch could be made on top of it, but I can't seem to find it.
> > Something like this?
>
> YES. Thanks Jason.
>
> >
> > https://lore.kernel.org/lkml/6026e801-6fda-fee9-a69b-d06a80368621@redhat.com/t/
> >
> > Note that
> >
> > 1) some patch has been merged
> > 2) we probably need to drop the timeout logic as it's another topic
> > 3) need to address other comments
>
> I did a quick read of your patch sets from the previous 5 version:
> [1]
> https://lore.kernel.org/lkml/6026e801-6fda-fee9-a69b-d06a80368621@redhat.com/t/
> [2] https://lore.kernel.org/all/20221226074908.8154-1-jasowang@redhat.com/
> [3] https://lore.kernel.org/all/20230413064027.13267-1-jasowang@redhat.com/
> [4] https://lore.kernel.org/all/20230524081842.3060-1-jasowang@redhat.com/
> [5] https://lore.kernel.org/all/20230720083839.481487-1-jasowang@redhat.com/
>
> Regarding adding the interrupt to ctrlq, there are a few points where
> there is no agreement,
> which I summarize below.
>
> 1. Require additional interrupt vector resource
> https://lore.kernel.org/all/20230516165043-mutt-send-email-mst@kernel.org/

I don't think one more vector is a big problem. Multiqueue will
require much more than this.

Even if it is, we can try to share an interrupt as Michael suggests.

Let's start with something simple: just one more vector.

> 2. Adding the interrupt for ctrlq may break some devices
> https://lore.kernel.org/all/f9e75ce5-e6df-d1be-201b-7d0f18c1b6e7@redhat.com/

These devices need to be fixed. It's hard to imagine the evolution of
virtio-net being blocked by buggy devices.

> 3. RTNL breaks surprise removal
> https://lore.kernel.org/all/20230720170001-mutt-send-email-mst@kernel.org/

The comment is about waiting indefinitely on the ctrl vq, which turns
out to be another issue.

For removal, we just need to do the wakeup and then everything is fine.

>
> Regarding the above, there seems to be no conclusion yet.
> If these problems still exist, I think this patch is good enough and we
> can merge it first.

I don't think so; polling turns out to be problematic in a lot of cases.

>
> For the third point, it seems to be being solved by Daniel now [6], but
> spink lock is used,
> which I think conflicts with the way of adding interrupts to ctrlq.
>
> [6] https://lore.kernel.org/all/20240325214912.323749-1-danielj@nvidia.com/

I don't see how it conflicts with this.

Thanks

>
>
> Thanks,
> Heng
>
> >
> > Thanks
> >
> >
> >> Thanks,
> >> Heng
> >>
> >>> Thanks
> >>>
> >>>>> And I remember you told us your device doesn't have such an issue.
> >>>> YES.
> >>>>
> >>>> Thanks,
> >>>> Heng
> >>>>
> >>>>> Thanks
> >>>>>
> >>>>>> Thanks,
> >>>>>> Heng
> >>>>>>
> >>>>>>> Thanks
>


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/2] virtio-net: reduce the CPU consumption of dim worker
  2024-03-26  4:08                   ` Jason Wang
@ 2024-03-26  5:57                     ` Heng Qi
  2024-03-26  6:05                       ` Jason Wang
  0 siblings, 1 reply; 24+ messages in thread
From: Heng Qi @ 2024-03-26  5:57 UTC (permalink / raw)
  To: Jason Wang
  Cc: netdev, virtualization, Michael S. Tsirkin, Jakub Kicinski,
	Paolo Abeni, Eric Dumazet, David S. Miller, Xuan Zhuo



On 2024/3/26 12:08 PM, Jason Wang wrote:
> On Tue, Mar 26, 2024 at 10:46 AM Heng Qi <hengqi@linux.alibaba.com> wrote:
>>
>>
>> On 2024/3/25 4:42 PM, Jason Wang wrote:
>>> On Mon, Mar 25, 2024 at 4:22 PM Heng Qi <hengqi@linux.alibaba.com> wrote:
>>>>
>>>> On 2024/3/25 3:56 PM, Jason Wang wrote:
>>>>> On Mon, Mar 25, 2024 at 3:18 PM Heng Qi <hengqi@linux.alibaba.com> wrote:
>>>>>> On 2024/3/25 1:57 PM, Jason Wang wrote:
>>>>>>> On Mon, Mar 25, 2024 at 10:21 AM Heng Qi <hengqi@linux.alibaba.com> wrote:
>>>>>>>> On 2024/3/22 1:19 PM, Jason Wang wrote:
>>>>>>>>> On Thu, Mar 21, 2024 at 7:46 PM Heng Qi <hengqi@linux.alibaba.com> wrote:
>>>>>>>>>> Currently, ctrlq processes commands in a synchronous manner,
>>>>>>>>>> which increases the delay of dim commands when configuring
>>>>>>>>>> multi-queue VMs; this in turn raises CPU utilization and
>>>>>>>>>> interferes with the performance of dim.
>>>>>>>>>>
>>>>>>>>>> Therefore we asynchronously process ctrlq's dim commands.
>>>>>>>>>>
>>>>>>>>>> Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
>>>>>>>>> I may miss some previous discussions.
>>>>>>>>>
>>>>>>>>> But at least the changelog needs to explain why you don't use interrupt.
>>>>>>>> Will add, but reply here first.
>>>>>>>>
>>>>>>>> When upgrading the driver's ctrlq to use interrupts, problems may occur
>>>>>>>> with some existing devices.
>>>>>>>> For example, existing devices may stop working when paired with new
>>>>>>>> drivers.
>>>>>>>> Or, if a new device is used with an old downstream guest OS, it will
>>>>>>>> not be usable.
>>>>>>>>
>>>>>>>> Although ctrlq has the same capabilities as an IO queue in the virtio
>>>>>>>> spec, it does carry historical baggage.
>>>>>>> I don't think the upstream Linux drivers need to work around buggy
>>>>>>> devices. Nor is it a good excuse to block configuring interrupts.
>>>>>> Of course I agree. Our DPU devices support ctrlq irq natively, as long
>>>>>> as the guest OS enables the irq for ctrlq.
>>>>>>
>>>>>> If other products have no problem with this, I would prefer to use irq
>>>>>> to solve this problem, which is the most fundamental solution.
>>>>> Let's do that.
>>>> Ok, will do.
>>>>
>>>> Do you have the link to the patch where you previously modified the
>>>> control queue for interrupt notifications?
>>>> I think a new patch could be made on top of it, but I can't seem to find it.
>>> Something like this?
>> YES. Thanks Jason.
>>
>>> https://lore.kernel.org/lkml/6026e801-6fda-fee9-a69b-d06a80368621@redhat.com/t/
>>>
>>> Note that
>>>
>>> 1) some patch has been merged
>>> 2) we probably need to drop the timeout logic as it's another topic
>>> 3) need to address other comments
>> I did a quick read of your patch sets from the previous 5 version:
>> [1]
>> https://lore.kernel.org/lkml/6026e801-6fda-fee9-a69b-d06a80368621@redhat.com/t/
>> [2] https://lore.kernel.org/all/20221226074908.8154-1-jasowang@redhat.com/
>> [3] https://lore.kernel.org/all/20230413064027.13267-1-jasowang@redhat.com/
>> [4] https://lore.kernel.org/all/20230524081842.3060-1-jasowang@redhat.com/
>> [5] https://lore.kernel.org/all/20230720083839.481487-1-jasowang@redhat.com/
>>
>> Regarding adding the interrupt to ctrlq, there are a few points where
>> there is no agreement,
>> which I summarize below.
>>
>> 1. Require additional interrupt vector resource
>> https://lore.kernel.org/all/20230516165043-mutt-send-email-mst@kernel.org/
> I don't think one more vector is a big problem. Multiqueue will
> require much more than this.
>
> Even if it is, we can try to share an interrupt as Michael suggests.
>
> Let's start from something that is simple, just one more vector.

OK, that puts my concerns to rest.

>
>> 2. Adding the interrupt for ctrlq may break some devices
>> https://lore.kernel.org/all/f9e75ce5-e6df-d1be-201b-7d0f18c1b6e7@redhat.com/
> These devices need to be fixed. It's hard to imagine the evolution of
> virtio-net being blocked by buggy devices.

Agree.

>
>> 3. RTNL breaks surprise removal
>> https://lore.kernel.org/all/20230720170001-mutt-send-email-mst@kernel.org/
> The comment concerns the indefinite wait for the ctrl vq, which turns out
> to be a separate issue.
>
> For the removal, we just need to do the wakeup then everything is fine.

Then I will make a patch set based on irq and without timeout.

>
>> Regarding the above, there seems to be no conclusion yet.
>> If these problems still exist, I think this patch is good enough and we
>> can merge it first.
> I don't think so; polling turns out to be problematic in a lot of cases.
>
>> For the third point, it seems to be addressed by Daniel now [6], but a
>> spin lock is used,
>> which I think conflicts with the way of adding interrupts to ctrlq.
>>
>> [6] https://lore.kernel.org/all/20240325214912.323749-1-danielj@nvidia.com/
> I don't see how it conflicts with this.

I'll just make changes on top of it. Can I?

Thanks,
Heng

>
> Thanks
>
>>
>> Thanks,
>> Heng
>>
>>> Thanks
>>>
>>>
>>>> Thanks,
>>>> Heng
>>>>
>>>>> Thanks
>>>>>
>>>>>>> And I remember you told us your device doesn't have such an issue.
>>>>>> YES.
>>>>>>
>>>>>> Thanks,
>>>>>> Heng
>>>>>>
>>>>>>> Thanks
>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Heng
>>>>>>>>
>>>>>>>>> Thanks



* Re: [PATCH 2/2] virtio-net: reduce the CPU consumption of dim worker
  2024-03-26  5:57                     ` Heng Qi
@ 2024-03-26  6:05                       ` Jason Wang
  0 siblings, 0 replies; 24+ messages in thread
From: Jason Wang @ 2024-03-26  6:05 UTC (permalink / raw)
  To: Heng Qi
  Cc: netdev, virtualization, Michael S. Tsirkin, Jakub Kicinski,
	Paolo Abeni, Eric Dumazet, David S. Miller, Xuan Zhuo

On Tue, Mar 26, 2024 at 1:57 PM Heng Qi <hengqi@linux.alibaba.com> wrote:
>
>
>
> On 2024/3/26 12:08 PM, Jason Wang wrote:
> > On Tue, Mar 26, 2024 at 10:46 AM Heng Qi <hengqi@linux.alibaba.com> wrote:
> >>
> >>
> >> On 2024/3/25 4:42 PM, Jason Wang wrote:
> >>> On Mon, Mar 25, 2024 at 4:22 PM Heng Qi <hengqi@linux.alibaba.com> wrote:
> >>>>
> >>>> On 2024/3/25 3:56 PM, Jason Wang wrote:
> >>>>> On Mon, Mar 25, 2024 at 3:18 PM Heng Qi <hengqi@linux.alibaba.com> wrote:
> >>>>>> On 2024/3/25 1:57 PM, Jason Wang wrote:
> >>>>>>> On Mon, Mar 25, 2024 at 10:21 AM Heng Qi <hengqi@linux.alibaba.com> wrote:
> >>>>>>>> On 2024/3/22 1:19 PM, Jason Wang wrote:
> >>>>>>>>> On Thu, Mar 21, 2024 at 7:46 PM Heng Qi <hengqi@linux.alibaba.com> wrote:
> >>>>>>>>>> Currently, ctrlq processes commands in a synchronous manner,
> >>>>>>>>>> which increases the delay of dim commands when configuring
> >>>>>>>>>> multi-queue VMs; this in turn raises CPU utilization and
> >>>>>>>>>> interferes with the performance of dim.
> >>>>>>>>>>
> >>>>>>>>>> Therefore we asynchronously process ctrlq's dim commands.
> >>>>>>>>>>
> >>>>>>>>>> Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
> >>>>>>>>> I may miss some previous discussions.
> >>>>>>>>>
> >>>>>>>>> But at least the changelog needs to explain why you don't use interrupt.
> >>>>>>>> Will add, but reply here first.
> >>>>>>>>
> >>>>>>>> When upgrading the driver's ctrlq to use interrupts, problems may occur
> >>>>>>>> with some existing devices.
> >>>>>>>> For example, existing devices may stop working when paired with new
> >>>>>>>> drivers.
> >>>>>>>> Or, if a new device is used with an old downstream guest OS, it will
> >>>>>>>> not be usable.
> >>>>>>>>
> >>>>>>>> Although ctrlq has the same capabilities as an IO queue in the virtio
> >>>>>>>> spec, it does carry historical baggage.
> >>>>>>> I don't think the upstream Linux drivers need to work around buggy
> >>>>>>> devices. Nor is it a good excuse to block configuring interrupts.
> >>>>>> Of course I agree. Our DPU devices support ctrlq irq natively, as long
> >>>>>> as the guest OS enables the irq for ctrlq.
> >>>>>>
> >>>>>> If other products have no problem with this, I would prefer to use irq
> >>>>>> to solve this problem, which is the most fundamental solution.
> >>>>> Let's do that.
> >>>> Ok, will do.
> >>>>
> >>>> Do you have the link to the patch where you previously modified the
> >>>> control queue for interrupt notifications?
> >>>> I think a new patch could be made on top of it, but I can't seem to find it.
> >>> Something like this?
> >> YES. Thanks Jason.
> >>
> >>> https://lore.kernel.org/lkml/6026e801-6fda-fee9-a69b-d06a80368621@redhat.com/t/
> >>>
> >>> Note that
> >>>
> >>> 1) some patch has been merged
> >>> 2) we probably need to drop the timeout logic as it's another topic
> >>> 3) need to address other comments
> >> I did a quick read of your patch sets from the previous 5 version:
> >> [1]
> >> https://lore.kernel.org/lkml/6026e801-6fda-fee9-a69b-d06a80368621@redhat.com/t/
> >> [2] https://lore.kernel.org/all/20221226074908.8154-1-jasowang@redhat.com/
> >> [3] https://lore.kernel.org/all/20230413064027.13267-1-jasowang@redhat.com/
> >> [4] https://lore.kernel.org/all/20230524081842.3060-1-jasowang@redhat.com/
> >> [5] https://lore.kernel.org/all/20230720083839.481487-1-jasowang@redhat.com/
> >>
> >> Regarding adding the interrupt to ctrlq, there are a few points where
> >> there is no agreement,
> >> which I summarize below.
> >>
> >> 1. Require additional interrupt vector resource
> >> https://lore.kernel.org/all/20230516165043-mutt-send-email-mst@kernel.org/
> > I don't think one more vector is a big problem. Multiqueue will
> > require much more than this.
> >
> > Even if it is, we can try to share an interrupt as Michael suggests.
> >
> > Let's start from something that is simple, just one more vector.
>
> OK, that puts my concerns to rest.
>
> >
> >> 2. Adding the interrupt for ctrlq may break some devices
> >> https://lore.kernel.org/all/f9e75ce5-e6df-d1be-201b-7d0f18c1b6e7@redhat.com/
> > These devices need to be fixed. It's hard to imagine the evolution of
> > virtio-net being blocked by buggy devices.
>
> Agree.
>
> >
> >> 3. RTNL breaks surprise removal
> >> https://lore.kernel.org/all/20230720170001-mutt-send-email-mst@kernel.org/
> > The comment concerns the indefinite wait for the ctrl vq, which turns out
> > to be a separate issue.
> >
> > For the removal, we just need to do the wakeup then everything is fine.
>
> Then I will make a patch set based on irq and without timeout.

Ok.

>
> >
> >> Regarding the above, there seems to be no conclusion yet.
> >> If these problems still exist, I think this patch is good enough and we
> >> can merge it first.
> > I don't think so; polling turns out to be problematic in a lot of cases.
> >
> >> For the third point, it seems to be addressed by Daniel now [6], but a
> >> spin lock is used,
> >> which I think conflicts with the way of adding interrupts to ctrlq.
> >>
> >> [6] https://lore.kernel.org/all/20240325214912.323749-1-danielj@nvidia.com/
> > I don't see how it conflicts with this.
>
> I'll just make changes on top of it. Can I?

It should be fine.

Thanks

>
> Thanks,
> Heng
>
> >
> > Thanks
> >
> >>
> >> Thanks,
> >> Heng
> >>
> >>> Thanks
> >>>
> >>>
> >>>> Thanks,
> >>>> Heng
> >>>>
> >>>>> Thanks
> >>>>>
> >>>>>>> And I remember you told us your device doesn't have such an issue.
> >>>>>> YES.
> >>>>>>
> >>>>>> Thanks,
> >>>>>> Heng
> >>>>>>
> >>>>>>> Thanks
> >>>>>>>
> >>>>>>>> Thanks,
> >>>>>>>> Heng
> >>>>>>>>
> >>>>>>>>> Thanks
>



* Re: [PATCH 2/2] virtio-net: reduce the CPU consumption of dim worker
@ 2024-03-22  4:18 kernel test robot
  0 siblings, 0 replies; 24+ messages in thread
From: kernel test robot @ 2024-03-22  4:18 UTC (permalink / raw)
  To: oe-kbuild; +Cc: lkp, Dan Carpenter

BCC: lkp@intel.com
CC: oe-kbuild-all@lists.linux.dev
In-Reply-To: <1711021557-58116-3-git-send-email-hengqi@linux.alibaba.com>
References: <1711021557-58116-3-git-send-email-hengqi@linux.alibaba.com>
TO: Heng Qi <hengqi@linux.alibaba.com>
TO: netdev@vger.kernel.org
TO: virtualization@lists.linux.dev
TO: Jason Wang <jasowang@redhat.com>
TO: "Michael S. Tsirkin" <mst@redhat.com>
TO: Jakub Kicinski <kuba@kernel.org>
TO: Paolo Abeni <pabeni@redhat.com>
TO: Eric Dumazet <edumazet@google.com>
TO: "David S. Miller" <davem@davemloft.net>
TO: Xuan Zhuo <xuanzhuo@linux.alibaba.com>

Hi Heng,

kernel test robot noticed the following build warnings:

[auto build test WARNING on linus/master]
[also build test WARNING on next-20240321]
[cannot apply to v6.8]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Heng-Qi/virtio-net-fix-possible-dim-status-unrecoverable/20240321-194759
base:   linus/master
patch link:    https://lore.kernel.org/r/1711021557-58116-3-git-send-email-hengqi%40linux.alibaba.com
patch subject: [PATCH 2/2] virtio-net: reduce the CPU consumption of dim worker
:::::: branch date: 16 hours ago
:::::: commit date: 16 hours ago
config: i386-randconfig-141-20240322 (https://download.01.org/0day-ci/archive/20240322/202403221133.J66oueZh-lkp@intel.com/config)
compiler: gcc-12 (Debian 12.2.0-14) 12.2.0

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Reported-by: Dan Carpenter <error27@gmail.com>
| Closes: https://lore.kernel.org/r/202403221133.J66oueZh-lkp@intel.com/

New smatch warnings:
drivers/net/virtio_net.c:5031 virtnet_probe() warn: missing error code 'err'

Old smatch warnings:
drivers/net/virtio_net.c:1438 build_skb_from_xdp_buff() error: uninitialized symbol 'nr_frags'.
drivers/net/virtio_net.c:1520 virtnet_build_xdp_buff_mrg() error: uninitialized symbol 'shinfo'.
drivers/net/virtio_net.c:3943 virtnet_update_settings() error: uninitialized symbol 'virtio_cread_v'.
drivers/net/virtio_net.c:4371 virtnet_config_changed_work() error: uninitialized symbol 'virtio_cread_v'.
drivers/net/virtio_net.c:4375 virtnet_config_changed_work() error: uninitialized symbol 'v'.

vim +/err +5031 drivers/net/virtio_net.c

d8cd72f1622753 Heng Qi               2024-03-21  4831  
fe36cbe0671e86 Michael S. Tsirkin    2017-03-29  4832  static int virtnet_probe(struct virtio_device *vdev)
fe36cbe0671e86 Michael S. Tsirkin    2017-03-29  4833  {
d7dfc5cf56b0e3 Toshiaki Makita       2018-01-17  4834  	int i, err = -ENOMEM;
fe36cbe0671e86 Michael S. Tsirkin    2017-03-29  4835  	struct net_device *dev;
fe36cbe0671e86 Michael S. Tsirkin    2017-03-29  4836  	struct virtnet_info *vi;
fe36cbe0671e86 Michael S. Tsirkin    2017-03-29  4837  	u16 max_queue_pairs;
4959aebba8c069 Gavin Li              2022-09-14  4838  	int mtu = 0;
fe36cbe0671e86 Michael S. Tsirkin    2017-03-29  4839  
c7114b1249fa3b Andrew Melnychenko    2022-03-28  4840  	/* Find if host supports multiqueue/rss virtio_net device */
c7114b1249fa3b Andrew Melnychenko    2022-03-28  4841  	max_queue_pairs = 1;
c7114b1249fa3b Andrew Melnychenko    2022-03-28  4842  	if (virtio_has_feature(vdev, VIRTIO_NET_F_MQ) || virtio_has_feature(vdev, VIRTIO_NET_F_RSS))
c7114b1249fa3b Andrew Melnychenko    2022-03-28  4843  		max_queue_pairs =
c7114b1249fa3b Andrew Melnychenko    2022-03-28  4844  		     virtio_cread16(vdev, offsetof(struct virtio_net_config, max_virtqueue_pairs));
986a4f4d452dec Jason Wang            2012-12-07  4845  
986a4f4d452dec Jason Wang            2012-12-07  4846  	/* We need at least 2 queue's */
c7114b1249fa3b Andrew Melnychenko    2022-03-28  4847  	if (max_queue_pairs < VIRTIO_NET_CTRL_MQ_VQ_PAIRS_MIN ||
986a4f4d452dec Jason Wang            2012-12-07  4848  	    max_queue_pairs > VIRTIO_NET_CTRL_MQ_VQ_PAIRS_MAX ||
986a4f4d452dec Jason Wang            2012-12-07  4849  	    !virtio_has_feature(vdev, VIRTIO_NET_F_CTRL_VQ))
986a4f4d452dec Jason Wang            2012-12-07  4850  		max_queue_pairs = 1;
296f96fcfc160e Rusty Russell         2007-10-22  4851  
296f96fcfc160e Rusty Russell         2007-10-22  4852  	/* Allocate ourselves a network device with room for our info */
986a4f4d452dec Jason Wang            2012-12-07  4853  	dev = alloc_etherdev_mq(sizeof(struct virtnet_info), max_queue_pairs);
296f96fcfc160e Rusty Russell         2007-10-22  4854  	if (!dev)
296f96fcfc160e Rusty Russell         2007-10-22  4855  		return -ENOMEM;
296f96fcfc160e Rusty Russell         2007-10-22  4856  
296f96fcfc160e Rusty Russell         2007-10-22  4857  	/* Set up network device as normal. */
ab5bd583b92896 Xuan Zhuo             2021-02-18  4858  	dev->priv_flags |= IFF_UNICAST_FLT | IFF_LIVE_ADDR_CHANGE |
ab5bd583b92896 Xuan Zhuo             2021-02-18  4859  			   IFF_TX_SKB_NO_LINEAR;
76288b4e573e06 Stephen Hemminger     2009-01-06  4860  	dev->netdev_ops = &virtnet_netdev;
296f96fcfc160e Rusty Russell         2007-10-22  4861  	dev->features = NETIF_F_HIGHDMA;
3fa2a1df909482 stephen hemminger     2011-06-15  4862  
7ad24ea4bf620a Wilfried Klaebe       2014-05-11  4863  	dev->ethtool_ops = &virtnet_ethtool_ops;
296f96fcfc160e Rusty Russell         2007-10-22  4864  	SET_NETDEV_DEV(dev, &vdev->dev);
296f96fcfc160e Rusty Russell         2007-10-22  4865  
296f96fcfc160e Rusty Russell         2007-10-22  4866  	/* Do we support "hardware" checksums? */
98e778c9aa4f4f Michał Mirosław       2011-03-31  4867  	if (virtio_has_feature(vdev, VIRTIO_NET_F_CSUM)) {
296f96fcfc160e Rusty Russell         2007-10-22  4868  		/* This opens up the world of extra features. */
48900cb6af4282 Jason Wang            2015-08-05  4869  		dev->hw_features |= NETIF_F_HW_CSUM | NETIF_F_SG;
98e778c9aa4f4f Michał Mirosław       2011-03-31  4870  		if (csum)
48900cb6af4282 Jason Wang            2015-08-05  4871  			dev->features |= NETIF_F_HW_CSUM | NETIF_F_SG;
98e778c9aa4f4f Michał Mirosław       2011-03-31  4872  
98e778c9aa4f4f Michał Mirosław       2011-03-31  4873  		if (virtio_has_feature(vdev, VIRTIO_NET_F_GSO)) {
e078de03788353 David S. Miller       2017-07-03  4874  			dev->hw_features |= NETIF_F_TSO
34a48579e4fb38 Rusty Russell         2008-02-04  4875  				| NETIF_F_TSO_ECN | NETIF_F_TSO6;
34a48579e4fb38 Rusty Russell         2008-02-04  4876  		}
5539ae9613587e Rusty Russell         2008-05-02  4877  		/* Individual feature bits: what can host handle? */
98e778c9aa4f4f Michał Mirosław       2011-03-31  4878  		if (virtio_has_feature(vdev, VIRTIO_NET_F_HOST_TSO4))
98e778c9aa4f4f Michał Mirosław       2011-03-31  4879  			dev->hw_features |= NETIF_F_TSO;
98e778c9aa4f4f Michał Mirosław       2011-03-31  4880  		if (virtio_has_feature(vdev, VIRTIO_NET_F_HOST_TSO6))
98e778c9aa4f4f Michał Mirosław       2011-03-31  4881  			dev->hw_features |= NETIF_F_TSO6;
98e778c9aa4f4f Michał Mirosław       2011-03-31  4882  		if (virtio_has_feature(vdev, VIRTIO_NET_F_HOST_ECN))
98e778c9aa4f4f Michał Mirosław       2011-03-31  4883  			dev->hw_features |= NETIF_F_TSO_ECN;
418044e1de3063 Andrew Melnychenko    2022-12-07  4884  		if (virtio_has_feature(vdev, VIRTIO_NET_F_HOST_USO))
418044e1de3063 Andrew Melnychenko    2022-12-07  4885  			dev->hw_features |= NETIF_F_GSO_UDP_L4;
98e778c9aa4f4f Michał Mirosław       2011-03-31  4886  
41f2f1273caee2 Jason Wang            2014-12-24  4887  		dev->features |= NETIF_F_GSO_ROBUST;
41f2f1273caee2 Jason Wang            2014-12-24  4888  
98e778c9aa4f4f Michał Mirosław       2011-03-31  4889  		if (gso)
e078de03788353 David S. Miller       2017-07-03  4890  			dev->features |= dev->hw_features & NETIF_F_ALL_TSO;
98e778c9aa4f4f Michał Mirosław       2011-03-31  4891  		/* (!csum && gso) case will be fixed by register_netdev() */
296f96fcfc160e Rusty Russell         2007-10-22  4892  	}
4f49129be6fa9b Thomas Huth           2013-08-27  4893  	if (virtio_has_feature(vdev, VIRTIO_NET_F_GUEST_CSUM))
4f49129be6fa9b Thomas Huth           2013-08-27  4894  		dev->features |= NETIF_F_RXCSUM;
a02e8964eaf927 Willem de Bruijn      2018-12-20  4895  	if (virtio_has_feature(vdev, VIRTIO_NET_F_GUEST_TSO4) ||
a02e8964eaf927 Willem de Bruijn      2018-12-20  4896  	    virtio_has_feature(vdev, VIRTIO_NET_F_GUEST_TSO6))
dbcf24d1538844 Jason Wang            2021-08-17  4897  		dev->features |= NETIF_F_GRO_HW;
cf8691cbc28659 Michael S. Tsirkin    2020-10-21  4898  	if (virtio_has_feature(vdev, VIRTIO_NET_F_CTRL_GUEST_OFFLOADS))
dbcf24d1538844 Jason Wang            2021-08-17  4899  		dev->hw_features |= NETIF_F_GRO_HW;
296f96fcfc160e Rusty Russell         2007-10-22  4900  
4fda830263c52a Jason Wang            2013-04-10  4901  	dev->vlan_features = dev->features;
66c0e13ad236c7 Marek Majtyka         2023-02-01  4902  	dev->xdp_features = NETDEV_XDP_ACT_BASIC | NETDEV_XDP_ACT_REDIRECT;
4fda830263c52a Jason Wang            2013-04-10  4903  
d0c2c9973ecd26 Jarod Wilson          2016-10-20  4904  	/* MTU range: 68 - 65535 */
d0c2c9973ecd26 Jarod Wilson          2016-10-20  4905  	dev->min_mtu = MIN_MTU;
d0c2c9973ecd26 Jarod Wilson          2016-10-20  4906  	dev->max_mtu = MAX_MTU;
d0c2c9973ecd26 Jarod Wilson          2016-10-20  4907  
296f96fcfc160e Rusty Russell         2007-10-22  4908  	/* Configuration may specify what MAC to use.  Otherwise random. */
f2edaa4ad5d513 Jakub Kicinski        2021-10-27  4909  	if (virtio_has_feature(vdev, VIRTIO_NET_F_MAC)) {
f2edaa4ad5d513 Jakub Kicinski        2021-10-27  4910  		u8 addr[ETH_ALEN];
f2edaa4ad5d513 Jakub Kicinski        2021-10-27  4911  
855e0c5288177b Rusty Russell         2013-10-14  4912  		virtio_cread_bytes(vdev,
a586d4f6016f71 Rusty Russell         2008-02-04  4913  				   offsetof(struct virtio_net_config, mac),
f2edaa4ad5d513 Jakub Kicinski        2021-10-27  4914  				   addr, ETH_ALEN);
f2edaa4ad5d513 Jakub Kicinski        2021-10-27  4915  		eth_hw_addr_set(dev, addr);
f2edaa4ad5d513 Jakub Kicinski        2021-10-27  4916  	} else {
f2cedb63df1434 Danny Kukawka         2012-02-15  4917  		eth_hw_addr_random(dev);
9f62d221a4b0aa Laurent Vivier        2023-01-27  4918  		dev_info(&vdev->dev, "Assigned random MAC address %pM\n",
9f62d221a4b0aa Laurent Vivier        2023-01-27  4919  			 dev->dev_addr);
f2edaa4ad5d513 Jakub Kicinski        2021-10-27  4920  	}
296f96fcfc160e Rusty Russell         2007-10-22  4921  
296f96fcfc160e Rusty Russell         2007-10-22  4922  	/* Set up our device-specific information */
296f96fcfc160e Rusty Russell         2007-10-22  4923  	vi = netdev_priv(dev);
296f96fcfc160e Rusty Russell         2007-10-22  4924  	vi->dev = dev;
296f96fcfc160e Rusty Russell         2007-10-22  4925  	vi->vdev = vdev;
d9d5dcc88ca5c7 Christian Borntraeger 2008-02-18  4926  	vdev->priv = vi;
827da44c61419f John Stultz           2013-10-07  4927  
586d17c5a01bf1 Jason Wang            2012-04-11  4928  	INIT_WORK(&vi->config_work, virtnet_config_changed_work);
b9f7425239a099 Jason Wang            2023-07-20  4929  	INIT_WORK(&vi->rx_mode_work, virtnet_rx_mode_work);
5a159128faff15 Jason Wang            2022-07-25  4930  	spin_lock_init(&vi->refill_lock);
296f96fcfc160e Rusty Russell         2007-10-22  4931  
30bbf891f1b8fd Lorenzo Bianconi      2023-02-07  4932  	if (virtio_has_feature(vdev, VIRTIO_NET_F_MRG_RXBUF)) {
3f2c31d90327f2 Mark McLoughlin       2008-11-16  4933  		vi->mergeable_rx_bufs = true;
30bbf891f1b8fd Lorenzo Bianconi      2023-02-07  4934  		dev->xdp_features |= NETDEV_XDP_ACT_RX_SG;
30bbf891f1b8fd Lorenzo Bianconi      2023-02-07  4935  	}
3f2c31d90327f2 Mark McLoughlin       2008-11-16  4936  
91f41f01d2195d Andrew Melnychenko    2022-03-28  4937  	if (virtio_has_feature(vdev, VIRTIO_NET_F_HASH_REPORT))
91f41f01d2195d Andrew Melnychenko    2022-03-28  4938  		vi->has_rss_hash_report = true;
91f41f01d2195d Andrew Melnychenko    2022-03-28  4939  
91f41f01d2195d Andrew Melnychenko    2022-03-28  4940  	if (virtio_has_feature(vdev, VIRTIO_NET_F_RSS))
c7114b1249fa3b Andrew Melnychenko    2022-03-28  4941  		vi->has_rss = true;
91f41f01d2195d Andrew Melnychenko    2022-03-28  4942  
91f41f01d2195d Andrew Melnychenko    2022-03-28  4943  	if (vi->has_rss || vi->has_rss_hash_report) {
c7114b1249fa3b Andrew Melnychenko    2022-03-28  4944  		vi->rss_indir_table_size =
c7114b1249fa3b Andrew Melnychenko    2022-03-28  4945  			virtio_cread16(vdev, offsetof(struct virtio_net_config,
c7114b1249fa3b Andrew Melnychenko    2022-03-28  4946  				rss_max_indirection_table_length));
c7114b1249fa3b Andrew Melnychenko    2022-03-28  4947  		vi->rss_key_size =
c7114b1249fa3b Andrew Melnychenko    2022-03-28  4948  			virtio_cread8(vdev, offsetof(struct virtio_net_config, rss_max_key_size));
c7114b1249fa3b Andrew Melnychenko    2022-03-28  4949  
c7114b1249fa3b Andrew Melnychenko    2022-03-28  4950  		vi->rss_hash_types_supported =
c7114b1249fa3b Andrew Melnychenko    2022-03-28  4951  		    virtio_cread32(vdev, offsetof(struct virtio_net_config, supported_hash_types));
c7114b1249fa3b Andrew Melnychenko    2022-03-28  4952  		vi->rss_hash_types_supported &=
c7114b1249fa3b Andrew Melnychenko    2022-03-28  4953  				~(VIRTIO_NET_RSS_HASH_TYPE_IP_EX |
c7114b1249fa3b Andrew Melnychenko    2022-03-28  4954  				  VIRTIO_NET_RSS_HASH_TYPE_TCP_EX |
c7114b1249fa3b Andrew Melnychenko    2022-03-28  4955  				  VIRTIO_NET_RSS_HASH_TYPE_UDP_EX);
c7114b1249fa3b Andrew Melnychenko    2022-03-28  4956  
c7114b1249fa3b Andrew Melnychenko    2022-03-28  4957  		dev->hw_features |= NETIF_F_RXHASH;
c7114b1249fa3b Andrew Melnychenko    2022-03-28  4958  	}
91f41f01d2195d Andrew Melnychenko    2022-03-28  4959  
91f41f01d2195d Andrew Melnychenko    2022-03-28  4960  	if (vi->has_rss_hash_report)
91f41f01d2195d Andrew Melnychenko    2022-03-28  4961  		vi->hdr_len = sizeof(struct virtio_net_hdr_v1_hash);
91f41f01d2195d Andrew Melnychenko    2022-03-28  4962  	else if (virtio_has_feature(vdev, VIRTIO_NET_F_MRG_RXBUF) ||
d04302b334bde9 Michael S. Tsirkin    2014-10-24  4963  		 virtio_has_feature(vdev, VIRTIO_F_VERSION_1))
012873d057a449 Michael S. Tsirkin    2014-10-24  4964  		vi->hdr_len = sizeof(struct virtio_net_hdr_mrg_rxbuf);
012873d057a449 Michael S. Tsirkin    2014-10-24  4965  	else
012873d057a449 Michael S. Tsirkin    2014-10-24  4966  		vi->hdr_len = sizeof(struct virtio_net_hdr);
012873d057a449 Michael S. Tsirkin    2014-10-24  4967  
75993300d008f4 Michael S. Tsirkin    2015-07-15  4968  	if (virtio_has_feature(vdev, VIRTIO_F_ANY_LAYOUT) ||
75993300d008f4 Michael S. Tsirkin    2015-07-15  4969  	    virtio_has_feature(vdev, VIRTIO_F_VERSION_1))
e7428e95a06fb5 Michael S. Tsirkin    2013-07-25  4970  		vi->any_header_sg = true;
e7428e95a06fb5 Michael S. Tsirkin    2013-07-25  4971  
986a4f4d452dec Jason Wang            2012-12-07  4972  	if (virtio_has_feature(vdev, VIRTIO_NET_F_CTRL_VQ))
986a4f4d452dec Jason Wang            2012-12-07  4973  		vi->has_cvq = true;
986a4f4d452dec Jason Wang            2012-12-07  4974  
14de9d114a82a5 Aaron Conole          2016-06-03  4975  	if (virtio_has_feature(vdev, VIRTIO_NET_F_MTU)) {
14de9d114a82a5 Aaron Conole          2016-06-03  4976  		mtu = virtio_cread16(vdev,
14de9d114a82a5 Aaron Conole          2016-06-03  4977  				     offsetof(struct virtio_net_config,
14de9d114a82a5 Aaron Conole          2016-06-03  4978  					      mtu));
93a205ee98a488 Aaron Conole          2016-10-25  4979  		if (mtu < dev->min_mtu) {
fe36cbe0671e86 Michael S. Tsirkin    2017-03-29  4980  			/* Should never trigger: MTU was previously validated
fe36cbe0671e86 Michael S. Tsirkin    2017-03-29  4981  			 * in virtnet_validate.
fe36cbe0671e86 Michael S. Tsirkin    2017-03-29  4982  			 */
7934b481ab1a36 Yuval Shaia           2019-04-03  4983  			dev_err(&vdev->dev,
7934b481ab1a36 Yuval Shaia           2019-04-03  4984  				"device MTU appears to have changed it is now %d < %d",
7934b481ab1a36 Yuval Shaia           2019-04-03  4985  				mtu, dev->min_mtu);
411ea23a76526e Dan Carpenter         2020-12-04  4986  			err = -EINVAL;
d7dfc5cf56b0e3 Toshiaki Makita       2018-01-17  4987  			goto free;
fe36cbe0671e86 Michael S. Tsirkin    2017-03-29  4988  		}
fe36cbe0671e86 Michael S. Tsirkin    2017-03-29  4989  
d0c2c9973ecd26 Jarod Wilson          2016-10-20  4990  		dev->mtu = mtu;
93a205ee98a488 Aaron Conole          2016-10-25  4991  		dev->max_mtu = mtu;
14de9d114a82a5 Aaron Conole          2016-06-03  4992  	}
14de9d114a82a5 Aaron Conole          2016-06-03  4993  
4959aebba8c069 Gavin Li              2022-09-14  4994  	virtnet_set_big_packets(vi, mtu);
4959aebba8c069 Gavin Li              2022-09-14  4995  
012873d057a449 Michael S. Tsirkin    2014-10-24  4996  	if (vi->any_header_sg)
012873d057a449 Michael S. Tsirkin    2014-10-24  4997  		dev->needed_headroom = vi->hdr_len;
6ebbc1a6383fe7 Zhangjie (HZ)         2014-04-29  4998  
44900010290125 Jason Wang            2016-11-25  4999  	/* Enable multiqueue by default */
44900010290125 Jason Wang            2016-11-25  5000  	if (num_online_cpus() >= max_queue_pairs)
44900010290125 Jason Wang            2016-11-25  5001  		vi->curr_queue_pairs = max_queue_pairs;
44900010290125 Jason Wang            2016-11-25  5002  	else
44900010290125 Jason Wang            2016-11-25  5003  		vi->curr_queue_pairs = num_online_cpus();
986a4f4d452dec Jason Wang            2012-12-07  5004  	vi->max_queue_pairs = max_queue_pairs;
986a4f4d452dec Jason Wang            2012-12-07  5005  
986a4f4d452dec Jason Wang            2012-12-07  5006  	/* Allocate/initialize the rx/tx queues, and invoke find_vqs */
3f9c10b0d478a3 Amit Shah             2011-12-22  5007  	err = init_vqs(vi);
d2a7ddda9ffb1c Michael S. Tsirkin    2009-06-12  5008  	if (err)
d7dfc5cf56b0e3 Toshiaki Makita       2018-01-17  5009  		goto free;
296f96fcfc160e Rusty Russell         2007-10-22  5010  
3014a0d54820d2 Heng Qi               2023-10-08  5011  	if (virtio_has_feature(vi->vdev, VIRTIO_NET_F_NOTF_COAL)) {
3014a0d54820d2 Heng Qi               2023-10-08  5012  		vi->intr_coal_rx.max_usecs = 0;
3014a0d54820d2 Heng Qi               2023-10-08  5013  		vi->intr_coal_tx.max_usecs = 0;
3014a0d54820d2 Heng Qi               2023-10-08  5014  		vi->intr_coal_rx.max_packets = 0;
3014a0d54820d2 Heng Qi               2023-10-08  5015  
3014a0d54820d2 Heng Qi               2023-10-08  5016  		/* Keep the default values of the coalescing parameters
3014a0d54820d2 Heng Qi               2023-10-08  5017  		 * aligned with the default napi_tx state.
3014a0d54820d2 Heng Qi               2023-10-08  5018  		 */
3014a0d54820d2 Heng Qi               2023-10-08  5019  		if (vi->sq[0].napi.weight)
3014a0d54820d2 Heng Qi               2023-10-08  5020  			vi->intr_coal_tx.max_packets = 1;
3014a0d54820d2 Heng Qi               2023-10-08  5021  		else
3014a0d54820d2 Heng Qi               2023-10-08  5022  			vi->intr_coal_tx.max_packets = 0;
3014a0d54820d2 Heng Qi               2023-10-08  5023  	}
3014a0d54820d2 Heng Qi               2023-10-08  5024  
d8cd72f1622753 Heng Qi               2024-03-21  5025  	INIT_LIST_HEAD(&vi->coal_list);
3014a0d54820d2 Heng Qi               2023-10-08  5026  	if (virtio_has_feature(vi->vdev, VIRTIO_NET_F_VQ_NOTF_COAL)) {
d8cd72f1622753 Heng Qi               2024-03-21  5027  		vi->cvq_cmd_nums = 0;
d8cd72f1622753 Heng Qi               2024-03-21  5028  		vi->dim_loop_index = 0;
d8cd72f1622753 Heng Qi               2024-03-21  5029  
d8cd72f1622753 Heng Qi               2024-03-21  5030  		if (virtnet_init_coal_list(vi))
d8cd72f1622753 Heng Qi               2024-03-21 @5031  			goto free;
d8cd72f1622753 Heng Qi               2024-03-21  5032  
3014a0d54820d2 Heng Qi               2023-10-08  5033  		/* The reason is the same as VIRTIO_NET_F_NOTF_COAL. */
d8cd72f1622753 Heng Qi               2024-03-21  5034  		for (i = 0; i < vi->max_queue_pairs; i++) {
d8cd72f1622753 Heng Qi               2024-03-21  5035  			vi->rq[i].packets_in_napi = 0;
3014a0d54820d2 Heng Qi               2023-10-08  5036  			if (vi->sq[i].napi.weight)
3014a0d54820d2 Heng Qi               2023-10-08  5037  				vi->sq[i].intr_coal.max_packets = 1;
3014a0d54820d2 Heng Qi               2023-10-08  5038  		}
d8cd72f1622753 Heng Qi               2024-03-21  5039  	}
3014a0d54820d2 Heng Qi               2023-10-08  5040  

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

Thread overview: 24+ messages
2024-03-21 11:45 [PATCH 0/2] virtio-net: a fix and some updates for virtio dim Heng Qi
2024-03-21 11:45 ` [PATCH 1/2] virtio-net: fix possible dim status unrecoverable Heng Qi
2024-03-22  5:17   ` Jason Wang
2024-03-25  2:11     ` Heng Qi
2024-03-25  6:29       ` Jason Wang
2024-03-25  6:57         ` Heng Qi
2024-03-25  7:06           ` Jason Wang
2024-03-21 11:45 ` [PATCH 2/2] virtio-net: reduce the CPU consumption of dim worker Heng Qi
2024-03-22  2:03   ` kernel test robot
2024-03-22  5:19   ` Jason Wang
2024-03-25  2:21     ` Heng Qi
2024-03-25  5:57       ` Jason Wang
2024-03-25  7:17         ` Heng Qi
2024-03-25  7:56           ` Jason Wang
2024-03-25  8:22             ` Heng Qi
2024-03-25  8:42               ` Jason Wang
2024-03-26  2:46                 ` Heng Qi
2024-03-26  4:08                   ` Jason Wang
2024-03-26  5:57                     ` Heng Qi
2024-03-26  6:05                       ` Jason Wang
2024-03-22  6:50   ` Dan Carpenter
2024-03-21 12:25 ` [PATCH 0/2] virtio-net: a fix and some updates for virtio dim Jiri Pirko
2024-03-25  2:23   ` Heng Qi
2024-03-22  4:18 [PATCH 2/2] virtio-net: reduce the CPU consumption of dim worker kernel test robot