All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 1/3] mlx5_vdpa: should exclude header length and fcs from mtu
@ 2021-02-06 12:29 ` Si-Wei Liu
  0 siblings, 0 replies; 27+ messages in thread
From: Si-Wei Liu @ 2021-02-06 12:29 UTC (permalink / raw)
  To: mst, jasowang, elic; +Cc: linux-kernel, virtualization, netdev, si-wei.liu

When feature VIRTIO_NET_F_MTU is negotiated on mlx5_vdpa,
22 extra bytes worth of MTU length is shown in guest.
This is because the mlx5_query_port_max_mtu API returns
the "hardware" MTU value, which does not just contain the
Ethernet payload, but includes extra lengths starting
from the Ethernet header up to the FCS altogether.

Fix the MTU so packets won't get dropped silently.

Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
---
 drivers/vdpa/mlx5/core/mlx5_vdpa.h |  4 ++++
 drivers/vdpa/mlx5/net/mlx5_vnet.c  | 15 ++++++++++++++-
 2 files changed, 18 insertions(+), 1 deletion(-)

diff --git a/drivers/vdpa/mlx5/core/mlx5_vdpa.h b/drivers/vdpa/mlx5/core/mlx5_vdpa.h
index 08f742f..b6cc53b 100644
--- a/drivers/vdpa/mlx5/core/mlx5_vdpa.h
+++ b/drivers/vdpa/mlx5/core/mlx5_vdpa.h
@@ -4,9 +4,13 @@
 #ifndef __MLX5_VDPA_H__
 #define __MLX5_VDPA_H__
 
+#include <linux/etherdevice.h>
+#include <linux/if_vlan.h>
 #include <linux/vdpa.h>
 #include <linux/mlx5/driver.h>
 
+#define MLX5V_ETH_HARD_MTU (ETH_HLEN + VLAN_HLEN + ETH_FCS_LEN)
+
 struct mlx5_vdpa_direct_mr {
 	u64 start;
 	u64 end;
diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c
index dc88559..b8416c4 100644
--- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
+++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
@@ -1907,6 +1907,19 @@ static int mlx5_get_vq_irq(struct vdpa_device *vdv, u16 idx)
 	.free = mlx5_vdpa_free,
 };
 
+static int query_mtu(struct mlx5_core_dev *mdev, u16 *mtu)
+{
+	u16 hw_mtu;
+	int err;
+
+	err = mlx5_query_nic_vport_mtu(mdev, &hw_mtu);
+	if (err)
+		return err;
+
+	*mtu = hw_mtu - MLX5V_ETH_HARD_MTU;
+	return 0;
+}
+
 static int alloc_resources(struct mlx5_vdpa_net *ndev)
 {
 	struct mlx5_vdpa_net_resources *res = &ndev->res;
@@ -1992,7 +2005,7 @@ static int mlx5v_probe(struct auxiliary_device *adev,
 	init_mvqs(ndev);
 	mutex_init(&ndev->reslock);
 	config = &ndev->config;
-	err = mlx5_query_nic_vport_mtu(mdev, &ndev->mtu);
+	err = query_mtu(mdev, &ndev->mtu);
 	if (err)
 		goto err_mtu;
 
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 1/3] mlx5_vdpa: should exclude header length and fcs from mtu
@ 2021-02-06 12:29 ` Si-Wei Liu
  0 siblings, 0 replies; 27+ messages in thread
From: Si-Wei Liu @ 2021-02-06 12:29 UTC (permalink / raw)
  To: mst, jasowang, elic; +Cc: si-wei.liu, netdev, linux-kernel, virtualization

When feature VIRTIO_NET_F_MTU is negotiated on mlx5_vdpa,
22 extra bytes worth of MTU length is shown in guest.
This is because the mlx5_query_port_max_mtu API returns
the "hardware" MTU value, which does not just contain the
Ethernet payload, but includes extra lengths starting
from the Ethernet header up to the FCS altogether.

Fix the MTU so packets won't get dropped silently.

Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
---
 drivers/vdpa/mlx5/core/mlx5_vdpa.h |  4 ++++
 drivers/vdpa/mlx5/net/mlx5_vnet.c  | 15 ++++++++++++++-
 2 files changed, 18 insertions(+), 1 deletion(-)

diff --git a/drivers/vdpa/mlx5/core/mlx5_vdpa.h b/drivers/vdpa/mlx5/core/mlx5_vdpa.h
index 08f742f..b6cc53b 100644
--- a/drivers/vdpa/mlx5/core/mlx5_vdpa.h
+++ b/drivers/vdpa/mlx5/core/mlx5_vdpa.h
@@ -4,9 +4,13 @@
 #ifndef __MLX5_VDPA_H__
 #define __MLX5_VDPA_H__
 
+#include <linux/etherdevice.h>
+#include <linux/if_vlan.h>
 #include <linux/vdpa.h>
 #include <linux/mlx5/driver.h>
 
+#define MLX5V_ETH_HARD_MTU (ETH_HLEN + VLAN_HLEN + ETH_FCS_LEN)
+
 struct mlx5_vdpa_direct_mr {
 	u64 start;
 	u64 end;
diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c
index dc88559..b8416c4 100644
--- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
+++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
@@ -1907,6 +1907,19 @@ static int mlx5_get_vq_irq(struct vdpa_device *vdv, u16 idx)
 	.free = mlx5_vdpa_free,
 };
 
+static int query_mtu(struct mlx5_core_dev *mdev, u16 *mtu)
+{
+	u16 hw_mtu;
+	int err;
+
+	err = mlx5_query_nic_vport_mtu(mdev, &hw_mtu);
+	if (err)
+		return err;
+
+	*mtu = hw_mtu - MLX5V_ETH_HARD_MTU;
+	return 0;
+}
+
 static int alloc_resources(struct mlx5_vdpa_net *ndev)
 {
 	struct mlx5_vdpa_net_resources *res = &ndev->res;
@@ -1992,7 +2005,7 @@ static int mlx5v_probe(struct auxiliary_device *adev,
 	init_mvqs(ndev);
 	mutex_init(&ndev->reslock);
 	config = &ndev->config;
-	err = mlx5_query_nic_vport_mtu(mdev, &ndev->mtu);
+	err = query_mtu(mdev, &ndev->mtu);
 	if (err)
 		goto err_mtu;
 
-- 
1.8.3.1

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 2/3] mlx5_vdpa: fix feature negotiation across device reset
  2021-02-06 12:29 ` Si-Wei Liu
@ 2021-02-06 12:29   ` Si-Wei Liu
  -1 siblings, 0 replies; 27+ messages in thread
From: Si-Wei Liu @ 2021-02-06 12:29 UTC (permalink / raw)
  To: mst, jasowang, elic; +Cc: linux-kernel, virtualization, netdev, si-wei.liu

The mlx_features denotes the capability for which
set of virtio features is supported by device. In
principle, this field needs not be cleared during
virtio device reset, as this capability is static
and does not change across reset.

In fact, the current code may have the assumption
that mlx_features can be reloaded from firmware
via the .get_features ops after device is reset
(via the .set_status ops), which is unfortunately
not true. The userspace VMM might save a copy
of backend capable features and won't call into
kernel again to get it on reset. This causes all
virtio features getting disabled on newly created
virtqs after device reset, while guest would hold
mismatched view of available features. For e.g.,
the guest may still assume tx checksum offload
is available after reset and feature negotiation,
causing frames with bogus (incomplete) checksum
transmitted on the wire.

Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
---
 drivers/vdpa/mlx5/net/mlx5_vnet.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c
index b8416c4..aa6f8cd 100644
--- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
+++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
@@ -1788,7 +1788,6 @@ static void mlx5_vdpa_set_status(struct vdpa_device *vdev, u8 status)
 		clear_virtqueues(ndev);
 		mlx5_vdpa_destroy_mr(&ndev->mvdev);
 		ndev->mvdev.status = 0;
-		ndev->mvdev.mlx_features = 0;
 		++mvdev->generation;
 		return;
 	}
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 2/3] mlx5_vdpa: fix feature negotiation across device reset
@ 2021-02-06 12:29   ` Si-Wei Liu
  0 siblings, 0 replies; 27+ messages in thread
From: Si-Wei Liu @ 2021-02-06 12:29 UTC (permalink / raw)
  To: mst, jasowang, elic; +Cc: si-wei.liu, netdev, linux-kernel, virtualization

The mlx_features denotes the capability for which
set of virtio features is supported by device. In
principle, this field needs not be cleared during
virtio device reset, as this capability is static
and does not change across reset.

In fact, the current code may have the assumption
that mlx_features can be reloaded from firmware
via the .get_features ops after device is reset
(via the .set_status ops), which is unfortunately
not true. The userspace VMM might save a copy
of backend capable features and won't call into
kernel again to get it on reset. This causes all
virtio features getting disabled on newly created
virtqs after device reset, while guest would hold
mismatched view of available features. For e.g.,
the guest may still assume tx checksum offload
is available after reset and feature negotiation,
causing frames with bogus (incomplete) checksum
transmitted on the wire.

Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
---
 drivers/vdpa/mlx5/net/mlx5_vnet.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c
index b8416c4..aa6f8cd 100644
--- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
+++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
@@ -1788,7 +1788,6 @@ static void mlx5_vdpa_set_status(struct vdpa_device *vdev, u8 status)
 		clear_virtqueues(ndev);
 		mlx5_vdpa_destroy_mr(&ndev->mvdev);
 		ndev->mvdev.status = 0;
-		ndev->mvdev.mlx_features = 0;
 		++mvdev->generation;
 		return;
 	}
-- 
1.8.3.1

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 3/3] mlx5_vdpa: defer clear_virtqueues to until DRIVER_OK
  2021-02-06 12:29 ` Si-Wei Liu
@ 2021-02-06 12:29   ` Si-Wei Liu
  -1 siblings, 0 replies; 27+ messages in thread
From: Si-Wei Liu @ 2021-02-06 12:29 UTC (permalink / raw)
  To: mst, jasowang, elic; +Cc: linux-kernel, virtualization, netdev, si-wei.liu

While virtq is stopped,  get_vq_state() is supposed to
be  called to  get  sync'ed  with  the latest internal
avail_index from device. The saved avail_index is used
to restate  the virtq  once device is started.  Commit
b35ccebe3ef7 introduced the clear_virtqueues() routine
to  reset  the saved  avail_index,  however, the index
gets cleared a bit earlier before get_vq_state() tries
to read it. This would cause consistency problems when
virtq is restarted, e.g. through a series of link down
and link up events. We  could  defer  the  clearing of
avail_index  to  until  the  device  is to be started,
i.e. until  VIRTIO_CONFIG_S_DRIVER_OK  is set again in
set_status().

Fixes: b35ccebe3ef7 ("vdpa/mlx5: Restore the hardware used index after change map")
Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
---
 drivers/vdpa/mlx5/net/mlx5_vnet.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c
index aa6f8cd..444ab58 100644
--- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
+++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
@@ -1785,7 +1785,6 @@ static void mlx5_vdpa_set_status(struct vdpa_device *vdev, u8 status)
 	if (!status) {
 		mlx5_vdpa_info(mvdev, "performing device reset\n");
 		teardown_driver(ndev);
-		clear_virtqueues(ndev);
 		mlx5_vdpa_destroy_mr(&ndev->mvdev);
 		ndev->mvdev.status = 0;
 		++mvdev->generation;
@@ -1794,6 +1793,7 @@ static void mlx5_vdpa_set_status(struct vdpa_device *vdev, u8 status)
 
 	if ((status ^ ndev->mvdev.status) & VIRTIO_CONFIG_S_DRIVER_OK) {
 		if (status & VIRTIO_CONFIG_S_DRIVER_OK) {
+			clear_virtqueues(ndev);
 			err = setup_driver(ndev);
 			if (err) {
 				mlx5_vdpa_warn(mvdev, "failed to setup driver\n");
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 3/3] mlx5_vdpa: defer clear_virtqueues to until DRIVER_OK
@ 2021-02-06 12:29   ` Si-Wei Liu
  0 siblings, 0 replies; 27+ messages in thread
From: Si-Wei Liu @ 2021-02-06 12:29 UTC (permalink / raw)
  To: mst, jasowang, elic; +Cc: si-wei.liu, netdev, linux-kernel, virtualization

While virtq is stopped,  get_vq_state() is supposed to
be  called to  get  sync'ed  with  the latest internal
avail_index from device. The saved avail_index is used
to restate  the virtq  once device is started.  Commit
b35ccebe3ef7 introduced the clear_virtqueues() routine
to  reset  the saved  avail_index,  however, the index
gets cleared a bit earlier before get_vq_state() tries
to read it. This would cause consistency problems when
virtq is restarted, e.g. through a series of link down
and link up events. We  could  defer  the  clearing of
avail_index  to  until  the  device  is to be started,
i.e. until  VIRTIO_CONFIG_S_DRIVER_OK  is set again in
set_status().

Fixes: b35ccebe3ef7 ("vdpa/mlx5: Restore the hardware used index after change map")
Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
---
 drivers/vdpa/mlx5/net/mlx5_vnet.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c
index aa6f8cd..444ab58 100644
--- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
+++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
@@ -1785,7 +1785,6 @@ static void mlx5_vdpa_set_status(struct vdpa_device *vdev, u8 status)
 	if (!status) {
 		mlx5_vdpa_info(mvdev, "performing device reset\n");
 		teardown_driver(ndev);
-		clear_virtqueues(ndev);
 		mlx5_vdpa_destroy_mr(&ndev->mvdev);
 		ndev->mvdev.status = 0;
 		++mvdev->generation;
@@ -1794,6 +1793,7 @@ static void mlx5_vdpa_set_status(struct vdpa_device *vdev, u8 status)
 
 	if ((status ^ ndev->mvdev.status) & VIRTIO_CONFIG_S_DRIVER_OK) {
 		if (status & VIRTIO_CONFIG_S_DRIVER_OK) {
+			clear_virtqueues(ndev);
 			err = setup_driver(ndev);
 			if (err) {
 				mlx5_vdpa_warn(mvdev, "failed to setup driver\n");
-- 
1.8.3.1

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [PATCH 2/3] mlx5_vdpa: fix feature negotiation across device reset
  2021-02-06 12:29   ` Si-Wei Liu
@ 2021-02-08  4:37     ` Jason Wang
  -1 siblings, 0 replies; 27+ messages in thread
From: Jason Wang @ 2021-02-08  4:37 UTC (permalink / raw)
  To: Si-Wei Liu, mst, elic; +Cc: linux-kernel, virtualization, netdev


On 2021/2/6 下午8:29, Si-Wei Liu wrote:
> The mlx_features denotes the capability for which
> set of virtio features is supported by device. In
> principle, this field needs not be cleared during
> virtio device reset, as this capability is static
> and does not change across reset.
>
> In fact, the current code may have the assumption
> that mlx_features can be reloaded from firmware
> via the .get_features ops after device is reset
> (via the .set_status ops), which is unfortunately
> not true. The userspace VMM might save a copy
> of backend capable features and won't call into
> kernel again to get it on reset.


This is not the behavior of Qemu but it's valid.


> This causes all
> virtio features getting disabled on newly created
> virtqs after device reset, while guest would hold
> mismatched view of available features. For e.g.,
> the guest may still assume tx checksum offload
> is available after reset and feature negotiation,
> causing frames with bogus (incomplete) checksum
> transmitted on the wire.
>
> Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>


Acked-by: Jason Wang <jasowang@redhat.com>


> ---
>   drivers/vdpa/mlx5/net/mlx5_vnet.c | 1 -
>   1 file changed, 1 deletion(-)
>
> diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> index b8416c4..aa6f8cd 100644
> --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
> +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> @@ -1788,7 +1788,6 @@ static void mlx5_vdpa_set_status(struct vdpa_device *vdev, u8 status)
>   		clear_virtqueues(ndev);
>   		mlx5_vdpa_destroy_mr(&ndev->mvdev);
>   		ndev->mvdev.status = 0;
> -		ndev->mvdev.mlx_features = 0;
>   		++mvdev->generation;
>   		return;
>   	}


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 2/3] mlx5_vdpa: fix feature negotiation across device reset
@ 2021-02-08  4:37     ` Jason Wang
  0 siblings, 0 replies; 27+ messages in thread
From: Jason Wang @ 2021-02-08  4:37 UTC (permalink / raw)
  To: Si-Wei Liu, mst, elic; +Cc: netdev, linux-kernel, virtualization


On 2021/2/6 下午8:29, Si-Wei Liu wrote:
> The mlx_features denotes the capability for which
> set of virtio features is supported by device. In
> principle, this field needs not be cleared during
> virtio device reset, as this capability is static
> and does not change across reset.
>
> In fact, the current code may have the assumption
> that mlx_features can be reloaded from firmware
> via the .get_features ops after device is reset
> (via the .set_status ops), which is unfortunately
> not true. The userspace VMM might save a copy
> of backend capable features and won't call into
> kernel again to get it on reset.


This is not the behavior of Qemu but it's valid.


> This causes all
> virtio features getting disabled on newly created
> virtqs after device reset, while guest would hold
> mismatched view of available features. For e.g.,
> the guest may still assume tx checksum offload
> is available after reset and feature negotiation,
> causing frames with bogus (incomplete) checksum
> transmitted on the wire.
>
> Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>


Acked-by: Jason Wang <jasowang@redhat.com>


> ---
>   drivers/vdpa/mlx5/net/mlx5_vnet.c | 1 -
>   1 file changed, 1 deletion(-)
>
> diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> index b8416c4..aa6f8cd 100644
> --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
> +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> @@ -1788,7 +1788,6 @@ static void mlx5_vdpa_set_status(struct vdpa_device *vdev, u8 status)
>   		clear_virtqueues(ndev);
>   		mlx5_vdpa_destroy_mr(&ndev->mvdev);
>   		ndev->mvdev.status = 0;
> -		ndev->mvdev.mlx_features = 0;
>   		++mvdev->generation;
>   		return;
>   	}

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 3/3] mlx5_vdpa: defer clear_virtqueues to until DRIVER_OK
  2021-02-06 12:29   ` Si-Wei Liu
@ 2021-02-08  4:38     ` Jason Wang
  -1 siblings, 0 replies; 27+ messages in thread
From: Jason Wang @ 2021-02-08  4:38 UTC (permalink / raw)
  To: Si-Wei Liu, mst, elic; +Cc: linux-kernel, virtualization, netdev


On 2021/2/6 下午8:29, Si-Wei Liu wrote:
> While virtq is stopped,  get_vq_state() is supposed to
> be  called to  get  sync'ed  with  the latest internal
> avail_index from device. The saved avail_index is used
> to restate  the virtq  once device is started.  Commit
> b35ccebe3ef7 introduced the clear_virtqueues() routine
> to  reset  the saved  avail_index,  however, the index
> gets cleared a bit earlier before get_vq_state() tries
> to read it. This would cause consistency problems when
> virtq is restarted, e.g. through a series of link down
> and link up events. We  could  defer  the  clearing of
> avail_index  to  until  the  device  is to be started,
> i.e. until  VIRTIO_CONFIG_S_DRIVER_OK  is set again in
> set_status().
>
> Fixes: b35ccebe3ef7 ("vdpa/mlx5: Restore the hardware used index after change map")
> Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>


Acked-by: Jason Wang <jasowang@redhat.com>


> ---
>   drivers/vdpa/mlx5/net/mlx5_vnet.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> index aa6f8cd..444ab58 100644
> --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
> +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> @@ -1785,7 +1785,6 @@ static void mlx5_vdpa_set_status(struct vdpa_device *vdev, u8 status)
>   	if (!status) {
>   		mlx5_vdpa_info(mvdev, "performing device reset\n");
>   		teardown_driver(ndev);
> -		clear_virtqueues(ndev);
>   		mlx5_vdpa_destroy_mr(&ndev->mvdev);
>   		ndev->mvdev.status = 0;
>   		++mvdev->generation;
> @@ -1794,6 +1793,7 @@ static void mlx5_vdpa_set_status(struct vdpa_device *vdev, u8 status)
>   
>   	if ((status ^ ndev->mvdev.status) & VIRTIO_CONFIG_S_DRIVER_OK) {
>   		if (status & VIRTIO_CONFIG_S_DRIVER_OK) {
> +			clear_virtqueues(ndev);
>   			err = setup_driver(ndev);
>   			if (err) {
>   				mlx5_vdpa_warn(mvdev, "failed to setup driver\n");


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 3/3] mlx5_vdpa: defer clear_virtqueues to until DRIVER_OK
@ 2021-02-08  4:38     ` Jason Wang
  0 siblings, 0 replies; 27+ messages in thread
From: Jason Wang @ 2021-02-08  4:38 UTC (permalink / raw)
  To: Si-Wei Liu, mst, elic; +Cc: netdev, linux-kernel, virtualization


On 2021/2/6 下午8:29, Si-Wei Liu wrote:
> While virtq is stopped,  get_vq_state() is supposed to
> be  called to  get  sync'ed  with  the latest internal
> avail_index from device. The saved avail_index is used
> to restate  the virtq  once device is started.  Commit
> b35ccebe3ef7 introduced the clear_virtqueues() routine
> to  reset  the saved  avail_index,  however, the index
> gets cleared a bit earlier before get_vq_state() tries
> to read it. This would cause consistency problems when
> virtq is restarted, e.g. through a series of link down
> and link up events. We  could  defer  the  clearing of
> avail_index  to  until  the  device  is to be started,
> i.e. until  VIRTIO_CONFIG_S_DRIVER_OK  is set again in
> set_status().
>
> Fixes: b35ccebe3ef7 ("vdpa/mlx5: Restore the hardware used index after change map")
> Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>


Acked-by: Jason Wang <jasowang@redhat.com>


> ---
>   drivers/vdpa/mlx5/net/mlx5_vnet.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> index aa6f8cd..444ab58 100644
> --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
> +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> @@ -1785,7 +1785,6 @@ static void mlx5_vdpa_set_status(struct vdpa_device *vdev, u8 status)
>   	if (!status) {
>   		mlx5_vdpa_info(mvdev, "performing device reset\n");
>   		teardown_driver(ndev);
> -		clear_virtqueues(ndev);
>   		mlx5_vdpa_destroy_mr(&ndev->mvdev);
>   		ndev->mvdev.status = 0;
>   		++mvdev->generation;
> @@ -1794,6 +1793,7 @@ static void mlx5_vdpa_set_status(struct vdpa_device *vdev, u8 status)
>   
>   	if ((status ^ ndev->mvdev.status) & VIRTIO_CONFIG_S_DRIVER_OK) {
>   		if (status & VIRTIO_CONFIG_S_DRIVER_OK) {
> +			clear_virtqueues(ndev);
>   			err = setup_driver(ndev);
>   			if (err) {
>   				mlx5_vdpa_warn(mvdev, "failed to setup driver\n");

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 1/3] mlx5_vdpa: should exclude header length and fcs from mtu
  2021-02-06 12:29 ` Si-Wei Liu
@ 2021-02-08  4:38   ` Jason Wang
  -1 siblings, 0 replies; 27+ messages in thread
From: Jason Wang @ 2021-02-08  4:38 UTC (permalink / raw)
  To: Si-Wei Liu, mst, elic; +Cc: linux-kernel, virtualization, netdev


On 2021/2/6 下午8:29, Si-Wei Liu wrote:
> When feature VIRTIO_NET_F_MTU is negotiated on mlx5_vdpa,
> 22 extra bytes worth of MTU length is shown in guest.
> This is because the mlx5_query_port_max_mtu API returns
> the "hardware" MTU value, which does not just contain the
> Ethernet payload, but includes extra lengths starting
> from the Ethernet header up to the FCS altogether.
>
> Fix the MTU so packets won't get dropped silently.
>
> Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>


Acked-by: Jason Wang <jasowang@redhat.com>


> ---
>   drivers/vdpa/mlx5/core/mlx5_vdpa.h |  4 ++++
>   drivers/vdpa/mlx5/net/mlx5_vnet.c  | 15 ++++++++++++++-
>   2 files changed, 18 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/vdpa/mlx5/core/mlx5_vdpa.h b/drivers/vdpa/mlx5/core/mlx5_vdpa.h
> index 08f742f..b6cc53b 100644
> --- a/drivers/vdpa/mlx5/core/mlx5_vdpa.h
> +++ b/drivers/vdpa/mlx5/core/mlx5_vdpa.h
> @@ -4,9 +4,13 @@
>   #ifndef __MLX5_VDPA_H__
>   #define __MLX5_VDPA_H__
>   
> +#include <linux/etherdevice.h>
> +#include <linux/if_vlan.h>
>   #include <linux/vdpa.h>
>   #include <linux/mlx5/driver.h>
>   
> +#define MLX5V_ETH_HARD_MTU (ETH_HLEN + VLAN_HLEN + ETH_FCS_LEN)
> +
>   struct mlx5_vdpa_direct_mr {
>   	u64 start;
>   	u64 end;
> diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> index dc88559..b8416c4 100644
> --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
> +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> @@ -1907,6 +1907,19 @@ static int mlx5_get_vq_irq(struct vdpa_device *vdv, u16 idx)
>   	.free = mlx5_vdpa_free,
>   };
>   
> +static int query_mtu(struct mlx5_core_dev *mdev, u16 *mtu)
> +{
> +	u16 hw_mtu;
> +	int err;
> +
> +	err = mlx5_query_nic_vport_mtu(mdev, &hw_mtu);
> +	if (err)
> +		return err;
> +
> +	*mtu = hw_mtu - MLX5V_ETH_HARD_MTU;
> +	return 0;
> +}
> +
>   static int alloc_resources(struct mlx5_vdpa_net *ndev)
>   {
>   	struct mlx5_vdpa_net_resources *res = &ndev->res;
> @@ -1992,7 +2005,7 @@ static int mlx5v_probe(struct auxiliary_device *adev,
>   	init_mvqs(ndev);
>   	mutex_init(&ndev->reslock);
>   	config = &ndev->config;
> -	err = mlx5_query_nic_vport_mtu(mdev, &ndev->mtu);
> +	err = query_mtu(mdev, &ndev->mtu);
>   	if (err)
>   		goto err_mtu;
>   


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 1/3] mlx5_vdpa: should exclude header length and fcs from mtu
@ 2021-02-08  4:38   ` Jason Wang
  0 siblings, 0 replies; 27+ messages in thread
From: Jason Wang @ 2021-02-08  4:38 UTC (permalink / raw)
  To: Si-Wei Liu, mst, elic; +Cc: netdev, linux-kernel, virtualization


On 2021/2/6 下午8:29, Si-Wei Liu wrote:
> When feature VIRTIO_NET_F_MTU is negotiated on mlx5_vdpa,
> 22 extra bytes worth of MTU length is shown in guest.
> This is because the mlx5_query_port_max_mtu API returns
> the "hardware" MTU value, which does not just contain the
> Ethernet payload, but includes extra lengths starting
> from the Ethernet header up to the FCS altogether.
>
> Fix the MTU so packets won't get dropped silently.
>
> Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>


Acked-by: Jason Wang <jasowang@redhat.com>


> ---
>   drivers/vdpa/mlx5/core/mlx5_vdpa.h |  4 ++++
>   drivers/vdpa/mlx5/net/mlx5_vnet.c  | 15 ++++++++++++++-
>   2 files changed, 18 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/vdpa/mlx5/core/mlx5_vdpa.h b/drivers/vdpa/mlx5/core/mlx5_vdpa.h
> index 08f742f..b6cc53b 100644
> --- a/drivers/vdpa/mlx5/core/mlx5_vdpa.h
> +++ b/drivers/vdpa/mlx5/core/mlx5_vdpa.h
> @@ -4,9 +4,13 @@
>   #ifndef __MLX5_VDPA_H__
>   #define __MLX5_VDPA_H__
>   
> +#include <linux/etherdevice.h>
> +#include <linux/if_vlan.h>
>   #include <linux/vdpa.h>
>   #include <linux/mlx5/driver.h>
>   
> +#define MLX5V_ETH_HARD_MTU (ETH_HLEN + VLAN_HLEN + ETH_FCS_LEN)
> +
>   struct mlx5_vdpa_direct_mr {
>   	u64 start;
>   	u64 end;
> diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> index dc88559..b8416c4 100644
> --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
> +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> @@ -1907,6 +1907,19 @@ static int mlx5_get_vq_irq(struct vdpa_device *vdv, u16 idx)
>   	.free = mlx5_vdpa_free,
>   };
>   
> +static int query_mtu(struct mlx5_core_dev *mdev, u16 *mtu)
> +{
> +	u16 hw_mtu;
> +	int err;
> +
> +	err = mlx5_query_nic_vport_mtu(mdev, &hw_mtu);
> +	if (err)
> +		return err;
> +
> +	*mtu = hw_mtu - MLX5V_ETH_HARD_MTU;
> +	return 0;
> +}
> +
>   static int alloc_resources(struct mlx5_vdpa_net *ndev)
>   {
>   	struct mlx5_vdpa_net_resources *res = &ndev->res;
> @@ -1992,7 +2005,7 @@ static int mlx5v_probe(struct auxiliary_device *adev,
>   	init_mvqs(ndev);
>   	mutex_init(&ndev->reslock);
>   	config = &ndev->config;
> -	err = mlx5_query_nic_vport_mtu(mdev, &ndev->mtu);
> +	err = query_mtu(mdev, &ndev->mtu);
>   	if (err)
>   		goto err_mtu;
>   

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 2/3] mlx5_vdpa: fix feature negotiation across device reset
  2021-02-06 12:29   ` Si-Wei Liu
  (?)
  (?)
@ 2021-02-08  5:35   ` Eli Cohen
  2021-02-09  1:20       ` Si-Wei Liu
  -1 siblings, 1 reply; 27+ messages in thread
From: Eli Cohen @ 2021-02-08  5:35 UTC (permalink / raw)
  To: Si-Wei Liu; +Cc: mst, jasowang, linux-kernel, virtualization, netdev

On Sat, Feb 06, 2021 at 04:29:23AM -0800, Si-Wei Liu wrote:
> The mlx_features denotes the capability for which
> set of virtio features is supported by device. In
> principle, this field needs not be cleared during
> virtio device reset, as this capability is static
> and does not change across reset.
> 
> In fact, the current code may have the assumption
> that mlx_features can be reloaded from firmware
> via the .get_features ops after device is reset
> (via the .set_status ops), which is unfortunately
> not true. The userspace VMM might save a copy
> of backend capable features and won't call into
> kernel again to get it on reset. This causes all
> virtio features getting disabled on newly created
> virtqs after device reset, while guest would hold
> mismatched view of available features. For e.g.,
> the guest may still assume tx checksum offload
> is available after reset and feature negotiation,
> causing frames with bogus (incomplete) checksum
> transmitted on the wire.
> 
> Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
> ---
>  drivers/vdpa/mlx5/net/mlx5_vnet.c | 1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> index b8416c4..aa6f8cd 100644
> --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
> +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> @@ -1788,7 +1788,6 @@ static void mlx5_vdpa_set_status(struct vdpa_device *vdev, u8 status)
>  		clear_virtqueues(ndev);
>  		mlx5_vdpa_destroy_mr(&ndev->mvdev);
>  		ndev->mvdev.status = 0;
> -		ndev->mvdev.mlx_features = 0;
>  		++mvdev->generation;
>  		return;
>  	}

Since we assume that device capabilities don't change, I think I would
get the features through a call done in mlx5v_probe after the netdev
object is created and change mlx5_vdpa_get_features() to just return
ndev->mvdev.mlx_features.

Did you actually see this issue in action? If you did, can you share
with us how you trigerred this?

> -- 
> 1.8.3.1
> 

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 1/3] mlx5_vdpa: should exclude header length and fcs from mtu
  2021-02-06 12:29 ` Si-Wei Liu
                   ` (3 preceding siblings ...)
  (?)
@ 2021-02-08  5:35 ` Eli Cohen
  -1 siblings, 0 replies; 27+ messages in thread
From: Eli Cohen @ 2021-02-08  5:35 UTC (permalink / raw)
  To: Si-Wei Liu; +Cc: mst, jasowang, linux-kernel, virtualization, netdev

On Sat, Feb 06, 2021 at 04:29:22AM -0800, Si-Wei Liu wrote:
> When feature VIRTIO_NET_F_MTU is negotiated on mlx5_vdpa,
> 22 extra bytes worth of MTU length is shown in guest.
> This is because the mlx5_query_port_max_mtu API returns
> the "hardware" MTU value, which does not just contain the
> Ethernet payload, but includes extra lengths starting
> from the Ethernet header up to the FCS altogether.
> 
> Fix the MTU so packets won't get dropped silently.
> 
> Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>

Acked-by: Eli Cohen <elic@nvidia.com>

> ---
>  drivers/vdpa/mlx5/core/mlx5_vdpa.h |  4 ++++
>  drivers/vdpa/mlx5/net/mlx5_vnet.c  | 15 ++++++++++++++-
>  2 files changed, 18 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/vdpa/mlx5/core/mlx5_vdpa.h b/drivers/vdpa/mlx5/core/mlx5_vdpa.h
> index 08f742f..b6cc53b 100644
> --- a/drivers/vdpa/mlx5/core/mlx5_vdpa.h
> +++ b/drivers/vdpa/mlx5/core/mlx5_vdpa.h
> @@ -4,9 +4,13 @@
>  #ifndef __MLX5_VDPA_H__
>  #define __MLX5_VDPA_H__
>  
> +#include <linux/etherdevice.h>
> +#include <linux/if_vlan.h>
>  #include <linux/vdpa.h>
>  #include <linux/mlx5/driver.h>
>  
> +#define MLX5V_ETH_HARD_MTU (ETH_HLEN + VLAN_HLEN + ETH_FCS_LEN)
> +
>  struct mlx5_vdpa_direct_mr {
>  	u64 start;
>  	u64 end;
> diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> index dc88559..b8416c4 100644
> --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
> +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> @@ -1907,6 +1907,19 @@ static int mlx5_get_vq_irq(struct vdpa_device *vdv, u16 idx)
>  	.free = mlx5_vdpa_free,
>  };
>  
> +static int query_mtu(struct mlx5_core_dev *mdev, u16 *mtu)
> +{
> +	u16 hw_mtu;
> +	int err;
> +
> +	err = mlx5_query_nic_vport_mtu(mdev, &hw_mtu);
> +	if (err)
> +		return err;
> +
> +	*mtu = hw_mtu - MLX5V_ETH_HARD_MTU;
> +	return 0;
> +}
> +
>  static int alloc_resources(struct mlx5_vdpa_net *ndev)
>  {
>  	struct mlx5_vdpa_net_resources *res = &ndev->res;
> @@ -1992,7 +2005,7 @@ static int mlx5v_probe(struct auxiliary_device *adev,
>  	init_mvqs(ndev);
>  	mutex_init(&ndev->reslock);
>  	config = &ndev->config;
> -	err = mlx5_query_nic_vport_mtu(mdev, &ndev->mtu);
> +	err = query_mtu(mdev, &ndev->mtu);
>  	if (err)
>  		goto err_mtu;
>  
> -- 
> 1.8.3.1
> 

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 3/3] mlx5_vdpa: defer clear_virtqueues to until DRIVER_OK
  2021-02-06 12:29   ` Si-Wei Liu
  (?)
  (?)
@ 2021-02-08  5:48   ` Eli Cohen
  2021-02-09  1:40       ` Si-Wei Liu
  -1 siblings, 1 reply; 27+ messages in thread
From: Eli Cohen @ 2021-02-08  5:48 UTC (permalink / raw)
  To: Si-Wei Liu; +Cc: mst, jasowang, linux-kernel, virtualization, netdev

On Sat, Feb 06, 2021 at 04:29:24AM -0800, Si-Wei Liu wrote:
> While virtq is stopped,  get_vq_state() is supposed to
> be  called to  get  sync'ed  with  the latest internal
> avail_index from device. The saved avail_index is used
> to restate  the virtq  once device is started.  Commit
> b35ccebe3ef7 introduced the clear_virtqueues() routine
> to  reset  the saved  avail_index,  however, the index
> gets cleared a bit earlier before get_vq_state() tries
> to read it. This would cause consistency problems when
> virtq is restarted, e.g. through a series of link down
> and link up events. We  could  defer  the  clearing of
> avail_index  to  until  the  device  is to be started,
> i.e. until  VIRTIO_CONFIG_S_DRIVER_OK  is set again in
> set_status().


Not sure I understand the scenario. You are talking about reset of the
device followed by up/down events on the interface. How can you trigger
this?

> 
> Fixes: b35ccebe3ef7 ("vdpa/mlx5: Restore the hardware used index after change map")
> Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
> ---
>  drivers/vdpa/mlx5/net/mlx5_vnet.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> index aa6f8cd..444ab58 100644
> --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
> +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> @@ -1785,7 +1785,6 @@ static void mlx5_vdpa_set_status(struct vdpa_device *vdev, u8 status)
>  	if (!status) {
>  		mlx5_vdpa_info(mvdev, "performing device reset\n");
>  		teardown_driver(ndev);
> -		clear_virtqueues(ndev);
>  		mlx5_vdpa_destroy_mr(&ndev->mvdev);
>  		ndev->mvdev.status = 0;
>  		++mvdev->generation;
> @@ -1794,6 +1793,7 @@ static void mlx5_vdpa_set_status(struct vdpa_device *vdev, u8 status)
>  
>  	if ((status ^ ndev->mvdev.status) & VIRTIO_CONFIG_S_DRIVER_OK) {
>  		if (status & VIRTIO_CONFIG_S_DRIVER_OK) {
> +			clear_virtqueues(ndev);
>  			err = setup_driver(ndev);
>  			if (err) {
>  				mlx5_vdpa_warn(mvdev, "failed to setup driver\n");
> -- 
> 1.8.3.1
> 

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 2/3] mlx5_vdpa: fix feature negotiation across device reset
  2021-02-08  5:35   ` Eli Cohen
@ 2021-02-09  1:20       ` Si-Wei Liu
  0 siblings, 0 replies; 27+ messages in thread
From: Si-Wei Liu @ 2021-02-09  1:20 UTC (permalink / raw)
  To: Eli Cohen; +Cc: mst, jasowang, linux-kernel, virtualization, netdev



On 2/7/2021 9:35 PM, Eli Cohen wrote:
> On Sat, Feb 06, 2021 at 04:29:23AM -0800, Si-Wei Liu wrote:
>> The mlx_features denotes the capability for which
>> set of virtio features is supported by device. In
>> principle, this field needs not be cleared during
>> virtio device reset, as this capability is static
>> and does not change across reset.
>>
>> In fact, the current code may have the assumption
>> that mlx_features can be reloaded from firmware
>> via the .get_features ops after device is reset
>> (via the .set_status ops), which is unfortunately
>> not true. The userspace VMM might save a copy
>> of backend capable features and won't call into
>> kernel again to get it on reset. This causes all
>> virtio features getting disabled on newly created
>> virtqs after device reset, while guest would hold
>> mismatched view of available features. For e.g.,
>> the guest may still assume tx checksum offload
>> is available after reset and feature negotiation,
>> causing frames with bogus (incomplete) checksum
>> transmitted on the wire.
>>
>> Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
>> ---
>>   drivers/vdpa/mlx5/net/mlx5_vnet.c | 1 -
>>   1 file changed, 1 deletion(-)
>>
>> diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c
>> index b8416c4..aa6f8cd 100644
>> --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
>> +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
>> @@ -1788,7 +1788,6 @@ static void mlx5_vdpa_set_status(struct vdpa_device *vdev, u8 status)
>>   		clear_virtqueues(ndev);
>>   		mlx5_vdpa_destroy_mr(&ndev->mvdev);
>>   		ndev->mvdev.status = 0;
>> -		ndev->mvdev.mlx_features = 0;
>>   		++mvdev->generation;
>>   		return;
>>   	}
> Since we assume that device capabilities don't change, I think I would
> get the features through a call done in mlx5v_probe after the netdev
> object is created and change mlx5_vdpa_get_features() to just return
> ndev->mvdev.mlx_features.
Yep, it makes sense. Will post a revised patch. If vdpa tool allows 
reconfiguration post probing, the code has to be reconciled then.

>
> Did you actually see this issue in action? If you did, can you share
> with us how you trigerred this?
Issue is indeed seen in action. The mismatched tx-checksum offload as 
described in the commit message was one of such examples. You would need 
a guest reboot though (triggering device reset via the .set_status ops 
and zero'ed mlx_features was loaded to deduce new actual_features for 
creating the h/w virtq object) for repro.

-Siwei
>
>> -- 
>> 1.8.3.1
>>


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 2/3] mlx5_vdpa: fix feature negotiation across device reset
@ 2021-02-09  1:20       ` Si-Wei Liu
  0 siblings, 0 replies; 27+ messages in thread
From: Si-Wei Liu @ 2021-02-09  1:20 UTC (permalink / raw)
  To: Eli Cohen; +Cc: netdev, virtualization, linux-kernel, mst



On 2/7/2021 9:35 PM, Eli Cohen wrote:
> On Sat, Feb 06, 2021 at 04:29:23AM -0800, Si-Wei Liu wrote:
>> The mlx_features denotes the capability for which
>> set of virtio features is supported by device. In
>> principle, this field needs not be cleared during
>> virtio device reset, as this capability is static
>> and does not change across reset.
>>
>> In fact, the current code may have the assumption
>> that mlx_features can be reloaded from firmware
>> via the .get_features ops after device is reset
>> (via the .set_status ops), which is unfortunately
>> not true. The userspace VMM might save a copy
>> of backend capable features and won't call into
>> kernel again to get it on reset. This causes all
>> virtio features getting disabled on newly created
>> virtqs after device reset, while guest would hold
>> mismatched view of available features. For e.g.,
>> the guest may still assume tx checksum offload
>> is available after reset and feature negotiation,
>> causing frames with bogus (incomplete) checksum
>> transmitted on the wire.
>>
>> Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
>> ---
>>   drivers/vdpa/mlx5/net/mlx5_vnet.c | 1 -
>>   1 file changed, 1 deletion(-)
>>
>> diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c
>> index b8416c4..aa6f8cd 100644
>> --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
>> +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
>> @@ -1788,7 +1788,6 @@ static void mlx5_vdpa_set_status(struct vdpa_device *vdev, u8 status)
>>   		clear_virtqueues(ndev);
>>   		mlx5_vdpa_destroy_mr(&ndev->mvdev);
>>   		ndev->mvdev.status = 0;
>> -		ndev->mvdev.mlx_features = 0;
>>   		++mvdev->generation;
>>   		return;
>>   	}
> Since we assume that device capabilities don't change, I think I would
> get the features through a call done in mlx5v_probe after the netdev
> object is created and change mlx5_vdpa_get_features() to just return
> ndev->mvdev.mlx_features.
Yep, it makes sense. Will post a revised patch. If vdpa tool allows 
reconfiguration post probing, the code has to be reconciled then.

>
> Did you actually see this issue in action? If you did, can you share
> with us how you trigerred this?
Issue is indeed seen in action. The mismatched tx-checksum offload as 
described in the commit message was one of such examples. You would need 
a guest reboot though (triggering device reset via the .set_status ops 
and zero'ed mlx_features was loaded to deduce new actual_features for 
creating the h/w virtq object) for repro.

-Siwei
>
>> -- 
>> 1.8.3.1
>>

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 3/3] mlx5_vdpa: defer clear_virtqueues to until DRIVER_OK
  2021-02-08  5:48   ` Eli Cohen
@ 2021-02-09  1:40       ` Si-Wei Liu
  0 siblings, 0 replies; 27+ messages in thread
From: Si-Wei Liu @ 2021-02-09  1:40 UTC (permalink / raw)
  To: Eli Cohen; +Cc: mst, jasowang, linux-kernel, virtualization, netdev



On 2/7/2021 9:48 PM, Eli Cohen wrote:
> On Sat, Feb 06, 2021 at 04:29:24AM -0800, Si-Wei Liu wrote:
>> While virtq is stopped,  get_vq_state() is supposed to
>> be  called to  get  sync'ed  with  the latest internal
>> avail_index from device. The saved avail_index is used
>> to restate  the virtq  once device is started.  Commit
>> b35ccebe3ef7 introduced the clear_virtqueues() routine
>> to  reset  the saved  avail_index,  however, the index
>> gets cleared a bit earlier before get_vq_state() tries
>> to read it. This would cause consistency problems when
>> virtq is restarted, e.g. through a series of link down
>> and link up events. We  could  defer  the  clearing of
>> avail_index  to  until  the  device  is to be started,
>> i.e. until  VIRTIO_CONFIG_S_DRIVER_OK  is set again in
>> set_status().
>
> Not sure I understand the scenario. You are talking about reset of the
> device followed by up/down events on the interface. How can you trigger
> this?
Currently it's not possible to trigger link up/down events with upstream 
QEMU due lack of config/control interrupt implementation. And live 
migration could be another scenario that cannot be satisfied because of 
inconsistent queue state. They share the same root of cause as captured 
here.

-Siwei

>
>> Fixes: b35ccebe3ef7 ("vdpa/mlx5: Restore the hardware used index after change map")
>> Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
>> ---
>>   drivers/vdpa/mlx5/net/mlx5_vnet.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c
>> index aa6f8cd..444ab58 100644
>> --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
>> +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
>> @@ -1785,7 +1785,6 @@ static void mlx5_vdpa_set_status(struct vdpa_device *vdev, u8 status)
>>   	if (!status) {
>>   		mlx5_vdpa_info(mvdev, "performing device reset\n");
>>   		teardown_driver(ndev);
>> -		clear_virtqueues(ndev);
>>   		mlx5_vdpa_destroy_mr(&ndev->mvdev);
>>   		ndev->mvdev.status = 0;
>>   		++mvdev->generation;
>> @@ -1794,6 +1793,7 @@ static void mlx5_vdpa_set_status(struct vdpa_device *vdev, u8 status)
>>   
>>   	if ((status ^ ndev->mvdev.status) & VIRTIO_CONFIG_S_DRIVER_OK) {
>>   		if (status & VIRTIO_CONFIG_S_DRIVER_OK) {
>> +			clear_virtqueues(ndev);
>>   			err = setup_driver(ndev);
>>   			if (err) {
>>   				mlx5_vdpa_warn(mvdev, "failed to setup driver\n");
>> -- 
>> 1.8.3.1
>>


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 3/3] mlx5_vdpa: defer clear_virtqueues to until DRIVER_OK
@ 2021-02-09  1:40       ` Si-Wei Liu
  0 siblings, 0 replies; 27+ messages in thread
From: Si-Wei Liu @ 2021-02-09  1:40 UTC (permalink / raw)
  To: Eli Cohen; +Cc: netdev, virtualization, linux-kernel, mst



On 2/7/2021 9:48 PM, Eli Cohen wrote:
> On Sat, Feb 06, 2021 at 04:29:24AM -0800, Si-Wei Liu wrote:
>> While virtq is stopped,  get_vq_state() is supposed to
>> be  called to  get  sync'ed  with  the latest internal
>> avail_index from device. The saved avail_index is used
>> to restate  the virtq  once device is started.  Commit
>> b35ccebe3ef7 introduced the clear_virtqueues() routine
>> to  reset  the saved  avail_index,  however, the index
>> gets cleared a bit earlier before get_vq_state() tries
>> to read it. This would cause consistency problems when
>> virtq is restarted, e.g. through a series of link down
>> and link up events. We  could  defer  the  clearing of
>> avail_index  to  until  the  device  is to be started,
>> i.e. until  VIRTIO_CONFIG_S_DRIVER_OK  is set again in
>> set_status().
>
> Not sure I understand the scenario. You are talking about reset of the
> device followed by up/down events on the interface. How can you trigger
> this?
Currently it's not possible to trigger link up/down events with upstream 
QEMU due lack of config/control interrupt implementation. And live 
migration could be another scenario that cannot be satisfied because of 
inconsistent queue state. They share the same root of cause as captured 
here.

-Siwei

>
>> Fixes: b35ccebe3ef7 ("vdpa/mlx5: Restore the hardware used index after change map")
>> Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
>> ---
>>   drivers/vdpa/mlx5/net/mlx5_vnet.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c
>> index aa6f8cd..444ab58 100644
>> --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
>> +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
>> @@ -1785,7 +1785,6 @@ static void mlx5_vdpa_set_status(struct vdpa_device *vdev, u8 status)
>>   	if (!status) {
>>   		mlx5_vdpa_info(mvdev, "performing device reset\n");
>>   		teardown_driver(ndev);
>> -		clear_virtqueues(ndev);
>>   		mlx5_vdpa_destroy_mr(&ndev->mvdev);
>>   		ndev->mvdev.status = 0;
>>   		++mvdev->generation;
>> @@ -1794,6 +1793,7 @@ static void mlx5_vdpa_set_status(struct vdpa_device *vdev, u8 status)
>>   
>>   	if ((status ^ ndev->mvdev.status) & VIRTIO_CONFIG_S_DRIVER_OK) {
>>   		if (status & VIRTIO_CONFIG_S_DRIVER_OK) {
>> +			clear_virtqueues(ndev);
>>   			err = setup_driver(ndev);
>>   			if (err) {
>>   				mlx5_vdpa_warn(mvdev, "failed to setup driver\n");
>> -- 
>> 1.8.3.1
>>

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 3/3] mlx5_vdpa: defer clear_virtqueues to until DRIVER_OK
  2021-02-06 12:29   ` Si-Wei Liu
@ 2021-02-09  3:37     ` Jason Wang
  -1 siblings, 0 replies; 27+ messages in thread
From: Jason Wang @ 2021-02-09  3:37 UTC (permalink / raw)
  To: Si-Wei Liu, mst, elic; +Cc: linux-kernel, virtualization, netdev


On 2021/2/6 下午8:29, Si-Wei Liu wrote:
> While virtq is stopped,  get_vq_state() is supposed to
> be  called to  get  sync'ed  with  the latest internal
> avail_index from device. The saved avail_index is used
> to restate  the virtq  once device is started.  Commit
> b35ccebe3ef7 introduced the clear_virtqueues() routine
> to  reset  the saved  avail_index,  however, the index
> gets cleared a bit earlier before get_vq_state() tries
> to read it. This would cause consistency problems when
> virtq is restarted, e.g. through a series of link down
> and link up events. We  could  defer  the  clearing of
> avail_index  to  until  the  device  is to be started,
> i.e. until  VIRTIO_CONFIG_S_DRIVER_OK  is set again in
> set_status().
>
> Fixes: b35ccebe3ef7 ("vdpa/mlx5: Restore the hardware used index after change map")
> Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
> ---
>   drivers/vdpa/mlx5/net/mlx5_vnet.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> index aa6f8cd..444ab58 100644
> --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
> +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> @@ -1785,7 +1785,6 @@ static void mlx5_vdpa_set_status(struct vdpa_device *vdev, u8 status)
>   	if (!status) {
>   		mlx5_vdpa_info(mvdev, "performing device reset\n");
>   		teardown_driver(ndev);
> -		clear_virtqueues(ndev);
>   		mlx5_vdpa_destroy_mr(&ndev->mvdev);
>   		ndev->mvdev.status = 0;
>   		++mvdev->generation;
> @@ -1794,6 +1793,7 @@ static void mlx5_vdpa_set_status(struct vdpa_device *vdev, u8 status)
>   
>   	if ((status ^ ndev->mvdev.status) & VIRTIO_CONFIG_S_DRIVER_OK) {
>   		if (status & VIRTIO_CONFIG_S_DRIVER_OK) {
> +			clear_virtqueues(ndev);


Rethink about this. As mentioned in another thread, this in fact breaks 
set_vq_state().  (See vhost_virtqueue_start() -> 
vhost_vdpa_set_vring_base() in qemu codes).

The issue is that the avail idx is forgot, we need keep it.

Thanks


>   			err = setup_driver(ndev);
>   			if (err) {
>   				mlx5_vdpa_warn(mvdev, "failed to setup driver\n");


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 3/3] mlx5_vdpa: defer clear_virtqueues to until DRIVER_OK
@ 2021-02-09  3:37     ` Jason Wang
  0 siblings, 0 replies; 27+ messages in thread
From: Jason Wang @ 2021-02-09  3:37 UTC (permalink / raw)
  To: Si-Wei Liu, mst, elic; +Cc: netdev, linux-kernel, virtualization


On 2021/2/6 下午8:29, Si-Wei Liu wrote:
> While virtq is stopped,  get_vq_state() is supposed to
> be  called to  get  sync'ed  with  the latest internal
> avail_index from device. The saved avail_index is used
> to restate  the virtq  once device is started.  Commit
> b35ccebe3ef7 introduced the clear_virtqueues() routine
> to  reset  the saved  avail_index,  however, the index
> gets cleared a bit earlier before get_vq_state() tries
> to read it. This would cause consistency problems when
> virtq is restarted, e.g. through a series of link down
> and link up events. We  could  defer  the  clearing of
> avail_index  to  until  the  device  is to be started,
> i.e. until  VIRTIO_CONFIG_S_DRIVER_OK  is set again in
> set_status().
>
> Fixes: b35ccebe3ef7 ("vdpa/mlx5: Restore the hardware used index after change map")
> Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
> ---
>   drivers/vdpa/mlx5/net/mlx5_vnet.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> index aa6f8cd..444ab58 100644
> --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
> +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> @@ -1785,7 +1785,6 @@ static void mlx5_vdpa_set_status(struct vdpa_device *vdev, u8 status)
>   	if (!status) {
>   		mlx5_vdpa_info(mvdev, "performing device reset\n");
>   		teardown_driver(ndev);
> -		clear_virtqueues(ndev);
>   		mlx5_vdpa_destroy_mr(&ndev->mvdev);
>   		ndev->mvdev.status = 0;
>   		++mvdev->generation;
> @@ -1794,6 +1793,7 @@ static void mlx5_vdpa_set_status(struct vdpa_device *vdev, u8 status)
>   
>   	if ((status ^ ndev->mvdev.status) & VIRTIO_CONFIG_S_DRIVER_OK) {
>   		if (status & VIRTIO_CONFIG_S_DRIVER_OK) {
> +			clear_virtqueues(ndev);


Rethink about this. As mentioned in another thread, this in fact breaks 
set_vq_state().  (See vhost_virtqueue_start() -> 
vhost_vdpa_set_vring_base() in qemu codes).

The issue is that the avail idx is forgot, we need keep it.

Thanks


>   			err = setup_driver(ndev);
>   			if (err) {
>   				mlx5_vdpa_warn(mvdev, "failed to setup driver\n");

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 3/3] mlx5_vdpa: defer clear_virtqueues to until DRIVER_OK
  2021-02-09  3:37     ` Jason Wang
@ 2021-02-10  0:26       ` Si-Wei Liu
  -1 siblings, 0 replies; 27+ messages in thread
From: Si-Wei Liu @ 2021-02-10  0:26 UTC (permalink / raw)
  To: Jason Wang, mst, elic; +Cc: linux-kernel, virtualization, netdev



On 2/8/2021 7:37 PM, Jason Wang wrote:
>
> On 2021/2/6 下午8:29, Si-Wei Liu wrote:
>> While virtq is stopped,  get_vq_state() is supposed to
>> be  called to  get  sync'ed  with  the latest internal
>> avail_index from device. The saved avail_index is used
>> to restate  the virtq  once device is started.  Commit
>> b35ccebe3ef7 introduced the clear_virtqueues() routine
>> to  reset  the saved  avail_index,  however, the index
>> gets cleared a bit earlier before get_vq_state() tries
>> to read it. This would cause consistency problems when
>> virtq is restarted, e.g. through a series of link down
>> and link up events. We  could  defer  the  clearing of
>> avail_index  to  until  the  device  is to be started,
>> i.e. until  VIRTIO_CONFIG_S_DRIVER_OK  is set again in
>> set_status().
>>
>> Fixes: b35ccebe3ef7 ("vdpa/mlx5: Restore the hardware used index 
>> after change map")
>> Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
>> ---
>>   drivers/vdpa/mlx5/net/mlx5_vnet.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c 
>> b/drivers/vdpa/mlx5/net/mlx5_vnet.c
>> index aa6f8cd..444ab58 100644
>> --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
>> +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
>> @@ -1785,7 +1785,6 @@ static void mlx5_vdpa_set_status(struct 
>> vdpa_device *vdev, u8 status)
>>       if (!status) {
>>           mlx5_vdpa_info(mvdev, "performing device reset\n");
>>           teardown_driver(ndev);
>> -        clear_virtqueues(ndev);
>>           mlx5_vdpa_destroy_mr(&ndev->mvdev);
>>           ndev->mvdev.status = 0;
>>           ++mvdev->generation;
>> @@ -1794,6 +1793,7 @@ static void mlx5_vdpa_set_status(struct 
>> vdpa_device *vdev, u8 status)
>>         if ((status ^ ndev->mvdev.status) & VIRTIO_CONFIG_S_DRIVER_OK) {
>>           if (status & VIRTIO_CONFIG_S_DRIVER_OK) {
>> +            clear_virtqueues(ndev);
>
>
> Rethink about this. As mentioned in another thread, this in fact 
> breaks set_vq_state().  (See vhost_virtqueue_start() -> 
> vhost_vdpa_set_vring_base() in qemu codes).
I assume that the clearing for vhost-vdpa would be done via (qemu code),

vhost_dev_start()->vhost_vdpa_dev_start()->vhost_vdpa_call(status | 
VIRTIO_CONFIG_S_DRIVER_OK)

which is _after_ vhost_virtqueue_start() gets called to restore the 
avail_idx to h/w in vhost_dev_start(). What am I missing here?

-Siwei


>
> The issue is that the avail idx is forgot, we need keep it.
>
> Thanks
>
>
>>               err = setup_driver(ndev);
>>               if (err) {
>>                   mlx5_vdpa_warn(mvdev, "failed to setup driver\n");
>


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 3/3] mlx5_vdpa: defer clear_virtqueues to until DRIVER_OK
@ 2021-02-10  0:26       ` Si-Wei Liu
  0 siblings, 0 replies; 27+ messages in thread
From: Si-Wei Liu @ 2021-02-10  0:26 UTC (permalink / raw)
  To: Jason Wang, mst, elic; +Cc: netdev, linux-kernel, virtualization



On 2/8/2021 7:37 PM, Jason Wang wrote:
>
> On 2021/2/6 下午8:29, Si-Wei Liu wrote:
>> While virtq is stopped,  get_vq_state() is supposed to
>> be  called to  get  sync'ed  with  the latest internal
>> avail_index from device. The saved avail_index is used
>> to restate  the virtq  once device is started.  Commit
>> b35ccebe3ef7 introduced the clear_virtqueues() routine
>> to  reset  the saved  avail_index,  however, the index
>> gets cleared a bit earlier before get_vq_state() tries
>> to read it. This would cause consistency problems when
>> virtq is restarted, e.g. through a series of link down
>> and link up events. We  could  defer  the  clearing of
>> avail_index  to  until  the  device  is to be started,
>> i.e. until  VIRTIO_CONFIG_S_DRIVER_OK  is set again in
>> set_status().
>>
>> Fixes: b35ccebe3ef7 ("vdpa/mlx5: Restore the hardware used index 
>> after change map")
>> Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
>> ---
>>   drivers/vdpa/mlx5/net/mlx5_vnet.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c 
>> b/drivers/vdpa/mlx5/net/mlx5_vnet.c
>> index aa6f8cd..444ab58 100644
>> --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
>> +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
>> @@ -1785,7 +1785,6 @@ static void mlx5_vdpa_set_status(struct 
>> vdpa_device *vdev, u8 status)
>>       if (!status) {
>>           mlx5_vdpa_info(mvdev, "performing device reset\n");
>>           teardown_driver(ndev);
>> -        clear_virtqueues(ndev);
>>           mlx5_vdpa_destroy_mr(&ndev->mvdev);
>>           ndev->mvdev.status = 0;
>>           ++mvdev->generation;
>> @@ -1794,6 +1793,7 @@ static void mlx5_vdpa_set_status(struct 
>> vdpa_device *vdev, u8 status)
>>         if ((status ^ ndev->mvdev.status) & VIRTIO_CONFIG_S_DRIVER_OK) {
>>           if (status & VIRTIO_CONFIG_S_DRIVER_OK) {
>> +            clear_virtqueues(ndev);
>
>
> Rethink about this. As mentioned in another thread, this in fact 
> breaks set_vq_state().  (See vhost_virtqueue_start() -> 
> vhost_vdpa_set_vring_base() in qemu codes).
I assume that the clearing for vhost-vdpa would be done via (qemu code),

vhost_dev_start()->vhost_vdpa_dev_start()->vhost_vdpa_call(status | 
VIRTIO_CONFIG_S_DRIVER_OK)

which is _after_ vhost_virtqueue_start() gets called to restore the 
avail_idx to h/w in vhost_dev_start(). What am I missing here?

-Siwei


>
> The issue is that the avail idx is forgot, we need keep it.
>
> Thanks
>
>
>>               err = setup_driver(ndev);
>>               if (err) {
>>                   mlx5_vdpa_warn(mvdev, "failed to setup driver\n");
>

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 3/3] mlx5_vdpa: defer clear_virtqueues to until DRIVER_OK
  2021-02-10  0:26       ` Si-Wei Liu
@ 2021-02-10  4:00         ` Jason Wang
  -1 siblings, 0 replies; 27+ messages in thread
From: Jason Wang @ 2021-02-10  4:00 UTC (permalink / raw)
  To: Si-Wei Liu, mst, elic; +Cc: linux-kernel, virtualization, netdev


On 2021/2/10 上午8:26, Si-Wei Liu wrote:
>
>
> On 2/8/2021 7:37 PM, Jason Wang wrote:
>>
>> On 2021/2/6 下午8:29, Si-Wei Liu wrote:
>>> While virtq is stopped,  get_vq_state() is supposed to
>>> be  called to  get  sync'ed  with  the latest internal
>>> avail_index from device. The saved avail_index is used
>>> to restate  the virtq  once device is started.  Commit
>>> b35ccebe3ef7 introduced the clear_virtqueues() routine
>>> to  reset  the saved  avail_index,  however, the index
>>> gets cleared a bit earlier before get_vq_state() tries
>>> to read it. This would cause consistency problems when
>>> virtq is restarted, e.g. through a series of link down
>>> and link up events. We  could  defer  the  clearing of
>>> avail_index  to  until  the  device  is to be started,
>>> i.e. until  VIRTIO_CONFIG_S_DRIVER_OK  is set again in
>>> set_status().
>>>
>>> Fixes: b35ccebe3ef7 ("vdpa/mlx5: Restore the hardware used index 
>>> after change map")
>>> Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
>>> ---
>>>   drivers/vdpa/mlx5/net/mlx5_vnet.c | 2 +-
>>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c 
>>> b/drivers/vdpa/mlx5/net/mlx5_vnet.c
>>> index aa6f8cd..444ab58 100644
>>> --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
>>> +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
>>> @@ -1785,7 +1785,6 @@ static void mlx5_vdpa_set_status(struct 
>>> vdpa_device *vdev, u8 status)
>>>       if (!status) {
>>>           mlx5_vdpa_info(mvdev, "performing device reset\n");
>>>           teardown_driver(ndev);
>>> -        clear_virtqueues(ndev);
>>>           mlx5_vdpa_destroy_mr(&ndev->mvdev);
>>>           ndev->mvdev.status = 0;
>>>           ++mvdev->generation;
>>> @@ -1794,6 +1793,7 @@ static void mlx5_vdpa_set_status(struct 
>>> vdpa_device *vdev, u8 status)
>>>         if ((status ^ ndev->mvdev.status) & 
>>> VIRTIO_CONFIG_S_DRIVER_OK) {
>>>           if (status & VIRTIO_CONFIG_S_DRIVER_OK) {
>>> +            clear_virtqueues(ndev);
>>
>>
>> Rethink about this. As mentioned in another thread, this in fact 
>> breaks set_vq_state().  (See vhost_virtqueue_start() -> 
>> vhost_vdpa_set_vring_base() in qemu codes).
> I assume that the clearing for vhost-vdpa would be done via (qemu code),
>
> vhost_dev_start()->vhost_vdpa_dev_start()->vhost_vdpa_call(status | 
> VIRTIO_CONFIG_S_DRIVER_OK)
>
> which is _after_ vhost_virtqueue_start() gets called to restore the 
> avail_idx to h/w in vhost_dev_start(). What am I missing here?
>
> -Siwei


I think not. I thought clear_virtqueues() will clear hardware index but 
looks not. (I guess we need a better name other than clear_virtqueues(), 
e.g from the name it looks like the it will clear the hardware states)

Thanks


>
>
>>
>> The issue is that the avail idx is forgot, we need keep it.
>>
>> Thanks
>>
>>
>>>               err = setup_driver(ndev);
>>>               if (err) {
>>>                   mlx5_vdpa_warn(mvdev, "failed to setup driver\n");
>>
>


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 3/3] mlx5_vdpa: defer clear_virtqueues to until DRIVER_OK
@ 2021-02-10  4:00         ` Jason Wang
  0 siblings, 0 replies; 27+ messages in thread
From: Jason Wang @ 2021-02-10  4:00 UTC (permalink / raw)
  To: Si-Wei Liu, mst, elic; +Cc: netdev, linux-kernel, virtualization


On 2021/2/10 上午8:26, Si-Wei Liu wrote:
>
>
> On 2/8/2021 7:37 PM, Jason Wang wrote:
>>
>> On 2021/2/6 下午8:29, Si-Wei Liu wrote:
>>> While virtq is stopped,  get_vq_state() is supposed to
>>> be  called to  get  sync'ed  with  the latest internal
>>> avail_index from device. The saved avail_index is used
>>> to restate  the virtq  once device is started.  Commit
>>> b35ccebe3ef7 introduced the clear_virtqueues() routine
>>> to  reset  the saved  avail_index,  however, the index
>>> gets cleared a bit earlier before get_vq_state() tries
>>> to read it. This would cause consistency problems when
>>> virtq is restarted, e.g. through a series of link down
>>> and link up events. We  could  defer  the  clearing of
>>> avail_index  to  until  the  device  is to be started,
>>> i.e. until  VIRTIO_CONFIG_S_DRIVER_OK  is set again in
>>> set_status().
>>>
>>> Fixes: b35ccebe3ef7 ("vdpa/mlx5: Restore the hardware used index 
>>> after change map")
>>> Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
>>> ---
>>>   drivers/vdpa/mlx5/net/mlx5_vnet.c | 2 +-
>>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c 
>>> b/drivers/vdpa/mlx5/net/mlx5_vnet.c
>>> index aa6f8cd..444ab58 100644
>>> --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
>>> +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
>>> @@ -1785,7 +1785,6 @@ static void mlx5_vdpa_set_status(struct 
>>> vdpa_device *vdev, u8 status)
>>>       if (!status) {
>>>           mlx5_vdpa_info(mvdev, "performing device reset\n");
>>>           teardown_driver(ndev);
>>> -        clear_virtqueues(ndev);
>>>           mlx5_vdpa_destroy_mr(&ndev->mvdev);
>>>           ndev->mvdev.status = 0;
>>>           ++mvdev->generation;
>>> @@ -1794,6 +1793,7 @@ static void mlx5_vdpa_set_status(struct 
>>> vdpa_device *vdev, u8 status)
>>>         if ((status ^ ndev->mvdev.status) & 
>>> VIRTIO_CONFIG_S_DRIVER_OK) {
>>>           if (status & VIRTIO_CONFIG_S_DRIVER_OK) {
>>> +            clear_virtqueues(ndev);
>>
>>
>> Rethink about this. As mentioned in another thread, this in fact 
>> breaks set_vq_state().  (See vhost_virtqueue_start() -> 
>> vhost_vdpa_set_vring_base() in qemu codes).
> I assume that the clearing for vhost-vdpa would be done via (qemu code),
>
> vhost_dev_start()->vhost_vdpa_dev_start()->vhost_vdpa_call(status | 
> VIRTIO_CONFIG_S_DRIVER_OK)
>
> which is _after_ vhost_virtqueue_start() gets called to restore the 
> avail_idx to h/w in vhost_dev_start(). What am I missing here?
>
> -Siwei


I think not. I thought clear_virtqueues() will clear hardware index but 
looks not. (I guess we need a better name other than clear_virtqueues(), 
e.g from the name it looks like the it will clear the hardware states)

Thanks


>
>
>>
>> The issue is that the avail idx is forgot, we need keep it.
>>
>> Thanks
>>
>>
>>>               err = setup_driver(ndev);
>>>               if (err) {
>>>                   mlx5_vdpa_warn(mvdev, "failed to setup driver\n");
>>
>

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 2/3] mlx5_vdpa: fix feature negotiation across device reset
  2021-02-09  1:20       ` Si-Wei Liu
@ 2021-02-10 12:28         ` Michael S. Tsirkin
  -1 siblings, 0 replies; 27+ messages in thread
From: Michael S. Tsirkin @ 2021-02-10 12:28 UTC (permalink / raw)
  To: Si-Wei Liu; +Cc: Eli Cohen, jasowang, linux-kernel, virtualization, netdev

On Mon, Feb 08, 2021 at 05:20:11PM -0800, Si-Wei Liu wrote:
> 
> 
> On 2/7/2021 9:35 PM, Eli Cohen wrote:
> > On Sat, Feb 06, 2021 at 04:29:23AM -0800, Si-Wei Liu wrote:
> > > The mlx_features denotes the capability for which
> > > set of virtio features is supported by device. In
> > > principle, this field needs not be cleared during
> > > virtio device reset, as this capability is static
> > > and does not change across reset.
> > > 
> > > In fact, the current code may have the assumption
> > > that mlx_features can be reloaded from firmware
> > > via the .get_features ops after device is reset
> > > (via the .set_status ops), which is unfortunately
> > > not true. The userspace VMM might save a copy
> > > of backend capable features and won't call into
> > > kernel again to get it on reset. This causes all
> > > virtio features getting disabled on newly created
> > > virtqs after device reset, while guest would hold
> > > mismatched view of available features. For e.g.,
> > > the guest may still assume tx checksum offload
> > > is available after reset and feature negotiation,
> > > causing frames with bogus (incomplete) checksum
> > > transmitted on the wire.
> > > 
> > > Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
> > > ---
> > >   drivers/vdpa/mlx5/net/mlx5_vnet.c | 1 -
> > >   1 file changed, 1 deletion(-)
> > > 
> > > diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > > index b8416c4..aa6f8cd 100644
> > > --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > > +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > > @@ -1788,7 +1788,6 @@ static void mlx5_vdpa_set_status(struct vdpa_device *vdev, u8 status)
> > >   		clear_virtqueues(ndev);
> > >   		mlx5_vdpa_destroy_mr(&ndev->mvdev);
> > >   		ndev->mvdev.status = 0;
> > > -		ndev->mvdev.mlx_features = 0;
> > >   		++mvdev->generation;
> > >   		return;
> > >   	}
> > Since we assume that device capabilities don't change, I think I would
> > get the features through a call done in mlx5v_probe after the netdev
> > object is created and change mlx5_vdpa_get_features() to just return
> > ndev->mvdev.mlx_features.
> Yep, it makes sense. Will post a revised patch.

So I'm waiting for v2 of this patchset. Please make sure to post a cover letter
with an overall description.

> If vdpa tool allows
> reconfiguration post probing, the code has to be reconciled then.
> 
> > 
> > Did you actually see this issue in action? If you did, can you share
> > with us how you trigerred this?
> Issue is indeed seen in action. The mismatched tx-checksum offload as
> described in the commit message was one of such examples. You would need a
> guest reboot though (triggering device reset via the .set_status ops and
> zero'ed mlx_features was loaded to deduce new actual_features for creating
> the h/w virtq object) for repro.
> 
> -Siwei
> > 
> > > -- 
> > > 1.8.3.1
> > > 


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 2/3] mlx5_vdpa: fix feature negotiation across device reset
@ 2021-02-10 12:28         ` Michael S. Tsirkin
  0 siblings, 0 replies; 27+ messages in thread
From: Michael S. Tsirkin @ 2021-02-10 12:28 UTC (permalink / raw)
  To: Si-Wei Liu; +Cc: netdev, Eli Cohen, virtualization, linux-kernel

On Mon, Feb 08, 2021 at 05:20:11PM -0800, Si-Wei Liu wrote:
> 
> 
> On 2/7/2021 9:35 PM, Eli Cohen wrote:
> > On Sat, Feb 06, 2021 at 04:29:23AM -0800, Si-Wei Liu wrote:
> > > The mlx_features denotes the capability for which
> > > set of virtio features is supported by device. In
> > > principle, this field needs not be cleared during
> > > virtio device reset, as this capability is static
> > > and does not change across reset.
> > > 
> > > In fact, the current code may have the assumption
> > > that mlx_features can be reloaded from firmware
> > > via the .get_features ops after device is reset
> > > (via the .set_status ops), which is unfortunately
> > > not true. The userspace VMM might save a copy
> > > of backend capable features and won't call into
> > > kernel again to get it on reset. This causes all
> > > virtio features getting disabled on newly created
> > > virtqs after device reset, while guest would hold
> > > mismatched view of available features. For e.g.,
> > > the guest may still assume tx checksum offload
> > > is available after reset and feature negotiation,
> > > causing frames with bogus (incomplete) checksum
> > > transmitted on the wire.
> > > 
> > > Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
> > > ---
> > >   drivers/vdpa/mlx5/net/mlx5_vnet.c | 1 -
> > >   1 file changed, 1 deletion(-)
> > > 
> > > diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > > index b8416c4..aa6f8cd 100644
> > > --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > > +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > > @@ -1788,7 +1788,6 @@ static void mlx5_vdpa_set_status(struct vdpa_device *vdev, u8 status)
> > >   		clear_virtqueues(ndev);
> > >   		mlx5_vdpa_destroy_mr(&ndev->mvdev);
> > >   		ndev->mvdev.status = 0;
> > > -		ndev->mvdev.mlx_features = 0;
> > >   		++mvdev->generation;
> > >   		return;
> > >   	}
> > Since we assume that device capabilities don't change, I think I would
> > get the features through a call done in mlx5v_probe after the netdev
> > object is created and change mlx5_vdpa_get_features() to just return
> > ndev->mvdev.mlx_features.
> Yep, it makes sense. Will post a revised patch.

So I'm waiting for v2 of this patchset. Please make sure to post a cover letter
with an overall description.

> If vdpa tool allows
> reconfiguration post probing, the code has to be reconciled then.
> 
> > 
> > Did you actually see this issue in action? If you did, can you share
> > with us how you trigerred this?
> Issue is indeed seen in action. The mismatched tx-checksum offload as
> described in the commit message was one of such examples. You would need a
> guest reboot though (triggering device reset via the .set_status ops and
> zero'ed mlx_features was loaded to deduce new actual_features for creating
> the h/w virtq object) for repro.
> 
> -Siwei
> > 
> > > -- 
> > > 1.8.3.1
> > > 

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2021-02-10 12:33 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-02-06 12:29 [PATCH 1/3] mlx5_vdpa: should exclude header length and fcs from mtu Si-Wei Liu
2021-02-06 12:29 ` Si-Wei Liu
2021-02-06 12:29 ` [PATCH 2/3] mlx5_vdpa: fix feature negotiation across device reset Si-Wei Liu
2021-02-06 12:29   ` Si-Wei Liu
2021-02-08  4:37   ` Jason Wang
2021-02-08  4:37     ` Jason Wang
2021-02-08  5:35   ` Eli Cohen
2021-02-09  1:20     ` Si-Wei Liu
2021-02-09  1:20       ` Si-Wei Liu
2021-02-10 12:28       ` Michael S. Tsirkin
2021-02-10 12:28         ` Michael S. Tsirkin
2021-02-06 12:29 ` [PATCH 3/3] mlx5_vdpa: defer clear_virtqueues to until DRIVER_OK Si-Wei Liu
2021-02-06 12:29   ` Si-Wei Liu
2021-02-08  4:38   ` Jason Wang
2021-02-08  4:38     ` Jason Wang
2021-02-08  5:48   ` Eli Cohen
2021-02-09  1:40     ` Si-Wei Liu
2021-02-09  1:40       ` Si-Wei Liu
2021-02-09  3:37   ` Jason Wang
2021-02-09  3:37     ` Jason Wang
2021-02-10  0:26     ` Si-Wei Liu
2021-02-10  0:26       ` Si-Wei Liu
2021-02-10  4:00       ` Jason Wang
2021-02-10  4:00         ` Jason Wang
2021-02-08  4:38 ` [PATCH 1/3] mlx5_vdpa: should exclude header length and fcs from mtu Jason Wang
2021-02-08  4:38   ` Jason Wang
2021-02-08  5:35 ` Eli Cohen

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.