netdev.vger.kernel.org archive mirror
* [PATCH V3 0/6] ifcvf/vDPA: support query device config space through netlink
@ 2022-07-01 13:28 Zhu Lingshan
  2022-07-01 13:28 ` [PATCH V3 1/6] vDPA/ifcvf: get_config_size should return a value no greater than dev implementation Zhu Lingshan
                   ` (5 more replies)
  0 siblings, 6 replies; 113+ messages in thread
From: Zhu Lingshan @ 2022-07-01 13:28 UTC (permalink / raw)
  To: jasowang, mst
  Cc: virtualization, netdev, parav, xieyongji, gautam.dawar, Zhu Lingshan

This series allows userspace to query the device config space of vDPA
devices and their management devices through netlink, for example to
read the multi-queue capability and the feature bits.

This series introduces a new netlink attribute,
VDPA_ATTR_VDPA_DEV_SUPPORTED_FEATURES, which should be used to query the
features of a vDPA device rather than those of the management device.

Please help review.

Thanks!
Zhu Lingshan

Changes from V2:
Add Fixes tags (Parav)

Changes from V1:
(1) Use __virtio16_to_cpu(true, xxx) for the le16 casting (Jason)
(2) Add a comment in ifcvf_get_config_size() to explain
why we should return the minimum of
sizeof(struct virtio_net_config) and the onboard
cap size (Jason)
(3) Introduce a new attr VDPA_ATTR_VDPA_DEV_SUPPORTED_FEATURES
(4) Show the iproute2 output before and after patch 5/6 (Jason)
(5) Fix cast warning in vdpa_fill_stats_rec()

Zhu Lingshan (6):
  vDPA/ifcvf: get_config_size should return a value no greater than dev
    implementation
  vDPA/ifcvf: support userspace to query features and MQ of a management
    device
  vDPA: allow userspace to query features of a vDPA device
  vDPA: !FEATURES_OK should not block querying device config space
  vDPA: answer num of queue pairs = 1 to userspace when VIRTIO_NET_F_MQ
    == 0
  vDPA: fix 'cast to restricted le16' warnings in vdpa.c

 drivers/vdpa/ifcvf/ifcvf_base.c | 25 +++++++++++++++++++++++--
 drivers/vdpa/ifcvf/ifcvf_base.h |  3 +++
 drivers/vdpa/ifcvf/ifcvf_main.c |  3 +++
 drivers/vdpa/vdpa.c             | 32 +++++++++++++++-----------------
 include/uapi/linux/vdpa.h       |  1 +
 5 files changed, 45 insertions(+), 19 deletions(-)

-- 
2.31.1



* [PATCH V3 1/6] vDPA/ifcvf: get_config_size should return a value no greater than dev implementation
  2022-07-01 13:28 [PATCH V3 0/6] ifcvf/vDPA: support query device config space through netlink Zhu Lingshan
@ 2022-07-01 13:28 ` Zhu Lingshan
  2022-07-04  4:39   ` Jason Wang
  2022-07-13  5:31   ` Michael S. Tsirkin
  2022-07-01 13:28 ` [PATCH V3 2/6] vDPA/ifcvf: support userspace to query features and MQ of a management device Zhu Lingshan
                   ` (4 subsequent siblings)
  5 siblings, 2 replies; 113+ messages in thread
From: Zhu Lingshan @ 2022-07-01 13:28 UTC (permalink / raw)
  To: jasowang, mst
  Cc: virtualization, netdev, parav, xieyongji, gautam.dawar, Zhu Lingshan

ifcvf_get_config_size() should return a value specific to the virtio
device type, but that value should not be greater than the config space
size the device actually implements on board. E.g., for virtio_net,
config_size should be the minimum of sizeof(struct virtio_net_config)
and the onboard cap size.

Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
---
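Illustration only, not part of the patch: a minimal userspace sketch of
the clamping rule described above. The helper name clamp_config_size()
and its parameters are hypothetical; the driver itself open-codes the
comparison in ifcvf_get_config_size().

```c
#include <stdint.h>

/*
 * Hypothetical helper: return the smaller of the spec-defined config
 * struct size and the length advertised by the device's
 * VIRTIO_PCI_CAP_DEVICE_CFG capability.
 */
uint32_t clamp_config_size(uint32_t spec_size, uint32_t cap_size)
{
	return cap_size < spec_size ? cap_size : spec_size;
}
```

With this rule, a device whose capability is shorter than the spec
struct reports only what it implements, and a device with a longer
capability is capped at the spec struct size.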
 drivers/vdpa/ifcvf/ifcvf_base.c | 13 +++++++++++--
 drivers/vdpa/ifcvf/ifcvf_base.h |  2 ++
 2 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/drivers/vdpa/ifcvf/ifcvf_base.c b/drivers/vdpa/ifcvf/ifcvf_base.c
index 48c4dadb0c7c..fb957b57941e 100644
--- a/drivers/vdpa/ifcvf/ifcvf_base.c
+++ b/drivers/vdpa/ifcvf/ifcvf_base.c
@@ -128,6 +128,7 @@ int ifcvf_init_hw(struct ifcvf_hw *hw, struct pci_dev *pdev)
 			break;
 		case VIRTIO_PCI_CAP_DEVICE_CFG:
 			hw->dev_cfg = get_cap_addr(hw, &cap);
+			hw->cap_dev_config_size = le32_to_cpu(cap.length);
 			IFCVF_DBG(pdev, "hw->dev_cfg = %p\n", hw->dev_cfg);
 			break;
 		}
@@ -233,15 +234,23 @@ int ifcvf_verify_min_features(struct ifcvf_hw *hw, u64 features)
 u32 ifcvf_get_config_size(struct ifcvf_hw *hw)
 {
 	struct ifcvf_adapter *adapter;
+	u32 net_config_size = sizeof(struct virtio_net_config);
+	u32 blk_config_size = sizeof(struct virtio_blk_config);
+	u32 cap_size = hw->cap_dev_config_size;
 	u32 config_size;
 
 	adapter = vf_to_adapter(hw);
+	/* If the onboard device config space size is greater than
+	 * the size of struct virtio_net/blk_config, only the spec
+	 * implementing contents size is returned, this is very
+	 * unlikely, defensive programming.
+	 */
 	switch (hw->dev_type) {
 	case VIRTIO_ID_NET:
-		config_size = sizeof(struct virtio_net_config);
+		config_size = cap_size >= net_config_size ? net_config_size : cap_size;
 		break;
 	case VIRTIO_ID_BLOCK:
-		config_size = sizeof(struct virtio_blk_config);
+		config_size = cap_size >= blk_config_size ? blk_config_size : cap_size;
 		break;
 	default:
 		config_size = 0;
diff --git a/drivers/vdpa/ifcvf/ifcvf_base.h b/drivers/vdpa/ifcvf/ifcvf_base.h
index 115b61f4924b..f5563f665cc6 100644
--- a/drivers/vdpa/ifcvf/ifcvf_base.h
+++ b/drivers/vdpa/ifcvf/ifcvf_base.h
@@ -87,6 +87,8 @@ struct ifcvf_hw {
 	int config_irq;
 	int vqs_reused_irq;
 	u16 nr_vring;
+	/* VIRTIO_PCI_CAP_DEVICE_CFG size */
+	u32 cap_dev_config_size;
 };
 
 struct ifcvf_adapter {
-- 
2.31.1



* [PATCH V3 2/6] vDPA/ifcvf: support userspace to query features and MQ of a management device
  2022-07-01 13:28 [PATCH V3 0/6] ifcvf/vDPA: support query device config space through netlink Zhu Lingshan
  2022-07-01 13:28 ` [PATCH V3 1/6] vDPA/ifcvf: get_config_size should return a value no greater than dev implementation Zhu Lingshan
@ 2022-07-01 13:28 ` Zhu Lingshan
  2022-07-04  4:43   ` Jason Wang
  2022-07-01 13:28 ` [PATCH V3 3/6] vDPA: allow userspace to query features of a vDPA device Zhu Lingshan
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 113+ messages in thread
From: Zhu Lingshan @ 2022-07-01 13:28 UTC (permalink / raw)
  To: jasowang, mst
  Cc: virtualization, netdev, parav, xieyongji, gautam.dawar, Zhu Lingshan

Adapting to current netlink interfaces, this commit allows userspace
to query feature bits and MQ capability of a management device.

Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
---
 drivers/vdpa/ifcvf/ifcvf_base.c | 12 ++++++++++++
 drivers/vdpa/ifcvf/ifcvf_base.h |  1 +
 drivers/vdpa/ifcvf/ifcvf_main.c |  3 +++
 3 files changed, 16 insertions(+)

diff --git a/drivers/vdpa/ifcvf/ifcvf_base.c b/drivers/vdpa/ifcvf/ifcvf_base.c
index fb957b57941e..7c5f1cc93ad9 100644
--- a/drivers/vdpa/ifcvf/ifcvf_base.c
+++ b/drivers/vdpa/ifcvf/ifcvf_base.c
@@ -346,6 +346,18 @@ int ifcvf_set_vq_state(struct ifcvf_hw *hw, u16 qid, u16 num)
 	return 0;
 }
 
+u16 ifcvf_get_max_vq_pairs(struct ifcvf_hw *hw)
+{
+	struct virtio_net_config __iomem *config;
+	u16 val, mq;
+
+	config = hw->dev_cfg;
+	val = vp_ioread16((__le16 __iomem *)&config->max_virtqueue_pairs);
+	mq = le16_to_cpu((__force __le16)val);
+
+	return mq;
+}
+
 static int ifcvf_hw_enable(struct ifcvf_hw *hw)
 {
 	struct virtio_pci_common_cfg __iomem *cfg;
diff --git a/drivers/vdpa/ifcvf/ifcvf_base.h b/drivers/vdpa/ifcvf/ifcvf_base.h
index f5563f665cc6..d54a1bed212e 100644
--- a/drivers/vdpa/ifcvf/ifcvf_base.h
+++ b/drivers/vdpa/ifcvf/ifcvf_base.h
@@ -130,6 +130,7 @@ u64 ifcvf_get_hw_features(struct ifcvf_hw *hw);
 int ifcvf_verify_min_features(struct ifcvf_hw *hw, u64 features);
 u16 ifcvf_get_vq_state(struct ifcvf_hw *hw, u16 qid);
 int ifcvf_set_vq_state(struct ifcvf_hw *hw, u16 qid, u16 num);
+u16 ifcvf_get_max_vq_pairs(struct ifcvf_hw *hw);
 struct ifcvf_adapter *vf_to_adapter(struct ifcvf_hw *hw);
 int ifcvf_probed_virtio_net(struct ifcvf_hw *hw);
 u32 ifcvf_get_config_size(struct ifcvf_hw *hw);
diff --git a/drivers/vdpa/ifcvf/ifcvf_main.c b/drivers/vdpa/ifcvf/ifcvf_main.c
index 0a5670729412..3ff7096d30f1 100644
--- a/drivers/vdpa/ifcvf/ifcvf_main.c
+++ b/drivers/vdpa/ifcvf/ifcvf_main.c
@@ -791,6 +791,9 @@ static int ifcvf_vdpa_dev_add(struct vdpa_mgmt_dev *mdev, const char *name,
 	vf->hw_features = ifcvf_get_hw_features(vf);
 	vf->config_size = ifcvf_get_config_size(vf);
 
+	ifcvf_mgmt_dev->mdev.max_supported_vqs = ifcvf_get_max_vq_pairs(vf);
+	ifcvf_mgmt_dev->mdev.supported_features = vf->hw_features;
+
 	adapter->vdpa.mdev = &ifcvf_mgmt_dev->mdev;
 	ret = _vdpa_register_device(&adapter->vdpa, vf->nr_vring);
 	if (ret) {
-- 
2.31.1



* [PATCH V3 3/6] vDPA: allow userspace to query features of a vDPA device
  2022-07-01 13:28 [PATCH V3 0/6] ifcvf/vDPA: support query device config space through netlink Zhu Lingshan
  2022-07-01 13:28 ` [PATCH V3 1/6] vDPA/ifcvf: get_config_size should return a value no greater than dev implementation Zhu Lingshan
  2022-07-01 13:28 ` [PATCH V3 2/6] vDPA/ifcvf: support userspace to query features and MQ of a management device Zhu Lingshan
@ 2022-07-01 13:28 ` Zhu Lingshan
  2022-07-01 22:02   ` Parav Pandit
  2022-07-01 13:28 ` [PATCH V3 4/6] vDPA: !FEATURES_OK should not block querying device config space Zhu Lingshan
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 113+ messages in thread
From: Zhu Lingshan @ 2022-07-01 13:28 UTC (permalink / raw)
  To: jasowang, mst
  Cc: virtualization, netdev, parav, xieyongji, gautam.dawar, Zhu Lingshan

This commit adds a new vDPA netlink attribute,
VDPA_ATTR_VDPA_DEV_SUPPORTED_FEATURES. Userspace can query the
features of vDPA devices through this new attr.

Fixes: a64917bc2e9b vdpa: (Provide interface to read driver feature)
Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
---
 drivers/vdpa/vdpa.c       | 13 +++++++++----
 include/uapi/linux/vdpa.h |  1 +
 2 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c
index ebf2f363fbe7..9b0e39b2f022 100644
--- a/drivers/vdpa/vdpa.c
+++ b/drivers/vdpa/vdpa.c
@@ -815,7 +815,7 @@ static int vdpa_dev_net_mq_config_fill(struct vdpa_device *vdev,
 static int vdpa_dev_net_config_fill(struct vdpa_device *vdev, struct sk_buff *msg)
 {
 	struct virtio_net_config config = {};
-	u64 features;
+	u64 features_device, features_driver;
 	u16 val_u16;
 
 	vdpa_get_config_unlocked(vdev, 0, &config, sizeof(config));
@@ -832,12 +832,17 @@ static int vdpa_dev_net_config_fill(struct vdpa_device *vdev, struct sk_buff *ms
 	if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MTU, val_u16))
 		return -EMSGSIZE;
 
-	features = vdev->config->get_driver_features(vdev);
-	if (nla_put_u64_64bit(msg, VDPA_ATTR_DEV_NEGOTIATED_FEATURES, features,
+	features_driver = vdev->config->get_driver_features(vdev);
+	if (nla_put_u64_64bit(msg, VDPA_ATTR_DEV_NEGOTIATED_FEATURES, features_driver,
+			      VDPA_ATTR_PAD))
+		return -EMSGSIZE;
+
+	features_device = vdev->config->get_device_features(vdev);
+	if (nla_put_u64_64bit(msg, VDPA_ATTR_VDPA_DEV_SUPPORTED_FEATURES, features_device,
 			      VDPA_ATTR_PAD))
 		return -EMSGSIZE;
 
-	return vdpa_dev_net_mq_config_fill(vdev, msg, features, &config);
+	return vdpa_dev_net_mq_config_fill(vdev, msg, features_driver, &config);
 }
 
 static int
diff --git a/include/uapi/linux/vdpa.h b/include/uapi/linux/vdpa.h
index 25c55cab3d7c..39f1c3d7c112 100644
--- a/include/uapi/linux/vdpa.h
+++ b/include/uapi/linux/vdpa.h
@@ -47,6 +47,7 @@ enum vdpa_attr {
 	VDPA_ATTR_DEV_NEGOTIATED_FEATURES,	/* u64 */
 	VDPA_ATTR_DEV_MGMTDEV_MAX_VQS,		/* u32 */
 	VDPA_ATTR_DEV_SUPPORTED_FEATURES,	/* u64 */
+	VDPA_ATTR_VDPA_DEV_SUPPORTED_FEATURES,	/* u64 */
 
 	VDPA_ATTR_DEV_QUEUE_INDEX,              /* u32 */
 	VDPA_ATTR_DEV_VENDOR_ATTR_NAME,		/* string */
-- 
2.31.1



* [PATCH V3 4/6] vDPA: !FEATURES_OK should not block querying device config space
  2022-07-01 13:28 [PATCH V3 0/6] ifcvf/vDPA: support query device config space through netlink Zhu Lingshan
                   ` (2 preceding siblings ...)
  2022-07-01 13:28 ` [PATCH V3 3/6] vDPA: allow userspace to query features of a vDPA device Zhu Lingshan
@ 2022-07-01 13:28 ` Zhu Lingshan
  2022-07-01 22:12   ` Parav Pandit
  2022-07-01 13:28 ` [PATCH V3 5/6] vDPA: answer num of queue pairs = 1 to userspace when VIRTIO_NET_F_MQ == 0 Zhu Lingshan
  2022-07-01 13:28 ` [PATCH V3 6/6] vDPA: fix 'cast to restricted le16' warnings in vdpa.c Zhu Lingshan
  5 siblings, 1 reply; 113+ messages in thread
From: Zhu Lingshan @ 2022-07-01 13:28 UTC (permalink / raw)
  To: jasowang, mst
  Cc: virtualization, netdev, parav, xieyongji, gautam.dawar, Zhu Lingshan

Users may want to query the config space of a vDPA device
to choose an appropriate one for a certain guest. This means
users need to read the config space before FEATURES_OK is set, and
the existence of config space contents does not depend on
FEATURES_OK.

The spec says:
The device MUST allow reading of any device-specific configuration
field before FEATURES_OK is set by the driver. This includes
fields which are conditional on feature bits, as long as those
feature bits are offered by the device.

Fixes: 30ef7a8ac8a07 (vdpa: Read device configuration only if FEATURES_OK)
Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
---
 drivers/vdpa/vdpa.c | 8 --------
 1 file changed, 8 deletions(-)

diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c
index 9b0e39b2f022..d76b22b2f7ae 100644
--- a/drivers/vdpa/vdpa.c
+++ b/drivers/vdpa/vdpa.c
@@ -851,17 +851,9 @@ vdpa_dev_config_fill(struct vdpa_device *vdev, struct sk_buff *msg, u32 portid,
 {
 	u32 device_id;
 	void *hdr;
-	u8 status;
 	int err;
 
 	down_read(&vdev->cf_lock);
-	status = vdev->config->get_status(vdev);
-	if (!(status & VIRTIO_CONFIG_S_FEATURES_OK)) {
-		NL_SET_ERR_MSG_MOD(extack, "Features negotiation not completed");
-		err = -EAGAIN;
-		goto out;
-	}
-
 	hdr = genlmsg_put(msg, portid, seq, &vdpa_nl_family, flags,
 			  VDPA_CMD_DEV_CONFIG_GET);
 	if (!hdr) {
-- 
2.31.1



* [PATCH V3 5/6] vDPA: answer num of queue pairs = 1 to userspace when VIRTIO_NET_F_MQ == 0
  2022-07-01 13:28 [PATCH V3 0/6] ifcvf/vDPA: support query device config space through netlink Zhu Lingshan
                   ` (3 preceding siblings ...)
  2022-07-01 13:28 ` [PATCH V3 4/6] vDPA: !FEATURES_OK should not block querying device config space Zhu Lingshan
@ 2022-07-01 13:28 ` Zhu Lingshan
  2022-07-01 22:07   ` Parav Pandit
  2022-07-01 13:28 ` [PATCH V3 6/6] vDPA: fix 'cast to restricted le16' warnings in vdpa.c Zhu Lingshan
  5 siblings, 1 reply; 113+ messages in thread
From: Zhu Lingshan @ 2022-07-01 13:28 UTC (permalink / raw)
  To: jasowang, mst
  Cc: virtualization, netdev, parav, xieyongji, gautam.dawar, Zhu Lingshan

If VIRTIO_NET_F_MQ == 0, the virtio device should have one queue pair,
so when userspace queries the number of queue pairs, the kernel should
report 1 rather than 0.

vdpa_dev_net_config_fill() fills in the attributes of the vDPA device
itself, so the feature set it passes to vdpa_dev_net_mq_config_fill()
should be features_device rather than features_driver.

Before this change, when MQ = 0, iproute2 output:
$vdpa dev config show vdpa0
vdpa0: mac 00:e8:ca:11:be:05 link up link_announce false
max_vq_pairs 0 mtu 1500

After applying this commit, when MQ = 0, iproute2 output:
$vdpa dev config show vdpa0
vdpa0: mac 00:e8:ca:11:be:05 link up link_announce false
max_vq_pairs 1 mtu 1500

Fixes: a64917bc2e9b (vdpa: Provide interface to read driver features)
Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
---
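Illustration only, not part of the patch: a minimal userspace sketch of
the reporting rule described above. reported_max_vq_pairs() and its
raw-byte parameter are hypothetical; the kernel code reads the field
from struct virtio_net_config and converts it with __virtio16_to_cpu().

```c
#include <stdint.h>

#define VIRTIO_NET_F_MQ 22 /* feature bit number from the virtio-net spec */

/*
 * Hypothetical model of the patched logic. max_vqp_le holds the two raw
 * little-endian bytes of the max_virtqueue_pairs config field.
 */
uint16_t reported_max_vq_pairs(uint64_t device_features,
			       const uint8_t max_vqp_le[2])
{
	/* MQ not offered by the device: exactly one queue pair. */
	if ((device_features & (1ULL << VIRTIO_NET_F_MQ)) == 0)
		return 1;

	/* MQ offered: decode the little-endian config field. */
	return (uint16_t)(max_vqp_le[0] | (max_vqp_le[1] << 8));
}
```

Note that the value in the config field is ignored entirely when the MQ
bit is clear, which matches the "max_vq_pairs 1" output shown above.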
 drivers/vdpa/vdpa.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c
index d76b22b2f7ae..846dd37f3549 100644
--- a/drivers/vdpa/vdpa.c
+++ b/drivers/vdpa/vdpa.c
@@ -806,9 +806,10 @@ static int vdpa_dev_net_mq_config_fill(struct vdpa_device *vdev,
 	u16 val_u16;
 
 	if ((features & BIT_ULL(VIRTIO_NET_F_MQ)) == 0)
-		return 0;
+		val_u16 = 1;
+	else
+		val_u16 = __virtio16_to_cpu(true, config->max_virtqueue_pairs);
 
-	val_u16 = le16_to_cpu(config->max_virtqueue_pairs);
 	return nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MAX_VQP, val_u16);
 }
 
@@ -842,7 +843,7 @@ static int vdpa_dev_net_config_fill(struct vdpa_device *vdev, struct sk_buff *ms
 			      VDPA_ATTR_PAD))
 		return -EMSGSIZE;
 
-	return vdpa_dev_net_mq_config_fill(vdev, msg, features_driver, &config);
+	return vdpa_dev_net_mq_config_fill(vdev, msg, features_device, &config);
 }
 
 static int
-- 
2.31.1



* [PATCH V3 6/6] vDPA: fix 'cast to restricted le16' warnings in vdpa.c
  2022-07-01 13:28 [PATCH V3 0/6] ifcvf/vDPA: support query device config space through netlink Zhu Lingshan
                   ` (4 preceding siblings ...)
  2022-07-01 13:28 ` [PATCH V3 5/6] vDPA: answer num of queue pairs = 1 to userspace when VIRTIO_NET_F_MQ == 0 Zhu Lingshan
@ 2022-07-01 13:28 ` Zhu Lingshan
  2022-07-01 22:18   ` Parav Pandit
  2022-07-29  8:53   ` Michael S. Tsirkin
  5 siblings, 2 replies; 113+ messages in thread
From: Zhu Lingshan @ 2022-07-01 13:28 UTC (permalink / raw)
  To: jasowang, mst
  Cc: virtualization, netdev, parav, xieyongji, gautam.dawar, Zhu Lingshan

This commit fixes sparse warnings: cast to restricted __le16
in functions vdpa_dev_net_config_fill() and
vdpa_fill_stats_rec().

Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
---
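Illustration only, not part of the patch. As far as I understand, the
16-bit fields of struct virtio_net_config are declared as __virtio16,
which sparse treats as a distinct type from __le16, so passing them
directly to le16_to_cpu() triggers the warning. __virtio16_to_cpu(true,
x) interprets x as little-endian. The function below is a hypothetical
userspace model of that behaviour working on raw bytes, not the kernel
helper itself.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the kernel helper: decode a 16-bit field
 * from its raw bytes, as little-endian when little_endian is true and
 * as big-endian otherwise.
 */
uint16_t model_virtio16_to_cpu(bool little_endian, const uint8_t raw[2])
{
	if (little_endian)
		return (uint16_t)(raw[0] | (raw[1] << 8));

	return (uint16_t)((raw[0] << 8) | raw[1]);
}
```

Decoding from bytes keeps the sketch independent of the host byte
order, which is the property the typed kernel helpers are meant to
guarantee.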
 drivers/vdpa/vdpa.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c
index 846dd37f3549..ed49fe46a79e 100644
--- a/drivers/vdpa/vdpa.c
+++ b/drivers/vdpa/vdpa.c
@@ -825,11 +825,11 @@ static int vdpa_dev_net_config_fill(struct vdpa_device *vdev, struct sk_buff *ms
 		    config.mac))
 		return -EMSGSIZE;
 
-	val_u16 = le16_to_cpu(config.status);
+	val_u16 = __virtio16_to_cpu(true, config.status);
 	if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_STATUS, val_u16))
 		return -EMSGSIZE;
 
-	val_u16 = le16_to_cpu(config.mtu);
+	val_u16 = __virtio16_to_cpu(true, config.mtu);
 	if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MTU, val_u16))
 		return -EMSGSIZE;
 
@@ -911,7 +911,7 @@ static int vdpa_fill_stats_rec(struct vdpa_device *vdev, struct sk_buff *msg,
 	}
 	vdpa_get_config_unlocked(vdev, 0, &config, sizeof(config));
 
-	max_vqp = le16_to_cpu(config.max_virtqueue_pairs);
+	max_vqp = __virtio16_to_cpu(true, config.max_virtqueue_pairs);
 	if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MAX_VQP, max_vqp))
 		return -EMSGSIZE;
 
-- 
2.31.1



* RE: [PATCH V3 3/6] vDPA: allow userspace to query features of a vDPA device
  2022-07-01 13:28 ` [PATCH V3 3/6] vDPA: allow userspace to query features of a vDPA device Zhu Lingshan
@ 2022-07-01 22:02   ` Parav Pandit
  2022-07-04  4:46     ` Jason Wang
  2022-07-08  6:16     ` Zhu, Lingshan
  0 siblings, 2 replies; 113+ messages in thread
From: Parav Pandit @ 2022-07-01 22:02 UTC (permalink / raw)
  To: Zhu Lingshan, jasowang, mst
  Cc: virtualization, netdev, xieyongji, gautam.dawar



> From: Zhu Lingshan <lingshan.zhu@intel.com>
> Sent: Friday, July 1, 2022 9:28 AM
> 
> This commit adds a new vDPA netlink attribution
> VDPA_ATTR_VDPA_DEV_SUPPORTED_FEATURES. Userspace can query
> features of vDPA devices through this new attr.
> 
> Fixes: a64917bc2e9b vdpa: (Provide interface to read driver feature)
The double quotes around the subject are missing in this line.
I reviewed the patches again.

However, this is not a fix.
A fix cannot add a new UAPI.

The code already uses the negotiated driver features when returning the device config space.
Hence it is fine.

This patch intends to provide the device features to user space.
First, what the vdpa devices are capable of is already returned by the features attribute on the management device.
This was done in commit [1].

The only reason to have it is the case where the management device indicates that a feature is supported, but a vdpa device ends up not supporting it because the feature is shared with other devices on the same physical device.
For example, the VFs may not all be symmetric once a large number of them are in use. In such a case the feature bits of the management device can differ from (be a superset of) those of the vdpa device of this VF.
Hence, showing them on the device is useful.

As mentioned before in V2, commit [1] wrongly named the attribute VDPA_ATTR_DEV_SUPPORTED_FEATURES.
It should have been VDPA_ATTR_DEV_MGMTDEV_SUPPORTED_FEATURES.
Because it is in UAPI, and since we don't want to break compilation of iproute2,
it cannot be renamed anymore.

Given that, we do not want to start a trend of naming device attributes with an additional _VDPA_ in them, as done in this patch.
The error in commit [1] was an exception.

Hence, please reuse VDPA_ATTR_DEV_SUPPORTED_FEATURES to return the device features too.

Secondly, the commit log needs an output example showing the device features.

Thirdly, please drop the Fixes tag, as a new capability is not a fix.

[1] cd2629f6df1c ("vdpa: Support reporting max device capabilities ")

> Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
> ---
>  drivers/vdpa/vdpa.c       | 13 +++++++++----
>  include/uapi/linux/vdpa.h |  1 +
>  2 files changed, 10 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c index
> ebf2f363fbe7..9b0e39b2f022 100644
> --- a/drivers/vdpa/vdpa.c
> +++ b/drivers/vdpa/vdpa.c
> @@ -815,7 +815,7 @@ static int vdpa_dev_net_mq_config_fill(struct
> vdpa_device *vdev,  static int vdpa_dev_net_config_fill(struct vdpa_device
> *vdev, struct sk_buff *msg)  {
>  	struct virtio_net_config config = {};
> -	u64 features;
> +	u64 features_device, features_driver;
>  	u16 val_u16;
> 
>  	vdpa_get_config_unlocked(vdev, 0, &config, sizeof(config)); @@ -
> 832,12 +832,17 @@ static int vdpa_dev_net_config_fill(struct vdpa_device
> *vdev, struct sk_buff *ms
>  	if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MTU, val_u16))
>  		return -EMSGSIZE;
> 
> -	features = vdev->config->get_driver_features(vdev);
> -	if (nla_put_u64_64bit(msg,
> VDPA_ATTR_DEV_NEGOTIATED_FEATURES, features,
> +	features_driver = vdev->config->get_driver_features(vdev);
> +	if (nla_put_u64_64bit(msg,
> VDPA_ATTR_DEV_NEGOTIATED_FEATURES, features_driver,
> +			      VDPA_ATTR_PAD))
> +		return -EMSGSIZE;
> +
> +	features_device = vdev->config->get_device_features(vdev);
> +	if (nla_put_u64_64bit(msg,
> VDPA_ATTR_VDPA_DEV_SUPPORTED_FEATURES,
> +features_device,
>  			      VDPA_ATTR_PAD))
>  		return -EMSGSIZE;
> 
> -	return vdpa_dev_net_mq_config_fill(vdev, msg, features, &config);
> +	return vdpa_dev_net_mq_config_fill(vdev, msg, features_driver,
> +&config);
>  }
> 
>  static int
> diff --git a/include/uapi/linux/vdpa.h b/include/uapi/linux/vdpa.h index
> 25c55cab3d7c..39f1c3d7c112 100644
> --- a/include/uapi/linux/vdpa.h
> +++ b/include/uapi/linux/vdpa.h
> @@ -47,6 +47,7 @@ enum vdpa_attr {
>  	VDPA_ATTR_DEV_NEGOTIATED_FEATURES,	/* u64 */
>  	VDPA_ATTR_DEV_MGMTDEV_MAX_VQS,		/* u32 */
>  	VDPA_ATTR_DEV_SUPPORTED_FEATURES,	/* u64 */
> +	VDPA_ATTR_VDPA_DEV_SUPPORTED_FEATURES,	/* u64 */
> 
>  	VDPA_ATTR_DEV_QUEUE_INDEX,              /* u32 */
>  	VDPA_ATTR_DEV_VENDOR_ATTR_NAME,		/* string */
> --
> 2.31.1



* RE: [PATCH V3 5/6] vDPA: answer num of queue pairs = 1 to userspace when VIRTIO_NET_F_MQ == 0
  2022-07-01 13:28 ` [PATCH V3 5/6] vDPA: answer num of queue pairs = 1 to userspace when VIRTIO_NET_F_MQ == 0 Zhu Lingshan
@ 2022-07-01 22:07   ` Parav Pandit
  2022-07-08  6:21     ` Zhu, Lingshan
  2022-07-13  5:26     ` Michael S. Tsirkin
  0 siblings, 2 replies; 113+ messages in thread
From: Parav Pandit @ 2022-07-01 22:07 UTC (permalink / raw)
  To: Zhu Lingshan, jasowang, mst
  Cc: virtualization, netdev, xieyongji, gautam.dawar



> From: Zhu Lingshan <lingshan.zhu@intel.com>
> Sent: Friday, July 1, 2022 9:28 AM
> If VIRTIO_NET_F_MQ == 0, the virtio device should have one queue pair, so
> when userspace querying queue pair numbers, it should return mq=1 than
> zero.
> 
> Function vdpa_dev_net_config_fill() fills the attributions of the vDPA
> devices, so that it should call vdpa_dev_net_mq_config_fill() so the
> parameter in vdpa_dev_net_mq_config_fill() should be feature_device than
> feature_driver for the vDPA devices themselves
> 
> Before this change, when MQ = 0, iproute2 output:
> $vdpa dev config show vdpa0
> vdpa0: mac 00:e8:ca:11:be:05 link up link_announce false max_vq_pairs 0
> mtu 1500
>
The fix belongs in user space.
When the _MQ feature bit is not negotiated, the vdpa kernel code does not add the VDPA_ATTR_DEV_NET_CFG_MAX_VQP attribute.
When that attribute is not returned by the kernel, iproute2 should not show max_vq_pairs.

We have many config space fields that depend on feature bits, and some of them do not have any defaults.
To keep the presence of config space fields consistent across all of them, we don't want to show a default as below.

Please fix iproute2 so that it does not print max_vq_pairs when it is not returned by the kernel.
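
A minimal sketch of that behaviour, for illustration only. struct
cfg_reply and format_cfg() are hypothetical and are not iproute2 code;
the assumption is that the parser records whether
VDPA_ATTR_DEV_NET_CFG_MAX_VQP was present in the reply.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical parsed reply. has_max_vqp is true only if the kernel
 * included VDPA_ATTR_DEV_NET_CFG_MAX_VQP in the netlink message.
 */
struct cfg_reply {
	bool has_max_vqp;
	uint16_t max_vqp;
	uint16_t mtu;
};

/*
 * Format the fields into buf and return the number of characters
 * written. Assumes buf is large enough for the whole line.
 */
int format_cfg(const struct cfg_reply *r, char *buf, size_t len)
{
	int n = 0;

	/* Print max_vq_pairs only when the attribute was returned. */
	if (r->has_max_vqp)
		n += snprintf(buf + n, len - (size_t)n, "max_vq_pairs %u ",
			      (unsigned int)r->max_vqp);

	n += snprintf(buf + n, len - (size_t)n, "mtu %u",
		      (unsigned int)r->mtu);
	return n;
}
```

With has_max_vqp false the line contains only the MTU, so no default
value is invented on the user-space side.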
 
> After applying this commit, when MQ = 0, iproute2 output:
> $vdpa dev config show vdpa0
> vdpa0: mac 00:e8:ca:11:be:05 link up link_announce false max_vq_pairs 1
> mtu 1500
> 
> Fixes: a64917bc2e9b (vdpa: Provide interface to read driver features)
> Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
> ---
>  drivers/vdpa/vdpa.c | 7 ++++---
>  1 file changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c index
> d76b22b2f7ae..846dd37f3549 100644
> --- a/drivers/vdpa/vdpa.c
> +++ b/drivers/vdpa/vdpa.c
> @@ -806,9 +806,10 @@ static int vdpa_dev_net_mq_config_fill(struct
> vdpa_device *vdev,
>  	u16 val_u16;
> 
>  	if ((features & BIT_ULL(VIRTIO_NET_F_MQ)) == 0)
> -		return 0;
> +		val_u16 = 1;
> +	else
> +		val_u16 = __virtio16_to_cpu(true, config-
> >max_virtqueue_pairs);
> 
> -	val_u16 = le16_to_cpu(config->max_virtqueue_pairs);
>  	return nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MAX_VQP,
> val_u16);  }
> 
> @@ -842,7 +843,7 @@ static int vdpa_dev_net_config_fill(struct
> vdpa_device *vdev, struct sk_buff *ms
>  			      VDPA_ATTR_PAD))
>  		return -EMSGSIZE;
> 
> -	return vdpa_dev_net_mq_config_fill(vdev, msg, features_driver,
> &config);
> +	return vdpa_dev_net_mq_config_fill(vdev, msg, features_device,
> +&config);
>  }
> 
>  static int
> --
> 2.31.1



* RE: [PATCH V3 4/6] vDPA: !FEATURES_OK should not block querying device config space
  2022-07-01 13:28 ` [PATCH V3 4/6] vDPA: !FEATURES_OK should not block querying device config space Zhu Lingshan
@ 2022-07-01 22:12   ` Parav Pandit
  2022-07-08  6:22     ` Zhu, Lingshan
                       ` (2 more replies)
  0 siblings, 3 replies; 113+ messages in thread
From: Parav Pandit @ 2022-07-01 22:12 UTC (permalink / raw)
  To: Zhu Lingshan, jasowang, mst
  Cc: virtualization, netdev, xieyongji, gautam.dawar



> From: Zhu Lingshan <lingshan.zhu@intel.com>
> Sent: Friday, July 1, 2022 9:28 AM
> 
> Users may want to query the config space of a vDPA device, to choose a
> appropriate one for a certain guest. This means the users need to read the
> config space before FEATURES_OK, and the existence of config space
> contents does not depend on FEATURES_OK.
> 
> The spec says:
> The device MUST allow reading of any device-specific configuration field
> before FEATURES_OK is set by the driver. This includes fields which are
> conditional on feature bits, as long as those feature bits are offered by the
> device.
> 
> Fixes: 30ef7a8ac8a07 (vdpa: Read device configuration only if FEATURES_OK)
The fix is fine, but the Fixes tag needs the corrections described below.

The commit id above is 13 characters long; it should be 12.
Also, the tag should be in this format:
Fixes: 30ef7a8ac8a0 ("vdpa: Read device configuration only if FEATURES_OK")

Please run the checkpatch.pl script before posting patches to catch these errors.
There is a bot that looks at the Fixes tag and identifies the right kernel version to apply the fix to.
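
A simplified sketch of the expected shape, for illustration only.
fixes_tag_ok() is a hypothetical helper and is much simpler than what
checkpatch.pl actually does; it only accepts exactly 12 hex digits
followed by the subject in parentheses and double quotes.

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical checker: accepts 'Fixes: <12 hex digits> ("subject")'. */
bool fixes_tag_ok(const char *line)
{
	static const char prefix[] = "Fixes: ";
	size_t len;
	int i;

	if (strncmp(line, prefix, sizeof(prefix) - 1) != 0)
		return false;
	line += sizeof(prefix) - 1;

	/* Exactly 12 hex digits; stops safely at an early end of string. */
	for (i = 0; i < 12; i++)
		if (!isxdigit((unsigned char)line[i]))
			return false;
	line += 12;

	/* Remainder must look like: space, ("non-empty subject") */
	len = strlen(line);
	return len >= 6 &&
	       strncmp(line, " (\"", 3) == 0 &&
	       strcmp(line + len - 2, "\")") == 0;
}
```

The tag as posted fails this check twice: the 13th hex digit means the
id is not followed by a space, and the subject lacks the double quotes.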

> Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
> ---
>  drivers/vdpa/vdpa.c | 8 --------
>  1 file changed, 8 deletions(-)
> 
> diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c index
> 9b0e39b2f022..d76b22b2f7ae 100644
> --- a/drivers/vdpa/vdpa.c
> +++ b/drivers/vdpa/vdpa.c
> @@ -851,17 +851,9 @@ vdpa_dev_config_fill(struct vdpa_device *vdev,
> struct sk_buff *msg, u32 portid,  {
>  	u32 device_id;
>  	void *hdr;
> -	u8 status;
>  	int err;
> 
>  	down_read(&vdev->cf_lock);
> -	status = vdev->config->get_status(vdev);
> -	if (!(status & VIRTIO_CONFIG_S_FEATURES_OK)) {
> -		NL_SET_ERR_MSG_MOD(extack, "Features negotiation not
> completed");
> -		err = -EAGAIN;
> -		goto out;
> -	}
> -
>  	hdr = genlmsg_put(msg, portid, seq, &vdpa_nl_family, flags,
>  			  VDPA_CMD_DEV_CONFIG_GET);
>  	if (!hdr) {
> --
> 2.31.1



* RE: [PATCH V3 6/6] vDPA: fix 'cast to restricted le16' warnings in vdpa.c
  2022-07-01 13:28 ` [PATCH V3 6/6] vDPA: fix 'cast to restricted le16' warnings in vdpa.c Zhu Lingshan
@ 2022-07-01 22:18   ` Parav Pandit
  2022-07-08  6:25     ` Zhu, Lingshan
  2022-07-29  8:53   ` Michael S. Tsirkin
  1 sibling, 1 reply; 113+ messages in thread
From: Parav Pandit @ 2022-07-01 22:18 UTC (permalink / raw)
  To: Zhu Lingshan, jasowang, mst
  Cc: virtualization, netdev, xieyongji, gautam.dawar


> From: Zhu Lingshan <lingshan.zhu@intel.com>
> Sent: Friday, July 1, 2022 9:28 AM
> 
> This commit fixes spars warnings: cast to restricted __le16 in function
> vdpa_dev_net_config_fill() and
> vdpa_fill_stats_rec()
>
The Fixes tag is missing.

But I fail to understand the warning.
config.status is le16, and the API used converts le16 to cpu.
What is the warning about? Can you please explain?

> Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
> ---
>  drivers/vdpa/vdpa.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c index
> 846dd37f3549..ed49fe46a79e 100644
> --- a/drivers/vdpa/vdpa.c
> +++ b/drivers/vdpa/vdpa.c
> @@ -825,11 +825,11 @@ static int vdpa_dev_net_config_fill(struct
> vdpa_device *vdev, struct sk_buff *ms
>  		    config.mac))
>  		return -EMSGSIZE;
> 
> -	val_u16 = le16_to_cpu(config.status);
> +	val_u16 = __virtio16_to_cpu(true, config.status);
>  	if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_STATUS, val_u16))
>  		return -EMSGSIZE;
> 
> -	val_u16 = le16_to_cpu(config.mtu);
> +	val_u16 = __virtio16_to_cpu(true, config.mtu);
>  	if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MTU, val_u16))
>  		return -EMSGSIZE;
> 
> @@ -911,7 +911,7 @@ static int vdpa_fill_stats_rec(struct vdpa_device
> *vdev, struct sk_buff *msg,
>  	}
>  	vdpa_get_config_unlocked(vdev, 0, &config, sizeof(config));
> 
> -	max_vqp = le16_to_cpu(config.max_virtqueue_pairs);
> +	max_vqp = __virtio16_to_cpu(true, config.max_virtqueue_pairs);
>  	if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MAX_VQP,
> max_vqp))
>  		return -EMSGSIZE;
> 
> --
> 2.31.1



* Re: [PATCH V3 1/6] vDPA/ifcvf: get_config_size should return a value no greater than dev implementation
  2022-07-01 13:28 ` [PATCH V3 1/6] vDPA/ifcvf: get_config_size should return a value no greater than dev implementation Zhu Lingshan
@ 2022-07-04  4:39   ` Jason Wang
  2022-07-08  6:44     ` Zhu, Lingshan
  2022-07-13  5:31   ` Michael S. Tsirkin
  1 sibling, 1 reply; 113+ messages in thread
From: Jason Wang @ 2022-07-04  4:39 UTC (permalink / raw)
  To: Zhu Lingshan
  Cc: mst, virtualization, netdev, Parav Pandit, Yongji Xie, Dawar, Gautam

On Fri, Jul 1, 2022 at 9:36 PM Zhu Lingshan <lingshan.zhu@intel.com> wrote:
>
> ifcvf_get_config_size() should return a virtio device type specific value,
> however the ret_value should not be greater than the onboard size of
> the device implementation. E.g., for virtio_net, config_size should be
> the minimum value of sizeof(struct virtio_net_config) and the onboard
> cap size.

Rethinking this, I wonder what the value is of exposing device
implementation details to users. In any case, the parent is in charge of
"emulating" config space access.

If we do this, it's probably a blocker for cross-vendor stuff.

Thanks

>
> Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
> ---
>  drivers/vdpa/ifcvf/ifcvf_base.c | 13 +++++++++++--
>  drivers/vdpa/ifcvf/ifcvf_base.h |  2 ++
>  2 files changed, 13 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/vdpa/ifcvf/ifcvf_base.c b/drivers/vdpa/ifcvf/ifcvf_base.c
> index 48c4dadb0c7c..fb957b57941e 100644
> --- a/drivers/vdpa/ifcvf/ifcvf_base.c
> +++ b/drivers/vdpa/ifcvf/ifcvf_base.c
> @@ -128,6 +128,7 @@ int ifcvf_init_hw(struct ifcvf_hw *hw, struct pci_dev *pdev)
>                         break;
>                 case VIRTIO_PCI_CAP_DEVICE_CFG:
>                         hw->dev_cfg = get_cap_addr(hw, &cap);
> +                       hw->cap_dev_config_size = le32_to_cpu(cap.length);
>                         IFCVF_DBG(pdev, "hw->dev_cfg = %p\n", hw->dev_cfg);
>                         break;
>                 }
> @@ -233,15 +234,23 @@ int ifcvf_verify_min_features(struct ifcvf_hw *hw, u64 features)
>  u32 ifcvf_get_config_size(struct ifcvf_hw *hw)
>  {
>         struct ifcvf_adapter *adapter;
> +       u32 net_config_size = sizeof(struct virtio_net_config);
> +       u32 blk_config_size = sizeof(struct virtio_blk_config);
> +       u32 cap_size = hw->cap_dev_config_size;
>         u32 config_size;
>
>         adapter = vf_to_adapter(hw);
> +       /* If the onboard device config space size is greater than
> +        * the size of struct virtio_net/blk_config, only the spec
> +        * implementing contents size is returned, this is very
> +        * unlikely, defensive programming.
> +        */
>         switch (hw->dev_type) {
>         case VIRTIO_ID_NET:
> -               config_size = sizeof(struct virtio_net_config);
> +               config_size = cap_size >= net_config_size ? net_config_size : cap_size;
>                 break;
>         case VIRTIO_ID_BLOCK:
> -               config_size = sizeof(struct virtio_blk_config);
> +               config_size = cap_size >= blk_config_size ? blk_config_size : cap_size;
>                 break;
>         default:
>                 config_size = 0;
> diff --git a/drivers/vdpa/ifcvf/ifcvf_base.h b/drivers/vdpa/ifcvf/ifcvf_base.h
> index 115b61f4924b..f5563f665cc6 100644
> --- a/drivers/vdpa/ifcvf/ifcvf_base.h
> +++ b/drivers/vdpa/ifcvf/ifcvf_base.h
> @@ -87,6 +87,8 @@ struct ifcvf_hw {
>         int config_irq;
>         int vqs_reused_irq;
>         u16 nr_vring;
> +       /* VIRTIO_PCI_CAP_DEVICE_CFG size */
> +       u32 cap_dev_config_size;
>  };
>
>  struct ifcvf_adapter {
> --
> 2.31.1
>
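
The size selection in the quoted hunk can be sketched in userspace as
below. The helper name clamp_config_size() and the sizes in the note
after the code are assumptions for illustration, not ifcvf code.

```c
#include <stdint.h>

/*
 * Hypothetical helper mirroring the ternary in the quoted patch:
 * never report more config space than the struct defined for the
 * device type, and never more than the onboard capability provides.
 */
static uint32_t clamp_config_size(uint32_t cap_size, uint32_t struct_size)
{
	return cap_size >= struct_size ? struct_size : cap_size;
}
```

With an illustrative struct size of 24 bytes, a 64-byte capability
yields 24 and a 12-byte capability yields 12.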



* Re: [PATCH V3 2/6] vDPA/ifcvf: support userspace to query features and MQ of a management device
  2022-07-01 13:28 ` [PATCH V3 2/6] vDPA/ifcvf: support userspace to query features and MQ of a management device Zhu Lingshan
@ 2022-07-04  4:43   ` Jason Wang
  2022-07-08  6:54     ` Zhu, Lingshan
  0 siblings, 1 reply; 113+ messages in thread
From: Jason Wang @ 2022-07-04  4:43 UTC (permalink / raw)
  To: Zhu Lingshan, mst; +Cc: virtualization, netdev, parav, xieyongji, gautam.dawar


On 2022/7/1 21:28, Zhu Lingshan wrote:
> Adapting to current netlink interfaces, this commit allows userspace
> to query feature bits and MQ capability of a management device.
>
> Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
> ---
>   drivers/vdpa/ifcvf/ifcvf_base.c | 12 ++++++++++++
>   drivers/vdpa/ifcvf/ifcvf_base.h |  1 +
>   drivers/vdpa/ifcvf/ifcvf_main.c |  3 +++
>   3 files changed, 16 insertions(+)
>
> diff --git a/drivers/vdpa/ifcvf/ifcvf_base.c b/drivers/vdpa/ifcvf/ifcvf_base.c
> index fb957b57941e..7c5f1cc93ad9 100644
> --- a/drivers/vdpa/ifcvf/ifcvf_base.c
> +++ b/drivers/vdpa/ifcvf/ifcvf_base.c
> @@ -346,6 +346,18 @@ int ifcvf_set_vq_state(struct ifcvf_hw *hw, u16 qid, u16 num)
>   	return 0;
>   }
>   
> +u16 ifcvf_get_max_vq_pairs(struct ifcvf_hw *hw)
> +{
> +	struct virtio_net_config __iomem *config;
> +	u16 val, mq;
> +
> +	config = hw->dev_cfg;
> +	val = vp_ioread16((__le16 __iomem *)&config->max_virtqueue_pairs);
> +	mq = le16_to_cpu((__force __le16)val);
> +
> +	return mq;
> +}
> +
>   static int ifcvf_hw_enable(struct ifcvf_hw *hw)
>   {
>   	struct virtio_pci_common_cfg __iomem *cfg;
> diff --git a/drivers/vdpa/ifcvf/ifcvf_base.h b/drivers/vdpa/ifcvf/ifcvf_base.h
> index f5563f665cc6..d54a1bed212e 100644
> --- a/drivers/vdpa/ifcvf/ifcvf_base.h
> +++ b/drivers/vdpa/ifcvf/ifcvf_base.h
> @@ -130,6 +130,7 @@ u64 ifcvf_get_hw_features(struct ifcvf_hw *hw);
>   int ifcvf_verify_min_features(struct ifcvf_hw *hw, u64 features);
>   u16 ifcvf_get_vq_state(struct ifcvf_hw *hw, u16 qid);
>   int ifcvf_set_vq_state(struct ifcvf_hw *hw, u16 qid, u16 num);
> +u16 ifcvf_get_max_vq_pairs(struct ifcvf_hw *hw);
>   struct ifcvf_adapter *vf_to_adapter(struct ifcvf_hw *hw);
>   int ifcvf_probed_virtio_net(struct ifcvf_hw *hw);
>   u32 ifcvf_get_config_size(struct ifcvf_hw *hw);
> diff --git a/drivers/vdpa/ifcvf/ifcvf_main.c b/drivers/vdpa/ifcvf/ifcvf_main.c
> index 0a5670729412..3ff7096d30f1 100644
> --- a/drivers/vdpa/ifcvf/ifcvf_main.c
> +++ b/drivers/vdpa/ifcvf/ifcvf_main.c
> @@ -791,6 +791,9 @@ static int ifcvf_vdpa_dev_add(struct vdpa_mgmt_dev *mdev, const char *name,
>   	vf->hw_features = ifcvf_get_hw_features(vf);
>   	vf->config_size = ifcvf_get_config_size(vf);
>   
> +	ifcvf_mgmt_dev->mdev.max_supported_vqs = ifcvf_get_max_vq_pairs(vf);


Do we want the number of queue pairs (#qps) or the number of queues here?

FYI, vp_vdpa did:

drivers/vdpa/virtio_pci/vp_vdpa.c: mgtdev->max_supported_vqs = 
vp_modern_get_num_queues(mdev);

Thanks
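
To illustrate why the two counts differ: for a virtio-net device, each
queue pair contributes one receive and one transmit virtqueue, and the
control virtqueue (when offered) adds one more. A minimal sketch, where
net_total_vqs() is a hypothetical helper and not part of ifcvf or
vp_vdpa:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: total virtqueue count of a virtio-net device
 * given its queue-pair count and whether it has a control virtqueue.
 */
static uint32_t net_total_vqs(uint16_t max_vq_pairs, bool has_ctrl_vq)
{
	return 2u * max_vq_pairs + (has_ctrl_vq ? 1u : 0u);
}
```

So a device advertising 4 queue pairs plus a control virtqueue has 9
virtqueues in total, and reporting 4 where a queue count is expected
would understate it.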


> +	ifcvf_mgmt_dev->mdev.supported_features = vf->hw_features;
> +
>   	adapter->vdpa.mdev = &ifcvf_mgmt_dev->mdev;
>   	ret = _vdpa_register_device(&adapter->vdpa, vf->nr_vring);
>   	if (ret) {



* Re: [PATCH V3 3/6] vDPA: allow userspace to query features of a vDPA device
  2022-07-01 22:02   ` Parav Pandit
@ 2022-07-04  4:46     ` Jason Wang
  2022-07-04 12:53       ` Parav Pandit
  2022-07-08  6:16     ` Zhu, Lingshan
  1 sibling, 1 reply; 113+ messages in thread
From: Jason Wang @ 2022-07-04  4:46 UTC (permalink / raw)
  To: Parav Pandit, Zhu Lingshan, mst
  Cc: virtualization, netdev, xieyongji, gautam.dawar


On 2022/7/2 06:02, Parav Pandit wrote:
>
>> From: Zhu Lingshan <lingshan.zhu@intel.com>
>> Sent: Friday, July 1, 2022 9:28 AM
>>
>> This commit adds a new vDPA netlink attribution
>> VDPA_ATTR_VDPA_DEV_SUPPORTED_FEATURES. Userspace can query
>> features of vDPA devices through this new attr.
>>
>> Fixes: a64917bc2e9b vdpa: (Provide interface to read driver feature)
> Missing the "" in the line.
> I reviewed the patches again.
>
> However, this is not the fix.
> A fix cannot add a new UAPI.
>
> Code is already considering negotiated driver features to return the device config space.
> Hence it is fine.
>
> This patch intents to provide device features to user space.
> First what vdpa device are capable of, are already returned by features attribute on the management device.
> This is done in commit [1].
>
> The only reason to have it is, when one management device indicates that feature is supported, but device may end up not supporting this feature if such feature is shared with other devices on same physical device.
> For example all VFs may not be symmetric after large number of them are in use. In such case features bit of management device can differ (more features) than the vdpa device of this VF.
> Hence, showing on the device is useful.
>
> As mentioned before in V2, commit [1] has wrongly named the attribute to VDPA_ATTR_DEV_SUPPORTED_FEATURES.
> It should have been, VDPA_ATTR_DEV_MGMTDEV_SUPPORTED_FEATURES.
> Because it is in UAPI, and since we don't want to break compilation of iproute2,
> It cannot be renamed anymore.
>
> Given that, we do not want to start trend of naming device attributes with additional _VDPA_ to it as done in this patch.
> Error in commit [1] was exception.
>
> Hence, please reuse VDPA_ATTR_DEV_SUPPORTED_FEATURES to return for device features too.


This will probably break or confuse the existing userspace?

Thanks


>
> Secondly, you need output example for showing device features in the commit log.
>
> 3rd, please drop the fixes tag as new capability is not a fix.
>
> [1] cd2629f6df1c ("vdpa: Support reporting max device capabilities ")
>
>> Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
>> ---
>>   drivers/vdpa/vdpa.c       | 13 +++++++++----
>>   include/uapi/linux/vdpa.h |  1 +
>>   2 files changed, 10 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c
>> index ebf2f363fbe7..9b0e39b2f022 100644
>> --- a/drivers/vdpa/vdpa.c
>> +++ b/drivers/vdpa/vdpa.c
>> @@ -815,7 +815,7 @@ static int vdpa_dev_net_mq_config_fill(struct vdpa_device *vdev,
>>  static int vdpa_dev_net_config_fill(struct vdpa_device *vdev, struct sk_buff *msg)
>>  {
>>   	struct virtio_net_config config = {};
>> -	u64 features;
>> +	u64 features_device, features_driver;
>>   	u16 val_u16;
>>
>>   	vdpa_get_config_unlocked(vdev, 0, &config, sizeof(config));
>> @@ -832,12 +832,17 @@ static int vdpa_dev_net_config_fill(struct vdpa_device *vdev, struct sk_buff *ms
>>   	if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MTU, val_u16))
>>   		return -EMSGSIZE;
>>
>> -	features = vdev->config->get_driver_features(vdev);
>> -	if (nla_put_u64_64bit(msg, VDPA_ATTR_DEV_NEGOTIATED_FEATURES, features,
>> +	features_driver = vdev->config->get_driver_features(vdev);
>> +	if (nla_put_u64_64bit(msg, VDPA_ATTR_DEV_NEGOTIATED_FEATURES, features_driver,
>> +			      VDPA_ATTR_PAD))
>> +		return -EMSGSIZE;
>> +
>> +	features_device = vdev->config->get_device_features(vdev);
>> +	if (nla_put_u64_64bit(msg, VDPA_ATTR_VDPA_DEV_SUPPORTED_FEATURES, features_device,
>>   			      VDPA_ATTR_PAD))
>>   		return -EMSGSIZE;
>>
>> -	return vdpa_dev_net_mq_config_fill(vdev, msg, features, &config);
>> +	return vdpa_dev_net_mq_config_fill(vdev, msg, features_driver, &config);
>>   }
>>
>>   static int
>> diff --git a/include/uapi/linux/vdpa.h b/include/uapi/linux/vdpa.h
>> index 25c55cab3d7c..39f1c3d7c112 100644
>> --- a/include/uapi/linux/vdpa.h
>> +++ b/include/uapi/linux/vdpa.h
>> @@ -47,6 +47,7 @@ enum vdpa_attr {
>>   	VDPA_ATTR_DEV_NEGOTIATED_FEATURES,	/* u64 */
>>   	VDPA_ATTR_DEV_MGMTDEV_MAX_VQS,		/* u32 */
>>   	VDPA_ATTR_DEV_SUPPORTED_FEATURES,	/* u64 */
>> +	VDPA_ATTR_VDPA_DEV_SUPPORTED_FEATURES,	/* u64 */
>>
>>   	VDPA_ATTR_DEV_QUEUE_INDEX,              /* u32 */
>>   	VDPA_ATTR_DEV_VENDOR_ATTR_NAME,		/* string */
>> --
>> 2.31.1



* RE: [PATCH V3 3/6] vDPA: allow userspace to query features of a vDPA device
  2022-07-04  4:46     ` Jason Wang
@ 2022-07-04 12:53       ` Parav Pandit
  2022-07-05  7:59         ` Zhu, Lingshan
  0 siblings, 1 reply; 113+ messages in thread
From: Parav Pandit @ 2022-07-04 12:53 UTC (permalink / raw)
  To: Jason Wang, Zhu Lingshan, mst
  Cc: virtualization, netdev, xieyongji, gautam.dawar


> From: Jason Wang <jasowang@redhat.com>
> Sent: Monday, July 4, 2022 12:47 AM
> 
> 
> On 2022/7/2 06:02, Parav Pandit wrote:
> >
> >> From: Zhu Lingshan <lingshan.zhu@intel.com>
> >> Sent: Friday, July 1, 2022 9:28 AM
> >>
> >> This commit adds a new vDPA netlink attribution
> >> VDPA_ATTR_VDPA_DEV_SUPPORTED_FEATURES. Userspace can query
> features
> >> of vDPA devices through this new attr.
> >>
> >> Fixes: a64917bc2e9b vdpa: (Provide interface to read driver feature)
> > Missing the "" in the line.
> > I reviewed the patches again.
> >
> > However, this is not the fix.
> > A fix cannot add a new UAPI.
> >
> > Code is already considering negotiated driver features to return the device
> config space.
> > Hence it is fine.
> >
> > This patch intents to provide device features to user space.
> > First what vdpa device are capable of, are already returned by features
> attribute on the management device.
> > This is done in commit [1].
> >
> > The only reason to have it is, when one management device indicates that
> feature is supported, but device may end up not supporting this feature if
> such feature is shared with other devices on same physical device.
> > For example all VFs may not be symmetric after large number of them are
> in use. In such case features bit of management device can differ (more
> features) than the vdpa device of this VF.
> > Hence, showing on the device is useful.
> >
> > As mentioned before in V2, commit [1] has wrongly named the attribute to
> VDPA_ATTR_DEV_SUPPORTED_FEATURES.
> > It should have been,
> VDPA_ATTR_DEV_MGMTDEV_SUPPORTED_FEATURES.
> > Because it is in UAPI, and since we don't want to break compilation of
> > iproute2, It cannot be renamed anymore.
> >
> > Given that, we do not want to start trend of naming device attributes with
> additional _VDPA_ to it as done in this patch.
> > Error in commit [1] was exception.
> >
> > Hence, please reuse VDPA_ATTR_DEV_SUPPORTED_FEATURES to return
> for device features too.
> 
> 
> This will probably break or confuse the existing userspace?
>
It shouldn't break, because it is a new attribute on the device.
All attributes are per command, so the old one will not be confused either.
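
A rough sketch of the "per command" point: userspace already knows
which command a reply answers, so one attribute ID can carry a
different meaning per command. The DEMO_* enums and demo_attr_scope()
below are hypothetical stand-ins, not the values or parsing code from
include/uapi/linux/vdpa.h or iproute2.

```c
#include <string.h>

/* Hypothetical stand-ins for a mgmtdev query and a device config query. */
enum demo_cmd { DEMO_CMD_MGMTDEV_GET, DEMO_CMD_DEV_CONFIG_GET };
enum demo_attr { DEMO_ATTR_SUPPORTED_FEATURES };

/* Interpret an attribute in the context of the command it answers. */
static const char *demo_attr_scope(enum demo_cmd cmd, enum demo_attr attr)
{
	if (attr != DEMO_ATTR_SUPPORTED_FEATURES)
		return "unknown";

	switch (cmd) {
	case DEMO_CMD_MGMTDEV_GET:
		return "management device features";
	case DEMO_CMD_DEV_CONFIG_GET:
		return "vDPA device features";
	}
	return "unknown";
}
```

An old userspace that only looks for the attribute in mgmtdev replies
would, under this model, never see the device-level use of it.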


* Re: [PATCH V3 3/6] vDPA: allow userspace to query features of a vDPA device
  2022-07-04 12:53       ` Parav Pandit
@ 2022-07-05  7:59         ` Zhu, Lingshan
  2022-07-05 11:56           ` Parav Pandit
  0 siblings, 1 reply; 113+ messages in thread
From: Zhu, Lingshan @ 2022-07-05  7:59 UTC (permalink / raw)
  To: Parav Pandit, Jason Wang, mst
  Cc: virtualization, netdev, xieyongji, gautam.dawar



On 7/4/2022 8:53 PM, Parav Pandit wrote:
>> From: Jason Wang <jasowang@redhat.com>
>> Sent: Monday, July 4, 2022 12:47 AM
>>
>>
>> On 2022/7/2 06:02, Parav Pandit wrote:
>>>> From: Zhu Lingshan <lingshan.zhu@intel.com>
>>>> Sent: Friday, July 1, 2022 9:28 AM
>>>>
>>>> This commit adds a new vDPA netlink attribution
>>>> VDPA_ATTR_VDPA_DEV_SUPPORTED_FEATURES. Userspace can query
>> features
>>>> of vDPA devices through this new attr.
>>>>
>>>> Fixes: a64917bc2e9b vdpa: (Provide interface to read driver feature)
>>> Missing the "" in the line.
>>> I reviewed the patches again.
>>>
>>> However, this is not the fix.
>>> A fix cannot add a new UAPI.
>>>
>>> Code is already considering negotiated driver features to return the device
>> config space.
>>> Hence it is fine.
>>>
>>> This patch intents to provide device features to user space.
>>> First what vdpa device are capable of, are already returned by features
>> attribute on the management device.
>>> This is done in commit [1].
>>>
>>> The only reason to have it is, when one management device indicates that
>> feature is supported, but device may end up not supporting this feature if
>> such feature is shared with other devices on same physical device.
>>> For example all VFs may not be symmetric after large number of them are
>> in use. In such case features bit of management device can differ (more
>> features) than the vdpa device of this VF.
>>> Hence, showing on the device is useful.
>>>
>>> As mentioned before in V2, commit [1] has wrongly named the attribute to
>> VDPA_ATTR_DEV_SUPPORTED_FEATURES.
>>> It should have been,
>> VDPA_ATTR_DEV_MGMTDEV_SUPPORTED_FEATURES.
>>> Because it is in UAPI, and since we don't want to break compilation of
>>> iproute2, It cannot be renamed anymore.
>>>
>>> Given that, we do not want to start trend of naming device attributes with
>> additional _VDPA_ to it as done in this patch.
>>> Error in commit [1] was exception.
>>>
>>> Hence, please reuse VDPA_ATTR_DEV_SUPPORTED_FEATURES to return
>> for device features too.
>>
>>
>> This will probably break or confuse the existing userspace?
>>
> It shouldn't break, because its new attribute on the device.
> All attributes are per command, so old one will not be confused either.
A netlink attr should have its own unique purpose; that's why we 
don't need locks for the attrs: only one consumer and only one producer.
I am afraid that re-using the attr VDPA_ATTR_DEV_SUPPORTED_FEATURES (for 
both the management device and the vDPA device) would lead to a new race 
condition. E.g., it is possible to query the FEATURES of a management 
device and a vDPA device simultaneously, so could there be a syncing 
issue within a tick?

IMHO, I don't see any advantage in re-using this attr.

Thanks,
Zhu Lingshan



* RE: [PATCH V3 3/6] vDPA: allow userspace to query features of a vDPA device
  2022-07-05  7:59         ` Zhu, Lingshan
@ 2022-07-05 11:56           ` Parav Pandit
  2022-07-05 16:56             ` Zhu, Lingshan
  2022-07-27  8:15             ` Si-Wei Liu
  0 siblings, 2 replies; 113+ messages in thread
From: Parav Pandit @ 2022-07-05 11:56 UTC (permalink / raw)
  To: Zhu, Lingshan, Jason Wang, mst
  Cc: virtualization, netdev, xieyongji, gautam.dawar


> From: Zhu, Lingshan <lingshan.zhu@intel.com>
> Sent: Tuesday, July 5, 2022 3:59 AM
> 
> 
> On 7/4/2022 8:53 PM, Parav Pandit wrote:
> >> From: Jason Wang <jasowang@redhat.com>
> >> Sent: Monday, July 4, 2022 12:47 AM
> >>
> >>
> >> On 2022/7/2 06:02, Parav Pandit wrote:
> >>>> From: Zhu Lingshan <lingshan.zhu@intel.com>
> >>>> Sent: Friday, July 1, 2022 9:28 AM
> >>>>
> >>>> This commit adds a new vDPA netlink attribution
> >>>> VDPA_ATTR_VDPA_DEV_SUPPORTED_FEATURES. Userspace can query
> >> features
> >>>> of vDPA devices through this new attr.
> >>>>
> >>>> Fixes: a64917bc2e9b vdpa: (Provide interface to read driver
> >>>> feature)
> >>> Missing the "" in the line.
> >>> I reviewed the patches again.
> >>>
> >>> However, this is not the fix.
> >>> A fix cannot add a new UAPI.
> >>>
> >>> Code is already considering negotiated driver features to return the
> >>> device
> >> config space.
> >>> Hence it is fine.
> >>>
> >>> This patch intents to provide device features to user space.
> >>> First what vdpa device are capable of, are already returned by
> >>> features
> >> attribute on the management device.
> >>> This is done in commit [1].
> >>>
> >>> The only reason to have it is, when one management device indicates
> >>> that
> >> feature is supported, but device may end up not supporting this
> >> feature if such feature is shared with other devices on same physical device.
> >>> For example all VFs may not be symmetric after large number of them
> >>> are
> >> in use. In such case features bit of management device can differ
> >> (more
> >> features) than the vdpa device of this VF.
> >>> Hence, showing on the device is useful.
> >>>
> >>> As mentioned before in V2, commit [1] has wrongly named the
> >>> attribute to
> >> VDPA_ATTR_DEV_SUPPORTED_FEATURES.
> >>> It should have been,
> >> VDPA_ATTR_DEV_MGMTDEV_SUPPORTED_FEATURES.
> >>> Because it is in UAPI, and since we don't want to break compilation
> >>> of iproute2, It cannot be renamed anymore.
> >>>
> >>> Given that, we do not want to start trend of naming device
> >>> attributes with
> >> additional _VDPA_ to it as done in this patch.
> >>> Error in commit [1] was exception.
> >>>
> >>> Hence, please reuse VDPA_ATTR_DEV_SUPPORTED_FEATURES to return
> >> for device features too.
> >>
> >>
> >> This will probably break or confuse the existing userspace?
> >>
> > It shouldn't break, because its new attribute on the device.
> > All attributes are per command, so old one will not be confused either.
> A netlink attr should has its own and unique purpose, that's why we don't need
> locks for the attrs, only one consumer and only one producer.
> I am afraid re-using (for both management device and the vDPA device) the attr
> VDPA_ATTR_DEV_SUPPORTED_FEATURES would lead to new race condition.
> E.g., There are possibilities of querying FEATURES of a management device and
> a vDPA device simultaneously, or can there be a syncing issue in a tick?
Both can be queried simultaneously. Each will return its own feature bits using the same attribute.
It won't lead to a race.

> 
> IMHO, I don't see any advantages of re-using this attr.

We don't want to continue this mess of a VDPA_DEV prefix on new attributes because of the previous wrong naming.


* Re: [PATCH V3 3/6] vDPA: allow userspace to query features of a vDPA device
  2022-07-05 11:56           ` Parav Pandit
@ 2022-07-05 16:56             ` Zhu, Lingshan
  2022-07-05 17:01               ` Parav Pandit
  2022-07-27  8:15             ` Si-Wei Liu
  1 sibling, 1 reply; 113+ messages in thread
From: Zhu, Lingshan @ 2022-07-05 16:56 UTC (permalink / raw)
  To: Parav Pandit, Jason Wang, mst
  Cc: virtualization, netdev, xieyongji, gautam.dawar



On 7/5/2022 7:56 PM, Parav Pandit wrote:
>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
>> Sent: Tuesday, July 5, 2022 3:59 AM
>>
>>
>> On 7/4/2022 8:53 PM, Parav Pandit wrote:
>>>> From: Jason Wang <jasowang@redhat.com>
>>>> Sent: Monday, July 4, 2022 12:47 AM
>>>>
>>>>
>>>> On 2022/7/2 06:02, Parav Pandit wrote:
>>>>>> From: Zhu Lingshan <lingshan.zhu@intel.com>
>>>>>> Sent: Friday, July 1, 2022 9:28 AM
>>>>>>
>>>>>> This commit adds a new vDPA netlink attribution
>>>>>> VDPA_ATTR_VDPA_DEV_SUPPORTED_FEATURES. Userspace can query
>>>> features
>>>>>> of vDPA devices through this new attr.
>>>>>>
>>>>>> Fixes: a64917bc2e9b vdpa: (Provide interface to read driver
>>>>>> feature)
>>>>> Missing the "" in the line.
>>>>> I reviewed the patches again.
>>>>>
>>>>> However, this is not the fix.
>>>>> A fix cannot add a new UAPI.
>>>>>
>>>>> Code is already considering negotiated driver features to return the
>>>>> device
>>>> config space.
>>>>> Hence it is fine.
>>>>>
>>>>> This patch intents to provide device features to user space.
>>>>> First what vdpa device are capable of, are already returned by
>>>>> features
>>>> attribute on the management device.
>>>>> This is done in commit [1].
>>>>>
>>>>> The only reason to have it is, when one management device indicates
>>>>> that
>>>> feature is supported, but device may end up not supporting this
>>>> feature if such feature is shared with other devices on same physical device.
>>>>> For example all VFs may not be symmetric after large number of them
>>>>> are
>>>> in use. In such case features bit of management device can differ
>>>> (more
>>>> features) than the vdpa device of this VF.
>>>>> Hence, showing on the device is useful.
>>>>>
>>>>> As mentioned before in V2, commit [1] has wrongly named the
>>>>> attribute to
>>>> VDPA_ATTR_DEV_SUPPORTED_FEATURES.
>>>>> It should have been,
>>>> VDPA_ATTR_DEV_MGMTDEV_SUPPORTED_FEATURES.
>>>>> Because it is in UAPI, and since we don't want to break compilation
>>>>> of iproute2, It cannot be renamed anymore.
>>>>>
>>>>> Given that, we do not want to start trend of naming device
>>>>> attributes with
>>>> additional _VDPA_ to it as done in this patch.
>>>>> Error in commit [1] was exception.
>>>>>
>>>>> Hence, please reuse VDPA_ATTR_DEV_SUPPORTED_FEATURES to return
>>>> for device features too.
>>>>
>>>>
>>>> This will probably break or confuse the existing userspace?
>>>>
>>> It shouldn't break, because its new attribute on the device.
>>> All attributes are per command, so old one will not be confused either.
>> A netlink attr should has its own and unique purpose, that's why we don't need
>> locks for the attrs, only one consumer and only one producer.
>> I am afraid re-using (for both management device and the vDPA device) the attr
>> VDPA_ATTR_DEV_SUPPORTED_FEATURES would lead to new race condition.
>> E.g., There are possibilities of querying FEATURES of a management device and
>> a vDPA device simultaneously, or can there be a syncing issue in a tick?
> Both can be queried simultaneously. Each will return their own feature bits using same attribute.
> It wont lead to the race.
How? It is just a piece of memory, xxxx[attr]. Do you see locks in 
nla_put_u64_64bit()? It is a typical
race condition: data accessed by multiple producers / consumers.
And re-using a netlink attr is really confusing.
>
>> IMHO, I don't see any advantages of re-using this attr.
> We don’t want to continue this mess of VDPA_DEV prefix for new attributes due to previous wrong naming.
As you pointed out before, it is a wrong naming, and we can't rename it 
because we don't want to break the uAPI,
so a new attr is needed. If you don't like the name 
VDPA_ATTR_VDPA_DEV_SUPPORTED_FEATURES, you are more
than welcome to suggest a new one.

Thanks



* RE: [PATCH V3 3/6] vDPA: allow userspace to query features of a vDPA device
  2022-07-05 16:56             ` Zhu, Lingshan
@ 2022-07-05 17:01               ` Parav Pandit
  2022-07-06  2:25                 ` Zhu, Lingshan
  0 siblings, 1 reply; 113+ messages in thread
From: Parav Pandit @ 2022-07-05 17:01 UTC (permalink / raw)
  To: Zhu, Lingshan, Jason Wang, mst
  Cc: virtualization, netdev, xieyongji, gautam.dawar


> From: Zhu, Lingshan <lingshan.zhu@intel.com>
> Sent: Tuesday, July 5, 2022 12:56 PM
> > Both can be queried simultaneously. Each will return their own feature bits
> using same attribute.
> > It wont lead to the race.
> How? It is just a piece of memory, xxxx[attr], do you see locks in
> nla_put_u64_64bit()? It is a typical race condition, data accessed by multiple
> producers / consumers.
No. There is no race condition here.
And a new attribute enum by no means avoids any race.

Data put using nla_put() cannot be accessed until it is transferred.
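
A simplified userspace model of this point: an attribute "put" appends
a (type, value) record to the one message buffer it is handed, and the
attribute number is only a type tag, not a shared storage slot. The
struct and the demo_* helpers below are assumptions for illustration
and do not reproduce the kernel's sk_buff or nla_put() internals.

```c
#include <stdint.h>
#include <string.h>

#define DEMO_ATTR_FEATURES 1	/* illustrative attribute type tag */

/* Each reply owns its private buffer; nothing is indexed by attribute. */
struct demo_msg {
	uint8_t buf[64];
	size_t len;
};

/* Append one (type, u64 value) record to this message only. */
static int demo_put_u64(struct demo_msg *m, uint16_t type, uint64_t val)
{
	if (m->len + sizeof(type) + sizeof(val) > sizeof(m->buf))
		return -1;
	memcpy(m->buf + m->len, &type, sizeof(type));
	m->len += sizeof(type);
	memcpy(m->buf + m->len, &val, sizeof(val));
	m->len += sizeof(val);
	return 0;
}

/* Read back the value of the first record in a message. */
static uint64_t demo_get_u64(const struct demo_msg *m)
{
	uint64_t val;

	memcpy(&val, m->buf + sizeof(uint16_t), sizeof(val));
	return val;
}
```

Two replies that use the same attribute type, one for a management
device and one for a vDPA device, each keep their own value because
they live in separate buffers.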

> And re-use a netlink attr is really confusing.
Please add a comment for this variable explaining why it is shared, as an exception.

Before we start on that, can you share a real-world example of when this feature bitmap will have a different value than the mgmt. device bitmap?

> >
> >> IMHO, I don't see any advantages of re-using this attr.
> > We don’t want to continue this mess of VDPA_DEV prefix for new
> attributes due to previous wrong naming.
> as you point out before, is is a wrong naming, we can't re-nmme it because
> we don't want to break uAPI, so there needs a new attr, if you don't like the
> name VDPA_ATTR_VDPA_DEV_SUPPORTED_FEATURES, it is more than
> welcome to suggest a new one
> 
> Thanks



* Re: [PATCH V3 3/6] vDPA: allow userspace to query features of a vDPA device
  2022-07-05 17:01               ` Parav Pandit
@ 2022-07-06  2:25                 ` Zhu, Lingshan
  2022-07-06  2:28                   ` Parav Pandit
  2022-07-23 11:27                   ` Zhu, Lingshan
  0 siblings, 2 replies; 113+ messages in thread
From: Zhu, Lingshan @ 2022-07-06  2:25 UTC (permalink / raw)
  To: Parav Pandit, Jason Wang, mst
  Cc: virtualization, netdev, xieyongji, gautam.dawar



On 7/6/2022 1:01 AM, Parav Pandit wrote:
>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
>> Sent: Tuesday, July 5, 2022 12:56 PM
>>> Both can be queried simultaneously. Each will return their own feature bits
>> using same attribute.
>>> It wont lead to the race.
>> How? It is just a piece of memory, xxxx[attr], do you see locks in
>> nla_put_u64_64bit()? It is a typical race condition, data accessed by multiple
>> producers / consumers.
> No. There is no race condition in here.
> And new attribute enum by no means avoid any race.
>
> Data put using nla_put cannot be accessed until they are transferred.
How is this guaranteed? Do you see errors when calling nla_put_xxx() twice?
>
>> And re-use a netlink attr is really confusing.
> Please put comment for this variable explaining why it is shared for the exception.
>
> Before that lets start, can you share a real world example of when this feature bitmap will have different value than the mgmt. device bitmap value?
For example:
1. Migrating a VM to a node which has a more resourceful device. If 
the source-side device does not have the MQ, RSS or TSO feature, the vDPA 
device assigned to the VM does not have MQ, RSS or TSO either. When 
migrating to a node which has a device with MQ, RSS or TSO, we need to 
mask out MQ, RSS or TSO in the vDPA device when provisioning, so that 
the guest sees a consistent network device and the migration is 
transparent to it. 
This is an example where the management device may have different feature 
bits than the vDPA device.

2. SIOV: if a virtio device is capable of managing SIOV devices, and it 
exposes this capability through a feature bit (like what I am doing in the 
"transport virtqueue"),
we don't want the SIOV ADIs to have SIOV features, so the ADIs don't have 
the SIOV feature bit.
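
The masking in example 1 can be sketched as below. The feature bit
numbers are the ones defined for virtio-net in the virtio spec, while
provision_features() is a hypothetical helper and not an existing vDPA
API.

```c
#include <stdint.h>

/* Feature bit numbers as defined for virtio-net in the virtio spec. */
#define VIRTIO_NET_F_HOST_TSO4	11
#define VIRTIO_NET_F_HOST_TSO6	12
#define VIRTIO_NET_F_MQ		22
#define VIRTIO_NET_F_RSS	60

#define BIT_ULL(n)	(1ULL << (n))

/*
 * Hypothetical helper: features offered by a provisioned vDPA device,
 * i.e. the management device features minus those masked out.
 */
static uint64_t provision_features(uint64_t mgmt_features, uint64_t masked_out)
{
	return mgmt_features & ~masked_out;
}
```

If the destination's management device offers MQ, RSS and TSO but the
source did not, masking those bits gives the guest the same feature set
on both sides, and the two bitmaps then legitimately differ.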

Thanks
>
>>>> IMHO, I don't see any advantages of re-using this attr.
>>> We don’t want to continue this mess of VDPA_DEV prefix for new
>> attributes due to previous wrong naming.
>> as you point out before, is is a wrong naming, we can't re-nmme it because
>> we don't want to break uAPI, so there needs a new attr, if you don't like the
>> name VDPA_ATTR_VDPA_DEV_SUPPORTED_FEATURES, it is more than
>> welcome to suggest a new one
>>
>> Thanks



* RE: [PATCH V3 3/6] vDPA: allow userspace to query features of a vDPA device
  2022-07-06  2:25                 ` Zhu, Lingshan
@ 2022-07-06  2:28                   ` Parav Pandit
  2022-07-23 11:27                   ` Zhu, Lingshan
  1 sibling, 0 replies; 113+ messages in thread
From: Parav Pandit @ 2022-07-06  2:28 UTC (permalink / raw)
  To: Zhu, Lingshan, Jason Wang, mst
  Cc: virtualization, netdev, xieyongji, gautam.dawar


> From: Zhu, Lingshan <lingshan.zhu@intel.com>
> Sent: Tuesday, July 5, 2022 10:25 PM
> 1. When migrate the VM to a node which has a more resourceful device. If
> the source side device does not have MQ, RSS or TSO feature, the vDPA
> device assigned to the VM does not have MQ, RSS or TSO as well. When
> migrating to a node which has a device with MQ, RSS or TSO, to provide a
> consistent network device to the guest, to be transparent to the guest, we
> need to mask out MQ, RSS or TSO in the vDPA device when provisioning.
> This is an example that management device may have different feature bits
> than the vDPA device.
Yes. Right. 


* Re: [PATCH V3 3/6] vDPA: allow userspace to query features of a vDPA device
  2022-07-01 22:02   ` Parav Pandit
  2022-07-04  4:46     ` Jason Wang
@ 2022-07-08  6:16     ` Zhu, Lingshan
  2022-07-08 16:13       ` Parav Pandit
  1 sibling, 1 reply; 113+ messages in thread
From: Zhu, Lingshan @ 2022-07-08  6:16 UTC (permalink / raw)
  To: Parav Pandit, jasowang, mst
  Cc: virtualization, netdev, xieyongji, gautam.dawar



On 7/2/2022 6:02 AM, Parav Pandit wrote:
>
>> From: Zhu Lingshan <lingshan.zhu@intel.com>
>> Sent: Friday, July 1, 2022 9:28 AM
>>
>> This commit adds a new vDPA netlink attribution
>> VDPA_ATTR_VDPA_DEV_SUPPORTED_FEATURES. Userspace can query
>> features of vDPA devices through this new attr.
>>
>> Fixes: a64917bc2e9b vdpa: (Provide interface to read driver feature)
> Missing the "" in the line.
will fix
> I reviewed the patches again.
>
> However, this is not the fix.
> A fix cannot add a new UAPI.
I think we have discussed why we cannot rename the existing 
wrongly named attr, and why we cannot re-use it.
So are you suggesting that I remove this Fixes tag?
And why can't a fix add a new uAPI?
>
> Code is already considering negotiated driver features to return the device config space.
> Hence it is fine.
No, the spec says:
The device MUST allow reading of any device-specific configuration
field before FEATURES_OK is set by the driver.
>
> This patch intents to provide device features to user space.
> First what vdpa device are capable of, are already returned by features attribute on the management device.
> This is done in commit [1].
We have discussed this in another thread: the vDPA device feature bits
can be different from the management device feature bits.
>
> The only reason to have it is, when one management device indicates that feature is supported, but device may end up not supporting this feature if such feature is shared with other devices on same physical device.
> For example all VFs may not be symmetric after large number of them are in use. In such case features bit of management device can differ (more features) than the vdpa device of this VF.
> Hence, showing on the device is useful.
>
> As mentioned before in V2, commit [1] has wrongly named the attribute to VDPA_ATTR_DEV_SUPPORTED_FEATURES.
> It should have been, VDPA_ATTR_DEV_MGMTDEV_SUPPORTED_FEATURES.
> Because it is in UAPI, and since we don't want to break compilation of iproute2,
> It cannot be renamed anymore.
Yes, renaming it would break the current uAPI, so I cannot rename it.
>
> Given that, we do not want to start trend of naming device attributes with additional _VDPA_ to it as done in this patch.
> Error in commit [1] was exception.
>
> Hence, please reuse VDPA_ATTR_DEV_SUPPORTED_FEATURES to return for device features too.
>
> Secondly, you need output example for showing device features in the commit log.
>
> 3rd, please drop the fixes tag as new capability is not a fix.
>
> [1] cd2629f6df1c ("vdpa: Support reporting max device capabilities ")
>
>> Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
>> ---
>>   drivers/vdpa/vdpa.c       | 13 +++++++++----
>>   include/uapi/linux/vdpa.h |  1 +
>>   2 files changed, 10 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c index
>> ebf2f363fbe7..9b0e39b2f022 100644
>> --- a/drivers/vdpa/vdpa.c
>> +++ b/drivers/vdpa/vdpa.c
>> @@ -815,7 +815,7 @@ static int vdpa_dev_net_mq_config_fill(struct
>> vdpa_device *vdev,  static int vdpa_dev_net_config_fill(struct vdpa_device
>> *vdev, struct sk_buff *msg)  {
>>   	struct virtio_net_config config = {};
>> -	u64 features;
>> +	u64 features_device, features_driver;
>>   	u16 val_u16;
>>
>>   	vdpa_get_config_unlocked(vdev, 0, &config, sizeof(config)); @@ -
>> 832,12 +832,17 @@ static int vdpa_dev_net_config_fill(struct vdpa_device
>> *vdev, struct sk_buff *ms
>>   	if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MTU, val_u16))
>>   		return -EMSGSIZE;
>>
>> -	features = vdev->config->get_driver_features(vdev);
>> -	if (nla_put_u64_64bit(msg,
>> VDPA_ATTR_DEV_NEGOTIATED_FEATURES, features,
>> +	features_driver = vdev->config->get_driver_features(vdev);
>> +	if (nla_put_u64_64bit(msg,
>> VDPA_ATTR_DEV_NEGOTIATED_FEATURES, features_driver,
>> +			      VDPA_ATTR_PAD))
>> +		return -EMSGSIZE;
>> +
>> +	features_device = vdev->config->get_device_features(vdev);
>> +	if (nla_put_u64_64bit(msg,
>> VDPA_ATTR_VDPA_DEV_SUPPORTED_FEATURES,
>> +features_device,
>>   			      VDPA_ATTR_PAD))
>>   		return -EMSGSIZE;
>>
>> -	return vdpa_dev_net_mq_config_fill(vdev, msg, features, &config);
>> +	return vdpa_dev_net_mq_config_fill(vdev, msg, features_driver,
>> +&config);
>>   }
>>
>>   static int
>> diff --git a/include/uapi/linux/vdpa.h b/include/uapi/linux/vdpa.h index
>> 25c55cab3d7c..39f1c3d7c112 100644
>> --- a/include/uapi/linux/vdpa.h
>> +++ b/include/uapi/linux/vdpa.h
>> @@ -47,6 +47,7 @@ enum vdpa_attr {
>>   	VDPA_ATTR_DEV_NEGOTIATED_FEATURES,	/* u64 */
>>   	VDPA_ATTR_DEV_MGMTDEV_MAX_VQS,		/* u32 */
>>   	VDPA_ATTR_DEV_SUPPORTED_FEATURES,	/* u64 */
>> +	VDPA_ATTR_VDPA_DEV_SUPPORTED_FEATURES,	/* u64 */
>>
>>   	VDPA_ATTR_DEV_QUEUE_INDEX,              /* u32 */
>>   	VDPA_ATTR_DEV_VENDOR_ATTR_NAME,		/* string */
>> --
>> 2.31.1


^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [PATCH V3 5/6] vDPA: answer num of queue pairs = 1 to userspace when VIRTIO_NET_F_MQ == 0
  2022-07-01 22:07   ` Parav Pandit
@ 2022-07-08  6:21     ` Zhu, Lingshan
  2022-07-08 16:23       ` Parav Pandit
  2022-07-13  5:26     ` Michael S. Tsirkin
  1 sibling, 1 reply; 113+ messages in thread
From: Zhu, Lingshan @ 2022-07-08  6:21 UTC (permalink / raw)
  To: Parav Pandit, jasowang, mst
  Cc: virtualization, netdev, xieyongji, gautam.dawar



On 7/2/2022 6:07 AM, Parav Pandit wrote:
>
>> From: Zhu Lingshan <lingshan.zhu@intel.com>
>> Sent: Friday, July 1, 2022 9:28 AM
>> If VIRTIO_NET_F_MQ == 0, the virtio device should have one queue pair, so
>> when userspace querying queue pair numbers, it should return mq=1 than
>> zero.
>>
>> Function vdpa_dev_net_config_fill() fills the attributions of the vDPA
>> devices, so that it should call vdpa_dev_net_mq_config_fill() so the
>> parameter in vdpa_dev_net_mq_config_fill() should be feature_device than
>> feature_driver for the vDPA devices themselves
>>
>> Before this change, when MQ = 0, iproute2 output:
>> $vdpa dev config show vdpa0
>> vdpa0: mac 00:e8:ca:11:be:05 link up link_announce false max_vq_pairs 0
>> mtu 1500
>>
> The fix belongs to user space.
> When a feature bit _MQ is not negotiated, vdpa kernel space will not add attribute VDPA_ATTR_DEV_NET_CFG_MAX_VQP.
> When such attribute is not returned by kernel, max_vq_pairs should not be shown by the iproute2.
I think the userspace tool does not need to care whether MQ is offered
or negotiated; it just needs to read the number of queues. Without MQ
it is not the case that there are no queues: a virtio-net device still
has 1 queue pair, which means two queues.

Otherwise, how can you tell the user that there are only 2 queues? End
users may not know this is the default. They may mistake it for an
error or a defect.
>
> We have many config space fields that depend on the feature bits and some of them do not have any defaults.
> To keep consistency of existence of config space fields among all, we don't want to show default like below.
>
> Please fix the iproute2 to not print max_vq_pairs when it is not returned by the kernel.
>   
>> After applying this commit, when MQ = 0, iproute2 output:
>> $vdpa dev config show vdpa0
>> vdpa0: mac 00:e8:ca:11:be:05 link up link_announce false max_vq_pairs 1
>> mtu 1500
>>
>> Fixes: a64917bc2e9b (vdpa: Provide interface to read driver features)
>> Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
>> ---
>>   drivers/vdpa/vdpa.c | 7 ++++---
>>   1 file changed, 4 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c index
>> d76b22b2f7ae..846dd37f3549 100644
>> --- a/drivers/vdpa/vdpa.c
>> +++ b/drivers/vdpa/vdpa.c
>> @@ -806,9 +806,10 @@ static int vdpa_dev_net_mq_config_fill(struct
>> vdpa_device *vdev,
>>   	u16 val_u16;
>>
>>   	if ((features & BIT_ULL(VIRTIO_NET_F_MQ)) == 0)
>> -		return 0;
>> +		val_u16 = 1;
>> +	else
>> +		val_u16 = __virtio16_to_cpu(true, config-
>>> max_virtqueue_pairs);
>> -	val_u16 = le16_to_cpu(config->max_virtqueue_pairs);
>>   	return nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MAX_VQP,
>> val_u16);  }
>>
>> @@ -842,7 +843,7 @@ static int vdpa_dev_net_config_fill(struct
>> vdpa_device *vdev, struct sk_buff *ms
>>   			      VDPA_ATTR_PAD))
>>   		return -EMSGSIZE;
>>
>> -	return vdpa_dev_net_mq_config_fill(vdev, msg, features_driver,
>> &config);
>> +	return vdpa_dev_net_mq_config_fill(vdev, msg, features_device,
>> +&config);
>>   }
>>
>>   static int
>> --
>> 2.31.1


^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [PATCH V3 4/6] vDPA: !FEATURES_OK should not block querying device config space
  2022-07-01 22:12   ` Parav Pandit
@ 2022-07-08  6:22     ` Zhu, Lingshan
  2022-07-13  5:23     ` Michael S. Tsirkin
       [not found]     ` <00889067-50ac-d2cd-675f-748f171e5c83@oracle.com>
  2 siblings, 0 replies; 113+ messages in thread
From: Zhu, Lingshan @ 2022-07-08  6:22 UTC (permalink / raw)
  To: Parav Pandit, jasowang, mst
  Cc: virtualization, netdev, xieyongji, gautam.dawar



On 7/2/2022 6:12 AM, Parav Pandit wrote:
>
>> From: Zhu Lingshan <lingshan.zhu@intel.com>
>> Sent: Friday, July 1, 2022 9:28 AM
>>
>> Users may want to query the config space of a vDPA device, to choose a
>> appropriate one for a certain guest. This means the users need to read the
>> config space before FEATURES_OK, and the existence of config space
>> contents does not depend on FEATURES_OK.
>>
>> The spec says:
>> The device MUST allow reading of any device-specific configuration field
>> before FEATURES_OK is set by the driver. This includes fields which are
>> conditional on feature bits, as long as those feature bits are offered by the
>> device.
>>
>> Fixes: 30ef7a8ac8a07 (vdpa: Read device configuration only if FEATURES_OK)
> Fix is fine, but fixes tag needs correction described below.
>
> Above commit id is 13 letters should be 12.
> And
> It should be in format
> Fixes: 30ef7a8ac8a0 ("vdpa: Read device configuration only if FEATURES_OK")
>
> Please use checkpatch.pl script before posting the patches to catch these errors.
> There is a bot that looks at the fixes tag and identifies the right kernel version to apply this fix.
Strange, checkpatch.pl did not complain about this. I will fix the tag. Thanks.
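
For reference, the tag shape requested above can be checked mechanically.
The following is only a rough standalone sketch in C, not what checkpatch.pl
actually does: the function name fixes_tag_ok() is made up, and it enforces
exactly 12 hex digits as suggested in this thread.

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical check for the form:
 *   Fixes: <12 hex digits> ("<subject>")
 * Returns true only when the line matches that shape.
 */
static bool fixes_tag_ok(const char *line)
{
    static const char prefix[] = "Fixes: ";
    const size_t plen = sizeof(prefix) - 1;
    size_t i, len;

    if (strncmp(line, prefix, plen) != 0)
        return false;
    line += plen;

    /* Exactly 12 hex digits; a NUL byte fails isxdigit() and stops us. */
    for (i = 0; i < 12; i++)
        if (!isxdigit((unsigned char)line[i]))
            return false;

    /* The hash must be followed by a space and an opening (" */
    if (strncmp(line + 12, " (\"", 3) != 0)
        return false;

    /* The line must end with ") and contain a non-empty subject. */
    len = strlen(line);
    return len > 12 + 3 + 2 && strcmp(line + len - 2, "\")") == 0;
}
```

Under this check the 13-digit hash and the unquoted subject used in this
series are both rejected, while the corrected form quoted above passes.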
>
>> Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
>> ---
>>   drivers/vdpa/vdpa.c | 8 --------
>>   1 file changed, 8 deletions(-)
>>
>> diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c index
>> 9b0e39b2f022..d76b22b2f7ae 100644
>> --- a/drivers/vdpa/vdpa.c
>> +++ b/drivers/vdpa/vdpa.c
>> @@ -851,17 +851,9 @@ vdpa_dev_config_fill(struct vdpa_device *vdev,
>> struct sk_buff *msg, u32 portid,  {
>>   	u32 device_id;
>>   	void *hdr;
>> -	u8 status;
>>   	int err;
>>
>>   	down_read(&vdev->cf_lock);
>> -	status = vdev->config->get_status(vdev);
>> -	if (!(status & VIRTIO_CONFIG_S_FEATURES_OK)) {
>> -		NL_SET_ERR_MSG_MOD(extack, "Features negotiation not
>> completed");
>> -		err = -EAGAIN;
>> -		goto out;
>> -	}
>> -
>>   	hdr = genlmsg_put(msg, portid, seq, &vdpa_nl_family, flags,
>>   			  VDPA_CMD_DEV_CONFIG_GET);
>>   	if (!hdr) {
>> --
>> 2.31.1


^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [PATCH V3 6/6] vDPA: fix 'cast to restricted le16' warnings in vdpa.c
  2022-07-01 22:18   ` Parav Pandit
@ 2022-07-08  6:25     ` Zhu, Lingshan
  2022-07-08 16:08       ` Parav Pandit
  0 siblings, 1 reply; 113+ messages in thread
From: Zhu, Lingshan @ 2022-07-08  6:25 UTC (permalink / raw)
  To: Parav Pandit, jasowang, mst
  Cc: virtualization, netdev, xieyongji, gautam.dawar



On 7/2/2022 6:18 AM, Parav Pandit wrote:
>> From: Zhu Lingshan <lingshan.zhu@intel.com>
>> Sent: Friday, July 1, 2022 9:28 AM
>>
>> This commit fixes spars warnings: cast to restricted __le16 in function
>> vdpa_dev_net_config_fill() and
>> vdpa_fill_stats_rec()
>>
> Missing fixes tag.
I am not sure whether this deserves a Fixes tag; I will look into it.
>   
> But I fail to understand the warning.
> config.status is le16, and API used is to convert le16 to cpu.
> What is the warning about, can you please explain?
The warnings are:
drivers/vdpa/vdpa.c:828:19: warning: cast to restricted __le16
drivers/vdpa/vdpa.c:828:19: warning: cast from restricted __virtio16
>
>> Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
>> ---
>>   drivers/vdpa/vdpa.c | 6 +++---
>>   1 file changed, 3 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c index
>> 846dd37f3549..ed49fe46a79e 100644
>> --- a/drivers/vdpa/vdpa.c
>> +++ b/drivers/vdpa/vdpa.c
>> @@ -825,11 +825,11 @@ static int vdpa_dev_net_config_fill(struct
>> vdpa_device *vdev, struct sk_buff *ms
>>   		    config.mac))
>>   		return -EMSGSIZE;
>>
>> -	val_u16 = le16_to_cpu(config.status);
>> +	val_u16 = __virtio16_to_cpu(true, config.status);
>>   	if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_STATUS, val_u16))
>>   		return -EMSGSIZE;
>>
>> -	val_u16 = le16_to_cpu(config.mtu);
>> +	val_u16 = __virtio16_to_cpu(true, config.mtu);
>>   	if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MTU, val_u16))
>>   		return -EMSGSIZE;
>>
>> @@ -911,7 +911,7 @@ static int vdpa_fill_stats_rec(struct vdpa_device
>> *vdev, struct sk_buff *msg,
>>   	}
>>   	vdpa_get_config_unlocked(vdev, 0, &config, sizeof(config));
>>
>> -	max_vqp = le16_to_cpu(config.max_virtqueue_pairs);
>> +	max_vqp = __virtio16_to_cpu(true, config.max_virtqueue_pairs);
>>   	if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MAX_VQP,
>> max_vqp))
>>   		return -EMSGSIZE;
>>
>> --
>> 2.31.1


^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [PATCH V3 1/6] vDPA/ifcvf: get_config_size should return a value no greater than dev implementation
  2022-07-04  4:39   ` Jason Wang
@ 2022-07-08  6:44     ` Zhu, Lingshan
  2022-07-13  5:44       ` Michael S. Tsirkin
  0 siblings, 1 reply; 113+ messages in thread
From: Zhu, Lingshan @ 2022-07-08  6:44 UTC (permalink / raw)
  To: Jason Wang
  Cc: mst, virtualization, netdev, Parav Pandit, Yongji Xie, Dawar, Gautam



On 7/4/2022 12:39 PM, Jason Wang wrote:
> On Fri, Jul 1, 2022 at 9:36 PM Zhu Lingshan <lingshan.zhu@intel.com> wrote:
>> ifcvf_get_config_size() should return a virtio device type specific value,
>> however the ret_value should not be greater than the onboard size of
>> the device implementation. E.g., for virtio_net, config_size should be
>> the minimum value of sizeof(struct virtio_net_config) and the onboard
>> cap size.
> Rethink of this, I wonder what's the value of exposing device
> implementation details to users? Anyhow the parent is in charge of
> "emulating" config space accessing.
This will not be exposed to the users. It is an ifcvf internal helper
that gets the actual device config space size.

For example, suppose ifcvf drives an Intel virtio-net device. If the
device config space size is greater than
sizeof(struct virtio_net_config), the device has something beyond the
spec, i.e. some private fields. We don't want to expose these extra
private fields to the users, so in this case we only return what the
spec defines.

If the device config space size is less than
sizeof(struct virtio_net_config), the device did not implement all the
fields the spec defines, e.g. no RSS. In such cases we only return what
the device implements.

So this is defensive programming.
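
Put differently, the helper returns the smaller of the two sizes. A minimal
standalone sketch of that rule in C, where clamp_config_size() is a
hypothetical name and the sizes in the note below are made-up example values
rather than the real structure sizes:

```c
#include <stdint.h>

/*
 * Hypothetical helper mirroring the rule described above:
 *  - cap_size  is the length of the onboard device config capability
 *  - spec_size is sizeof() of the spec-defined config structure
 * The result never exceeds either of them.
 */
static uint32_t clamp_config_size(uint32_t cap_size, uint32_t spec_size)
{
    return cap_size < spec_size ? cap_size : spec_size;
}
```

A device with extra private fields (cap_size 64, spec_size 24) reports 24,
and a device that implements only part of the spec (cap_size 12) reports 12.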
>
> If we do this, it's probably a blocker for cross vendor stuff.
>
> Thanks
>
>> Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
>> ---
>>   drivers/vdpa/ifcvf/ifcvf_base.c | 13 +++++++++++--
>>   drivers/vdpa/ifcvf/ifcvf_base.h |  2 ++
>>   2 files changed, 13 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/vdpa/ifcvf/ifcvf_base.c b/drivers/vdpa/ifcvf/ifcvf_base.c
>> index 48c4dadb0c7c..fb957b57941e 100644
>> --- a/drivers/vdpa/ifcvf/ifcvf_base.c
>> +++ b/drivers/vdpa/ifcvf/ifcvf_base.c
>> @@ -128,6 +128,7 @@ int ifcvf_init_hw(struct ifcvf_hw *hw, struct pci_dev *pdev)
>>                          break;
>>                  case VIRTIO_PCI_CAP_DEVICE_CFG:
>>                          hw->dev_cfg = get_cap_addr(hw, &cap);
>> +                       hw->cap_dev_config_size = le32_to_cpu(cap.length);
>>                          IFCVF_DBG(pdev, "hw->dev_cfg = %p\n", hw->dev_cfg);
>>                          break;
>>                  }
>> @@ -233,15 +234,23 @@ int ifcvf_verify_min_features(struct ifcvf_hw *hw, u64 features)
>>   u32 ifcvf_get_config_size(struct ifcvf_hw *hw)
>>   {
>>          struct ifcvf_adapter *adapter;
>> +       u32 net_config_size = sizeof(struct virtio_net_config);
>> +       u32 blk_config_size = sizeof(struct virtio_blk_config);
>> +       u32 cap_size = hw->cap_dev_config_size;
>>          u32 config_size;
>>
>>          adapter = vf_to_adapter(hw);
>> +       /* If the onboard device config space size is greater than
>> +        * the size of struct virtio_net/blk_config, only the spec
>> +        * implementing contents size is returned, this is very
>> +        * unlikely, defensive programming.
>> +        */
>>          switch (hw->dev_type) {
>>          case VIRTIO_ID_NET:
>> -               config_size = sizeof(struct virtio_net_config);
>> +               config_size = cap_size >= net_config_size ? net_config_size : cap_size;
>>                  break;
>>          case VIRTIO_ID_BLOCK:
>> -               config_size = sizeof(struct virtio_blk_config);
>> +               config_size = cap_size >= blk_config_size ? blk_config_size : cap_size;
>>                  break;
>>          default:
>>                  config_size = 0;
>> diff --git a/drivers/vdpa/ifcvf/ifcvf_base.h b/drivers/vdpa/ifcvf/ifcvf_base.h
>> index 115b61f4924b..f5563f665cc6 100644
>> --- a/drivers/vdpa/ifcvf/ifcvf_base.h
>> +++ b/drivers/vdpa/ifcvf/ifcvf_base.h
>> @@ -87,6 +87,8 @@ struct ifcvf_hw {
>>          int config_irq;
>>          int vqs_reused_irq;
>>          u16 nr_vring;
>> +       /* VIRTIO_PCI_CAP_DEVICE_CFG size */
>> +       u32 cap_dev_config_size;
>>   };
>>
>>   struct ifcvf_adapter {
>> --
>> 2.31.1
>>


^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [PATCH V3 2/6] vDPA/ifcvf: support userspace to query features and MQ of a management device
  2022-07-04  4:43   ` Jason Wang
@ 2022-07-08  6:54     ` Zhu, Lingshan
  0 siblings, 0 replies; 113+ messages in thread
From: Zhu, Lingshan @ 2022-07-08  6:54 UTC (permalink / raw)
  To: Jason Wang, mst; +Cc: virtualization, netdev, parav, xieyongji, gautam.dawar



On 7/4/2022 12:43 PM, Jason Wang wrote:
>
> 在 2022/7/1 21:28, Zhu Lingshan 写道:
>> Adapting to current netlink interfaces, this commit allows userspace
>> to query feature bits and MQ capability of a management device.
>>
>> Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
>> ---
>>   drivers/vdpa/ifcvf/ifcvf_base.c | 12 ++++++++++++
>>   drivers/vdpa/ifcvf/ifcvf_base.h |  1 +
>>   drivers/vdpa/ifcvf/ifcvf_main.c |  3 +++
>>   3 files changed, 16 insertions(+)
>>
>> diff --git a/drivers/vdpa/ifcvf/ifcvf_base.c 
>> b/drivers/vdpa/ifcvf/ifcvf_base.c
>> index fb957b57941e..7c5f1cc93ad9 100644
>> --- a/drivers/vdpa/ifcvf/ifcvf_base.c
>> +++ b/drivers/vdpa/ifcvf/ifcvf_base.c
>> @@ -346,6 +346,18 @@ int ifcvf_set_vq_state(struct ifcvf_hw *hw, u16 
>> qid, u16 num)
>>       return 0;
>>   }
>>   +u16 ifcvf_get_max_vq_pairs(struct ifcvf_hw *hw)
>> +{
>> +    struct virtio_net_config __iomem *config;
>> +    u16 val, mq;
>> +
>> +    config = hw->dev_cfg;
>> +    val = vp_ioread16((__le16 __iomem *)&config->max_virtqueue_pairs);
>> +    mq = le16_to_cpu((__force __le16)val);
>> +
>> +    return mq;
>> +}
>> +
>>   static int ifcvf_hw_enable(struct ifcvf_hw *hw)
>>   {
>>       struct virtio_pci_common_cfg __iomem *cfg;
>> diff --git a/drivers/vdpa/ifcvf/ifcvf_base.h 
>> b/drivers/vdpa/ifcvf/ifcvf_base.h
>> index f5563f665cc6..d54a1bed212e 100644
>> --- a/drivers/vdpa/ifcvf/ifcvf_base.h
>> +++ b/drivers/vdpa/ifcvf/ifcvf_base.h
>> @@ -130,6 +130,7 @@ u64 ifcvf_get_hw_features(struct ifcvf_hw *hw);
>>   int ifcvf_verify_min_features(struct ifcvf_hw *hw, u64 features);
>>   u16 ifcvf_get_vq_state(struct ifcvf_hw *hw, u16 qid);
>>   int ifcvf_set_vq_state(struct ifcvf_hw *hw, u16 qid, u16 num);
>> +u16 ifcvf_get_max_vq_pairs(struct ifcvf_hw *hw);
>>   struct ifcvf_adapter *vf_to_adapter(struct ifcvf_hw *hw);
>>   int ifcvf_probed_virtio_net(struct ifcvf_hw *hw);
>>   u32 ifcvf_get_config_size(struct ifcvf_hw *hw);
>> diff --git a/drivers/vdpa/ifcvf/ifcvf_main.c 
>> b/drivers/vdpa/ifcvf/ifcvf_main.c
>> index 0a5670729412..3ff7096d30f1 100644
>> --- a/drivers/vdpa/ifcvf/ifcvf_main.c
>> +++ b/drivers/vdpa/ifcvf/ifcvf_main.c
>> @@ -791,6 +791,9 @@ static int ifcvf_vdpa_dev_add(struct 
>> vdpa_mgmt_dev *mdev, const char *name,
>>       vf->hw_features = ifcvf_get_hw_features(vf);
>>       vf->config_size = ifcvf_get_config_size(vf);
>>   +    ifcvf_mgmt_dev->mdev.max_supported_vqs = 
>> ifcvf_get_max_vq_pairs(vf);
>
>
> Do we want #qps or #queues?
>
> FYI, vp_vdpa did:
>
> drivers/vdpa/virtio_pci/vp_vdpa.c: mgtdev->max_supported_vqs = 
> vp_modern_get_num_queues(mdev);
Oh yes, it should be the number of queues. Will fix this.

Thanks
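
As a rough illustration of the difference between queue pairs and queues,
here is a small C sketch. It assumes the virtio-net layout of one receive and
one transmit queue per pair plus an optional control queue. The helper name
net_num_queues() is hypothetical, and whether the control queue should be
counted in max_supported_vqs is left open here.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: total number of virtqueues of a virtio-net
 * device with max_vq_pairs rx/tx pairs, optionally counting the
 * control virtqueue.
 */
static uint32_t net_num_queues(uint16_t max_vq_pairs, bool has_ctrl_vq)
{
    return 2u * max_vq_pairs + (has_ctrl_vq ? 1u : 0u);
}
```

So a device without MQ has 1 pair, i.e. 2 data queues, and a device with 4
pairs and a control queue has 9 queues.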
>
> Thanks
>
>
>> + ifcvf_mgmt_dev->mdev.supported_features = vf->hw_features;
>> +
>>       adapter->vdpa.mdev = &ifcvf_mgmt_dev->mdev;
>>       ret = _vdpa_register_device(&adapter->vdpa, vf->nr_vring);
>>       if (ret) {
>


^ permalink raw reply	[flat|nested] 113+ messages in thread

* RE: [PATCH V3 6/6] vDPA: fix 'cast to restricted le16' warnings in vdpa.c
  2022-07-08  6:25     ` Zhu, Lingshan
@ 2022-07-08 16:08       ` Parav Pandit
  0 siblings, 0 replies; 113+ messages in thread
From: Parav Pandit @ 2022-07-08 16:08 UTC (permalink / raw)
  To: Zhu, Lingshan, jasowang, mst
  Cc: virtualization, netdev, xieyongji, gautam.dawar


> From: Zhu, Lingshan <lingshan.zhu@intel.com>
> Sent: Friday, July 8, 2022 2:25 AM
> 
> 
> On 7/2/2022 6:18 AM, Parav Pandit wrote:
> >> From: Zhu Lingshan <lingshan.zhu@intel.com>
> >> Sent: Friday, July 1, 2022 9:28 AM
> >>
> >> This commit fixes spars warnings: cast to restricted __le16 in
> >> function
> >> vdpa_dev_net_config_fill() and
> >> vdpa_fill_stats_rec()
> >>
> > Missing fixes tag.
> I am not sure whether this deserve a fix tag, I will look into it.
> >
> > But I fail to understand the warning.
> > config.status is le16, and API used is to convert le16 to cpu.
> > What is the warning about, can you please explain?


I see it. It's not le16, it's __virtio16.
Please add a Fixes tag.
With that, Reviewed-by: Parav Pandit <parav@nvidia.com>
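
For readers who hit the same sparse warning: a __virtio16 field is not
unconditionally little endian, so the conversion has to be told which byte
order applies, and passing true selects little endian. The standalone C
sketch below only models that idea on raw bytes; virtio16_decode() is a
made-up name, not the kernel helper.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical userspace model of decoding a 16-bit virtio field.
 * raw[] holds the two bytes as they appear in the config space.
 * little_endian == true corresponds to the case used in the patch.
 */
static uint16_t virtio16_decode(bool little_endian, const uint8_t raw[2])
{
    if (little_endian)
        return (uint16_t)(raw[0] | (raw[1] << 8));
    return (uint16_t)((raw[0] << 8) | raw[1]);
}
```

For example, an MTU of 1500 (0x05dc) stored little endian is the byte
sequence dc 05, and decoding it with little_endian == true yields 1500 on
any host.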

> The warnings are:
> drivers/vdpa/vdpa.c:828:19: warning: cast to restricted __le16
> drivers/vdpa/vdpa.c:828:19: warning: cast from restricted __virtio16
> >
> >> Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
> >> ---
> >>   drivers/vdpa/vdpa.c | 6 +++---
> >>   1 file changed, 3 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c index
> >> 846dd37f3549..ed49fe46a79e 100644
> >> --- a/drivers/vdpa/vdpa.c
> >> +++ b/drivers/vdpa/vdpa.c
> >> @@ -825,11 +825,11 @@ static int vdpa_dev_net_config_fill(struct
> >> vdpa_device *vdev, struct sk_buff *ms
> >>   		    config.mac))
> >>   		return -EMSGSIZE;
> >>
> >> -	val_u16 = le16_to_cpu(config.status);
> >> +	val_u16 = __virtio16_to_cpu(true, config.status);
> >>   	if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_STATUS, val_u16))
> >>   		return -EMSGSIZE;
> >>
> >> -	val_u16 = le16_to_cpu(config.mtu);
> >> +	val_u16 = __virtio16_to_cpu(true, config.mtu);
> >>   	if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MTU, val_u16))
> >>   		return -EMSGSIZE;
> >>
> >> @@ -911,7 +911,7 @@ static int vdpa_fill_stats_rec(struct vdpa_device
> >> *vdev, struct sk_buff *msg,
> >>   	}
> >>   	vdpa_get_config_unlocked(vdev, 0, &config, sizeof(config));
> >>
> >> -	max_vqp = le16_to_cpu(config.max_virtqueue_pairs);
> >> +	max_vqp = __virtio16_to_cpu(true, config.max_virtqueue_pairs);
> >>   	if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MAX_VQP,
> >> max_vqp))
> >>   		return -EMSGSIZE;
> >>
> >> --
> >> 2.31.1


^ permalink raw reply	[flat|nested] 113+ messages in thread

* RE: [PATCH V3 3/6] vDPA: allow userspace to query features of a vDPA device
  2022-07-08  6:16     ` Zhu, Lingshan
@ 2022-07-08 16:13       ` Parav Pandit
  2022-07-11  2:18         ` Zhu, Lingshan
  0 siblings, 1 reply; 113+ messages in thread
From: Parav Pandit @ 2022-07-08 16:13 UTC (permalink / raw)
  To: Zhu, Lingshan, jasowang, mst
  Cc: virtualization, netdev, xieyongji, gautam.dawar



> From: Zhu, Lingshan <lingshan.zhu@intel.com>
> Sent: Friday, July 8, 2022 2:16 AM
> 
> On 7/2/2022 6:02 AM, Parav Pandit wrote:
> >
> >> From: Zhu Lingshan <lingshan.zhu@intel.com>
> >> Sent: Friday, July 1, 2022 9:28 AM
> >>
> >> This commit adds a new vDPA netlink attribution
> >> VDPA_ATTR_VDPA_DEV_SUPPORTED_FEATURES. Userspace can query
> features
> >> of vDPA devices through this new attr.
> >>
> >> Fixes: a64917bc2e9b vdpa: (Provide interface to read driver feature)
> > Missing the "" in the line.
> will fix
> > I reviewed the patches again.
> >
> > However, this is not the fix.
> > A fix cannot add a new UAPI.
> I think we have discussed this, on why we can not re-name the existing
> wrong named attr, and why we can not re-use the attr.
> So are you suggesting remove this fixes tag?
> And why a fix can not add a new uAPI?

Because a new attribute cannot fix any existing attribute.

What the patch does is show the current attributes of the vdpa device (which sometimes contain different values than those of the mgmt. device).
So it is new functionality, which cannot carry a Fixes tag.

> >
> > Code is already considering negotiated driver features to return the device
> config space.
> > Hence it is fine.
> No, the spec says:
> The device MUST allow reading of any device-specific configuration field
> before FEATURES_OK is set by the driver.
> >
> > This patch intents to provide device features to user space.
> > First what vdpa device are capable of, are already returned by features
> attribute on the management device.
> > This is done in commit [1].
> we have discussed this in another thread, vDPA device feature bits can be
> different from the management device feature bits.
> >
Yes. 
> > The only reason to have it is, when one management device indicates that
> feature is supported, but device may end up not supporting this feature if
> such feature is shared with other devices on same physical device.
> > For example all VFs may not be symmetric after large number of them are
> in use. In such case features bit of management device can differ (more
> features) than the vdpa device of this VF.
> > Hence, showing on the device is useful.
> >
> > As mentioned before in V2, commit [1] has wrongly named the attribute to
> VDPA_ATTR_DEV_SUPPORTED_FEATURES.
> > It should have been,
> VDPA_ATTR_DEV_MGMTDEV_SUPPORTED_FEATURES.
> > Because it is in UAPI, and since we don't want to break compilation of
> > iproute2, It cannot be renamed anymore.
> Yes, rename it will break current uAPI, so I can not rename it.
> >
I know, which is why this patch needs the changes listed in my previous email.

> > Given that, we do not want to start trend of naming device attributes with
> additional _VDPA_ to it as done in this patch.
> > Error in commit [1] was exception.
> >
> > Hence, please reuse VDPA_ATTR_DEV_SUPPORTED_FEATURES to return
> for device features too.
> >
> > Secondly, you need output example for showing device features in the
> commit log.
> >
> > 3rd, please drop the fixes tag as new capability is not a fix.
> >


^ permalink raw reply	[flat|nested] 113+ messages in thread

* RE: [PATCH V3 5/6] vDPA: answer num of queue pairs = 1 to userspace when VIRTIO_NET_F_MQ == 0
  2022-07-08  6:21     ` Zhu, Lingshan
@ 2022-07-08 16:23       ` Parav Pandit
  2022-07-11  2:29         ` Zhu, Lingshan
  0 siblings, 1 reply; 113+ messages in thread
From: Parav Pandit @ 2022-07-08 16:23 UTC (permalink / raw)
  To: Zhu, Lingshan, jasowang, mst
  Cc: virtualization, netdev, xieyongji, gautam.dawar


> From: Zhu, Lingshan <lingshan.zhu@intel.com>
> Sent: Friday, July 8, 2022 2:21 AM
> 
> 
> On 7/2/2022 6:07 AM, Parav Pandit wrote:
> >
> >> From: Zhu Lingshan <lingshan.zhu@intel.com>
> >> Sent: Friday, July 1, 2022 9:28 AM
> >> If VIRTIO_NET_F_MQ == 0, the virtio device should have one queue
> >> pair, so when userspace querying queue pair numbers, it should return
> >> mq=1 than zero.
> >>
> >> Function vdpa_dev_net_config_fill() fills the attributions of the
> >> vDPA devices, so that it should call vdpa_dev_net_mq_config_fill() so
> >> the parameter in vdpa_dev_net_mq_config_fill() should be
> >> feature_device than feature_driver for the vDPA devices themselves
> >>
> >> Before this change, when MQ = 0, iproute2 output:
> >> $vdpa dev config show vdpa0
> >> vdpa0: mac 00:e8:ca:11:be:05 link up link_announce false max_vq_pairs
> >> 0 mtu 1500
> >>
> > The fix belongs to user space.
> > When a feature bit _MQ is not negotiated, vdpa kernel space will not add
> attribute VDPA_ATTR_DEV_NET_CFG_MAX_VQP.
> > When such attribute is not returned by kernel, max_vq_pairs should not
> be shown by the iproute2.
> I think userspace tool does not need to care whether MQ is offered or
> negotiated, it just needs to read the number of queues there, so if no MQ, it
> is not "not any queues", there are still 1 queue pair to be a virtio-net device,
> means two queues.
> 
> If not, how can you tell the user there are only 2 queues? The end users may
> don't know this is default. They may misunderstand this as an error or
> defects.
> >
When max_vq_pairs is not shown, it means that the device didn't expose the MAX_VQ_PAIRS attribute to its guest users
(because _MQ was not negotiated).
It is not an error or a defect.
It shows precisely what is exposed.

User space will care when it wants to turn the _MQ feature bit and the MAX_QP value off or on.

Showing max_vq_pairs of 1 even when _MQ is not negotiated incorrectly says that max_vq_pairs is exposed to the guest, when it is not offered.

So, please fix iproute2 to not print max_vq_pairs when it is not returned by the kernel.
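
The behaviour being asked for can be sketched as follows. This is a
standalone C illustration, not iproute2 code: struct net_cfg_reply and
format_net_config() are hypothetical, and the has_max_vqp flag stands in for
"the kernel returned VDPA_ATTR_DEV_NET_CFG_MAX_VQP".

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical parsed reply: an attribute may be present or absent. */
struct net_cfg_reply {
    bool     has_max_vqp; /* attribute was returned by the kernel */
    uint16_t max_vqp;
    uint16_t mtu;
};

/*
 * Format the reply into buf. max_vq_pairs is printed only when the
 * attribute is present; no default is invented when it is missing.
 * Returns the number of characters written, or -1 if buf is too small.
 */
static int format_net_config(const struct net_cfg_reply *r,
                             char *buf, size_t len)
{
    int n = 0, m;

    if (r->has_max_vqp)
        n = snprintf(buf, len, "max_vq_pairs %u ", (unsigned)r->max_vqp);
    if (n < 0 || (size_t)n >= len)
        return -1;

    m = snprintf(buf + n, len - (size_t)n, "mtu %u", (unsigned)r->mtu);
    if (m < 0 || (size_t)m >= len - (size_t)n)
        return -1;

    return n + m;
}
```

With the attribute absent the output is just "mtu 1500"; with it present and
set to 4 the output is "max_vq_pairs 4 mtu 1500".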

> > We have many config space fields that depend on the feature bits and
> some of them do not have any defaults.
> > To keep consistency of existence of config space fields among all, we don't
> want to show default like below.
> >
> > Please fix the iproute2 to not print max_vq_pairs when it is not returned
> by the kernel.
> >
> >> After applying this commit, when MQ = 0, iproute2 output:
> >> $vdpa dev config show vdpa0
> >> vdpa0: mac 00:e8:ca:11:be:05 link up link_announce false max_vq_pairs
> >> 1 mtu 1500
> >>
> >> Fixes: a64917bc2e9b (vdpa: Provide interface to read driver features)
> >> Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
> >> ---
> >>   drivers/vdpa/vdpa.c | 7 ++++---
> >>   1 file changed, 4 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c index
> >> d76b22b2f7ae..846dd37f3549 100644
> >> --- a/drivers/vdpa/vdpa.c
> >> +++ b/drivers/vdpa/vdpa.c
> >> @@ -806,9 +806,10 @@ static int vdpa_dev_net_mq_config_fill(struct
> >> vdpa_device *vdev,
> >>   	u16 val_u16;
> >>
> >>   	if ((features & BIT_ULL(VIRTIO_NET_F_MQ)) == 0)
> >> -		return 0;
> >> +		val_u16 = 1;
> >> +	else
> >> +		val_u16 = __virtio16_to_cpu(true, config-
> >>> max_virtqueue_pairs);
> >> -	val_u16 = le16_to_cpu(config->max_virtqueue_pairs);
> >>   	return nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MAX_VQP,
> val_u16);
> >> }
> >>
> >> @@ -842,7 +843,7 @@ static int vdpa_dev_net_config_fill(struct
> >> vdpa_device *vdev, struct sk_buff *ms
> >>   			      VDPA_ATTR_PAD))
> >>   		return -EMSGSIZE;
> >>
> >> -	return vdpa_dev_net_mq_config_fill(vdev, msg, features_driver,
> >> &config);
> >> +	return vdpa_dev_net_mq_config_fill(vdev, msg, features_device,
> >> +&config);
> >>   }
> >>
> >>   static int
> >> --
> >> 2.31.1


^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [PATCH V3 3/6] vDPA: allow userspace to query features of a vDPA device
  2022-07-08 16:13       ` Parav Pandit
@ 2022-07-11  2:18         ` Zhu, Lingshan
  0 siblings, 0 replies; 113+ messages in thread
From: Zhu, Lingshan @ 2022-07-11  2:18 UTC (permalink / raw)
  To: Parav Pandit, jasowang, mst
  Cc: virtualization, netdev, xieyongji, gautam.dawar



On 7/9/2022 12:13 AM, Parav Pandit wrote:
>
>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
>> Sent: Friday, July 8, 2022 2:16 AM
>>
>> On 7/2/2022 6:02 AM, Parav Pandit wrote:
>>>> From: Zhu Lingshan <lingshan.zhu@intel.com>
>>>> Sent: Friday, July 1, 2022 9:28 AM
>>>>
>>>> This commit adds a new vDPA netlink attribution
>>>> VDPA_ATTR_VDPA_DEV_SUPPORTED_FEATURES. Userspace can query
>> features
>>>> of vDPA devices through this new attr.
>>>>
>>>> Fixes: a64917bc2e9b vdpa: (Provide interface to read driver feature)
>>> Missing the "" in the line.
>> will fix
>>> I reviewed the patches again.
>>>
>>> However, this is not the fix.
>>> A fix cannot add a new UAPI.
>> I think we have discussed this, on why we can not re-name the existing
>> wrong named attr, and why we can not re-use the attr.
>> So are you suggesting remove this fixes tag?
>> And why a fix can not add a new uAPI?
> Because a new attribute cannot fix any existing attribute.
>
> What is done in the patch is show current attributes of the vdpa device (which sometimes contains a different value than the mgmt. device).
> So it is a new functionality that cannot have fixes tag.
OK, I get the points now.
>
>>> Code is already considering negotiated driver features to return the device
>> config space.
>>> Hence it is fine.
>> No, the spec says:
>> The device MUST allow reading of any device-specific configuration field
>> before FEATURES_OK is set by the driver.
>>> This patch intents to provide device features to user space.
>>> First what vdpa device are capable of, are already returned by features
>> attribute on the management device.
>>> This is done in commit [1].
>> we have discussed this in another thread, vDPA device feature bits can be
>> different from the management device feature bits.
> Yes.
>>> The only reason to have it is, when one management device indicates that
>> feature is supported, but device may end up not supporting this feature if
>> such feature is shared with other devices on same physical device.
>>> For example all VFs may not be symmetric after large number of them are
>> in use. In such case features bit of management device can differ (more
>> features) than the vdpa device of this VF.
>>> Hence, showing on the device is useful.
>>>
>>> As mentioned before in V2, commit [1] has wrongly named the attribute to
>> VDPA_ATTR_DEV_SUPPORTED_FEATURES.
>>> It should have been,
>> VDPA_ATTR_DEV_MGMTDEV_SUPPORTED_FEATURES.
>>> Because it is in UAPI, and since we don't want to break compilation of
>>> iproute2, It cannot be renamed anymore.
>> Yes, rename it will break current uAPI, so I can not rename it.
> I know, which is why this patch needs to do following listed changes described in previous email.
>
>>> Given that, we do not want to start trend of naming device attributes with
>> additional _VDPA_ to it as done in this patch.
>>> Error in commit [1] was exception.
>>>
>>> Hence, please reuse VDPA_ATTR_DEV_SUPPORTED_FEATURES to return
>> for device features too.
>>> Secondly, you need output example for showing device features in the
>> commit log.
>>> 3rd, please drop the fixes tag as new capability is not a fix.
>>>


^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [PATCH V3 5/6] vDPA: answer num of queue pairs = 1 to userspace when VIRTIO_NET_F_MQ == 0
  2022-07-08 16:23       ` Parav Pandit
@ 2022-07-11  2:29         ` Zhu, Lingshan
  2022-07-12 16:48           ` Parav Pandit
  0 siblings, 1 reply; 113+ messages in thread
From: Zhu, Lingshan @ 2022-07-11  2:29 UTC (permalink / raw)
  To: Parav Pandit, jasowang, mst
  Cc: virtualization, netdev, xieyongji, gautam.dawar



On 7/9/2022 12:23 AM, Parav Pandit wrote:
>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
>> Sent: Friday, July 8, 2022 2:21 AM
>>
>>
>> On 7/2/2022 6:07 AM, Parav Pandit wrote:
>>>> From: Zhu Lingshan <lingshan.zhu@intel.com>
>>>> Sent: Friday, July 1, 2022 9:28 AM
>>>> If VIRTIO_NET_F_MQ == 0, the virtio device should have one queue
>>>> pair, so when userspace querying queue pair numbers, it should return
>>>> mq=1 than zero.
>>>>
>>>> Function vdpa_dev_net_config_fill() fills the attributions of the
>>>> vDPA devices, so that it should call vdpa_dev_net_mq_config_fill() so
>>>> the parameter in vdpa_dev_net_mq_config_fill() should be
>>>> feature_device than feature_driver for the vDPA devices themselves
>>>>
>>>> Before this change, when MQ = 0, iproute2 output:
>>>> $vdpa dev config show vdpa0
>>>> vdpa0: mac 00:e8:ca:11:be:05 link up link_announce false max_vq_pairs
>>>> 0 mtu 1500
>>>>
>>> The fix belongs to user space.
>>> When a feature bit _MQ is not negotiated, vdpa kernel space will not add
>> attribute VDPA_ATTR_DEV_NET_CFG_MAX_VQP.
>>> When such attribute is not returned by kernel, max_vq_pairs should not
>> be shown by the iproute2.
>> I think userspace tool does not need to care whether MQ is offered or
>> negotiated, it just needs to read the number of queues there, so if no MQ, it
>> is not "not any queues", there are still 1 queue pair to be a virtio-net device,
>> means two queues.
>>
>> If not, how can you tell the user there are only 2 queues? The end users may
>> don't know this is default. They may misunderstand this as an error or
>> defects.
> When max_vq_pairs is not shown, it means that device didn’t expose MAX_VQ_PAIRS attribute to its guest users.
> (Because _MQ was not negotiated).
> It is not error or defect.
> It precisely shows what is exposed.
>
> User space will care when it wants to turn off/on _MQ feature bits and MAX_QP values.
>
> Showing max_vq_pairs of 1 even when _MQ is not negotiated, incorrectly says that max_vq_pairs is exposed to the guest, but it is not offered.
>
> So, please fix the iproute2 to not print max_vq_pairs when it is not returned by the kernel.
iproute2 can already report whether the MQ feature is present in the device / driver
feature bits.
I think iproute2 is only querying the maximum number of queue pairs here.

max_vq_pairs shows how many queue pairs there are. This attribute's existence does not depend on MQ:
if there is no MQ, there is still one queue pair, so just show one.

>
>>> We have many config space fields that depend on the feature bits and
>> some of them do not have any defaults.
>>> To keep consistency of existence of config space fields among all, we don't
>> want to show default like below.
>>> Please fix the iproute2 to not print max_vq_pairs when it is not returned
>> by the kernel.
>>>> After applying this commit, when MQ = 0, iproute2 output:
>>>> $vdpa dev config show vdpa0
>>>> vdpa0: mac 00:e8:ca:11:be:05 link up link_announce false max_vq_pairs
>>>> 1 mtu 1500
>>>>
>>>> Fixes: a64917bc2e9b (vdpa: Provide interface to read driver features)
>>>> Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
>>>> ---
>>>>    drivers/vdpa/vdpa.c | 7 ++++---
>>>>    1 file changed, 4 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c index
>>>> d76b22b2f7ae..846dd37f3549 100644
>>>> --- a/drivers/vdpa/vdpa.c
>>>> +++ b/drivers/vdpa/vdpa.c
>>>> @@ -806,9 +806,10 @@ static int vdpa_dev_net_mq_config_fill(struct
>>>> vdpa_device *vdev,
>>>>    	u16 val_u16;
>>>>
>>>>    	if ((features & BIT_ULL(VIRTIO_NET_F_MQ)) == 0)
>>>> -		return 0;
>>>> +		val_u16 = 1;
>>>> +	else
>>>> +		val_u16 = __virtio16_to_cpu(true, config->max_virtqueue_pairs);
>>>> -	val_u16 = le16_to_cpu(config->max_virtqueue_pairs);
>>>>    	return nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MAX_VQP,
>> val_u16);
>>>> }
>>>>
>>>> @@ -842,7 +843,7 @@ static int vdpa_dev_net_config_fill(struct
>>>> vdpa_device *vdev, struct sk_buff *ms
>>>>    			      VDPA_ATTR_PAD))
>>>>    		return -EMSGSIZE;
>>>>
>>>> -	return vdpa_dev_net_mq_config_fill(vdev, msg, features_driver,
>>>> &config);
>>>> +	return vdpa_dev_net_mq_config_fill(vdev, msg, features_device,
>>>> +&config);
>>>>    }
>>>>
>>>>    static int
>>>> --
>>>> 2.31.1


^ permalink raw reply	[flat|nested] 113+ messages in thread

* RE: [PATCH V3 5/6] vDPA: answer num of queue pairs = 1 to userspace when VIRTIO_NET_F_MQ == 0
  2022-07-11  2:29         ` Zhu, Lingshan
@ 2022-07-12 16:48           ` Parav Pandit
  2022-07-13  3:03             ` Zhu, Lingshan
  0 siblings, 1 reply; 113+ messages in thread
From: Parav Pandit @ 2022-07-12 16:48 UTC (permalink / raw)
  To: Zhu, Lingshan, jasowang, mst
  Cc: virtualization, netdev, xieyongji, gautam.dawar


> From: Zhu, Lingshan <lingshan.zhu@intel.com>
> Sent: Sunday, July 10, 2022 10:30 PM

> > Showing max_vq_pairs of 1 even when _MQ is not negotiated, incorrectly
> says that max_vq_pairs is exposed to the guest, but it is not offered.
> >
> > So, please fix the iproute2 to not print max_vq_pairs when it is not
> returned by the kernel.
> iproute2 can report whether there is MQ feature in the device / driver
> feature bits.
> I think iproute2 only queries the number of max queues here.
> 
> max_vq_pairs shows how many queue pairs there, this attribute's existence
> does not depend on MQ, if no MQ, there are still one queue pair, so just
> show one.
This netlink attribute's existence depends on the existence of the _MQ feature bit.
We can break that and report the value, but if we do, there are many other config space fields that, unlike max_vq_pairs, do not have a good default.
That leaves ambiguity about what to do with them, in user space and in kernel space alike.
Instead of dealing with them differently in the kernel, at present we attach each netlink attribute to its respective feature bit wherever applicable.
That keeps the code in kernel and user space uniform in how it handles them.

^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [PATCH V3 5/6] vDPA: answer num of queue pairs = 1 to userspace when VIRTIO_NET_F_MQ == 0
  2022-07-12 16:48           ` Parav Pandit
@ 2022-07-13  3:03             ` Zhu, Lingshan
  2022-07-13  3:06               ` Parav Pandit
  0 siblings, 1 reply; 113+ messages in thread
From: Zhu, Lingshan @ 2022-07-13  3:03 UTC (permalink / raw)
  To: Parav Pandit, jasowang, mst
  Cc: virtualization, netdev, xieyongji, gautam.dawar



On 7/13/2022 12:48 AM, Parav Pandit wrote:
>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
>> Sent: Sunday, July 10, 2022 10:30 PM
>>> Showing max_vq_pairs of 1 even when _MQ is not negotiated, incorrectly
>> says that max_vq_pairs is exposed to the guest, but it is not offered.
>>> So, please fix the iproute2 to not print max_vq_pairs when it is not
>> returned by the kernel.
>> iproute2 can report whether there is MQ feature in the device / driver
>> feature bits.
>> I think iproute2 only queries the number of max queues here.
>>
>> max_vq_pairs shows how many queue pairs there, this attribute's existence
>> does not depend on MQ, if no MQ, there are still one queue pair, so just
>> show one.
> This netlink attribute's existence is depending on the _MQ feature bit existence.
why? If no MQ, then no queues?
> We can break that and report the value, but if we break that there are many other config space bits who doesn’t have good default like max_vq_pairs.
max_vq_pairs may not have a default value, but we know that if there is no
MQ, a virtio-net device still has one queue pair in order to be functional.
> There is ambiguity for user space what to do with it and so in the kernel space..
> Instead of dealing with them differently in kernel, at present we attach each netlink attribute to a respective feature bit wherever applicable.
> And code in kernel and user space is uniform to handle them.
I get your point, but you see, by "max_vq_pairs" the user space tool is
asking how many queue pairs there are; it is not asking whether the device
has MQ.
Even without _MQ, we still need to tell the users that there is one queue
pair; otherwise it is not a functional virtio-net device, and
we should detect this error earlier, during device initialization.

I think it is still uniform: if there is _MQ, we return
cfg.max_queue_pair; if there is no _MQ, we return 1, still via netlink.

Thanks


^ permalink raw reply	[flat|nested] 113+ messages in thread

* RE: [PATCH V3 5/6] vDPA: answer num of queue pairs = 1 to userspace when VIRTIO_NET_F_MQ == 0
  2022-07-13  3:03             ` Zhu, Lingshan
@ 2022-07-13  3:06               ` Parav Pandit
  2022-07-13  3:45                 ` Zhu, Lingshan
  0 siblings, 1 reply; 113+ messages in thread
From: Parav Pandit @ 2022-07-13  3:06 UTC (permalink / raw)
  To: Zhu, Lingshan, jasowang, mst
  Cc: virtualization, netdev, xieyongji, gautam.dawar


> From: Zhu, Lingshan <lingshan.zhu@intel.com>
> Sent: Tuesday, July 12, 2022 11:03 PM
> 
> 
> On 7/13/2022 12:48 AM, Parav Pandit wrote:
> >> From: Zhu, Lingshan <lingshan.zhu@intel.com>
> >> Sent: Sunday, July 10, 2022 10:30 PM
> >>> Showing max_vq_pairs of 1 even when _MQ is not negotiated,
> >>> incorrectly
> >> says that max_vq_pairs is exposed to the guest, but it is not offered.
> >>> So, please fix the iproute2 to not print max_vq_pairs when it is not
> >> returned by the kernel.
> >> iproute2 can report whether there is MQ feature in the device /
> >> driver feature bits.
> >> I think iproute2 only queries the number of max queues here.
> >>
> >> max_vq_pairs shows how many queue pairs there, this attribute's
> >> existence does not depend on MQ, if no MQ, there are still one queue
> >> pair, so just show one.
> > This netlink attribute's existence is depending on the _MQ feature bit
> existence.
> why? If no MQ, then no queues?
> > We can break that and report the value, but if we break that there are
> many other config space bits who doesn’t have good default like
> max_vq_pairs.
> max_vq_paris may not have a default value, but we know if there is no MQ,
> a virtio-net still have one queue pair to be functional.
> > There is ambiguity for user space what to do with it and so in the kernel
> space..
> > Instead of dealing with them differently in kernel, at present we attach
> each netlink attribute to a respective feature bit wherever applicable.
> > And code in kernel and user space is uniform to handle them.
> I get your point, but you see, by "max_vq_pairs", the user space tool is
> asking how many queue pairs there, it is not asking whether the device have
> MQ.
> Even no _MQ, we still need to tell the users that there are one queue pair, or
> it is not a functional virtio-net, we should detect this error earlier in the
> device initialization.
It is not an error. :)

When the user space that invokes the netlink commands detects that _MQ is not supported, it takes max_queue_pair = 1 by itself.
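
A minimal sketch of that user space fallback, in standalone C. The struct cfg_reply type and the effective_queue_pairs() helper are hypothetical names used only for illustration; they are not taken from iproute2 or the kernel. The only assumption is that the tool has already parsed the netlink reply and knows whether VDPA_ATTR_DEV_NET_CFG_MAX_VQP was present.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical parsed view of a netlink reply; not the iproute2 structure. */
struct cfg_reply {
	bool has_max_vqp;	/* true if VDPA_ATTR_DEV_NET_CFG_MAX_VQP was present */
	uint16_t max_vqp;	/* only meaningful when has_max_vqp is true */
};

/*
 * Number of queue pairs as user space would derive it: use the reported
 * value when the attribute is present, otherwise assume the single queue
 * pair that a virtio-net device has without _MQ.
 */
static uint16_t effective_queue_pairs(const struct cfg_reply *r)
{
	assert(r != NULL);
	return r->has_max_vqp ? r->max_vqp : 1;
}
```

With this approach the kernel reply stays tied to the feature bit, and every user space consumer has to carry the same default on its own.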

> 
> I think it is still uniform, it there is _MQ, we return cfg.max_queue_pair, if no
> _MQ, return 1, still by netlink.
Better to do that in user space because we cannot do same for other config fields.

> 
> Thanks


^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [PATCH V3 5/6] vDPA: answer num of queue pairs = 1 to userspace when VIRTIO_NET_F_MQ == 0
  2022-07-13  3:06               ` Parav Pandit
@ 2022-07-13  3:45                 ` Zhu, Lingshan
  2022-07-26 15:56                   ` Parav Pandit
  0 siblings, 1 reply; 113+ messages in thread
From: Zhu, Lingshan @ 2022-07-13  3:45 UTC (permalink / raw)
  To: Parav Pandit, jasowang, mst
  Cc: virtualization, netdev, xieyongji, gautam.dawar



On 7/13/2022 11:06 AM, Parav Pandit wrote:
>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
>> Sent: Tuesday, July 12, 2022 11:03 PM
>>
>>
>> On 7/13/2022 12:48 AM, Parav Pandit wrote:
>>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
>>>> Sent: Sunday, July 10, 2022 10:30 PM
>>>>> Showing max_vq_pairs of 1 even when _MQ is not negotiated,
>>>>> incorrectly
>>>> says that max_vq_pairs is exposed to the guest, but it is not offered.
>>>>> So, please fix the iproute2 to not print max_vq_pairs when it is not
>>>> returned by the kernel.
>>>> iproute2 can report whether there is MQ feature in the device /
>>>> driver feature bits.
>>>> I think iproute2 only queries the number of max queues here.
>>>>
>>>> max_vq_pairs shows how many queue pairs there, this attribute's
>>>> existence does not depend on MQ, if no MQ, there are still one queue
>>>> pair, so just show one.
>>> This netlink attribute's existence is depending on the _MQ feature bit
>> existence.
>> why? If no MQ, then no queues?
>>> We can break that and report the value, but if we break that there are
>> many other config space bits who doesn’t have good default like
>> max_vq_pairs.
>> max_vq_paris may not have a default value, but we know if there is no MQ,
>> a virtio-net still have one queue pair to be functional.
>>> There is ambiguity for user space what to do with it and so in the kernel
>> space..
>>> Instead of dealing with them differently in kernel, at present we attach
>> each netlink attribute to a respective feature bit wherever applicable.
>>> And code in kernel and user space is uniform to handle them.
>> I get your point, but you see, by "max_vq_pairs", the user space tool is
>> asking how many queue pairs there, it is not asking whether the device have
>> MQ.
>> Even no _MQ, we still need to tell the users that there are one queue pair, or
>> it is not a functional virtio-net, we should detect this error earlier in the
>> device initialization.
> It is not an error. :)
I meant that with no queues the device would be non-functional, which is an error.
>
> When the user space which invokes netlink commands, detects that _MQ is not supported, hence it takes max_queue_pair = 1 by itself.
I think the kernel module has all the necessary information, and it is the
only component with precise information about a device, so it
should answer precisely rather than let user space guess. The kernel module
should give a reliable answer rather than stay silent and leave the question
to the user space tool.
>
>> I think it is still uniform, it there is _MQ, we return cfg.max_queue_pair, if no
>> _MQ, return 1, still by netlink.
> Better to do that in user space because we cannot do same for other config fields.
same as above
>
>> Thanks


^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [PATCH V3 4/6] vDPA: !FEATURES_OK should not block querying device config space
  2022-07-01 22:12   ` Parav Pandit
  2022-07-08  6:22     ` Zhu, Lingshan
@ 2022-07-13  5:23     ` Michael S. Tsirkin
  2022-07-13  7:46       ` Zhu, Lingshan
       [not found]     ` <00889067-50ac-d2cd-675f-748f171e5c83@oracle.com>
  2 siblings, 1 reply; 113+ messages in thread
From: Michael S. Tsirkin @ 2022-07-13  5:23 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Zhu Lingshan, jasowang, virtualization, netdev, xieyongji, gautam.dawar

On Fri, Jul 01, 2022 at 10:12:49PM +0000, Parav Pandit wrote:
> 
> 
> > From: Zhu Lingshan <lingshan.zhu@intel.com>
> > Sent: Friday, July 1, 2022 9:28 AM
> > 
> > Users may want to query the config space of a vDPA device, to choose a
> > appropriate one for a certain guest. This means the users need to read the
> > config space before FEATURES_OK, and the existence of config space
> > contents does not depend on FEATURES_OK.
> > 
> > The spec says:
> > The device MUST allow reading of any device-specific configuration field
> > before FEATURES_OK is set by the driver. This includes fields which are
> > conditional on feature bits, as long as those feature bits are offered by the
> > device.
> > 
> > Fixes: 30ef7a8ac8a07 (vdpa: Read device configuration only if FEATURES_OK)
> Fix is fine, but fixes tag needs correction described below.
> 
> Above commit id is 13 letters should be 12.
> And 
> It should be in format
> Fixes: 30ef7a8ac8a0 ("vdpa: Read device configuration only if FEATURES_OK")

Yea you normally use

--format='Fixes: %h (\"%s\")'


> Please use checkpatch.pl script before posting the patches to catch these errors.
> There is a bot that looks at the fixes tag and identifies the right kernel version to apply this fix.


I don't think checkpatch complains about this, if for no other reason
than that sometimes the 6 byte hash is not enough.

> > Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
> > ---
> >  drivers/vdpa/vdpa.c | 8 --------
> >  1 file changed, 8 deletions(-)
> > 
> > diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c index
> > 9b0e39b2f022..d76b22b2f7ae 100644
> > --- a/drivers/vdpa/vdpa.c
> > +++ b/drivers/vdpa/vdpa.c
> > @@ -851,17 +851,9 @@ vdpa_dev_config_fill(struct vdpa_device *vdev,
> > struct sk_buff *msg, u32 portid,  {
> >  	u32 device_id;
> >  	void *hdr;
> > -	u8 status;
> >  	int err;
> > 
> >  	down_read(&vdev->cf_lock);
> > -	status = vdev->config->get_status(vdev);
> > -	if (!(status & VIRTIO_CONFIG_S_FEATURES_OK)) {
> > -		NL_SET_ERR_MSG_MOD(extack, "Features negotiation not
> > completed");
> > -		err = -EAGAIN;
> > -		goto out;
> > -	}
> > -
> >  	hdr = genlmsg_put(msg, portid, seq, &vdpa_nl_family, flags,
> >  			  VDPA_CMD_DEV_CONFIG_GET);
> >  	if (!hdr) {
> > --
> > 2.31.1


^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [PATCH V3 5/6] vDPA: answer num of queue pairs = 1 to userspace when VIRTIO_NET_F_MQ == 0
  2022-07-01 22:07   ` Parav Pandit
  2022-07-08  6:21     ` Zhu, Lingshan
@ 2022-07-13  5:26     ` Michael S. Tsirkin
  2022-07-13  7:47       ` Zhu, Lingshan
  2022-07-26 15:54       ` Parav Pandit
  1 sibling, 2 replies; 113+ messages in thread
From: Michael S. Tsirkin @ 2022-07-13  5:26 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Zhu Lingshan, jasowang, virtualization, netdev, xieyongji, gautam.dawar

On Fri, Jul 01, 2022 at 10:07:59PM +0000, Parav Pandit wrote:
> 
> 
> > From: Zhu Lingshan <lingshan.zhu@intel.com>
> > Sent: Friday, July 1, 2022 9:28 AM
> > If VIRTIO_NET_F_MQ == 0, the virtio device should have one queue pair, so
> > when userspace querying queue pair numbers, it should return mq=1 than
> > zero.
> > 
> > Function vdpa_dev_net_config_fill() fills the attributions of the vDPA
> > devices, so that it should call vdpa_dev_net_mq_config_fill() so the
> > parameter in vdpa_dev_net_mq_config_fill() should be feature_device than
> > feature_driver for the vDPA devices themselves
> > 
> > Before this change, when MQ = 0, iproute2 output:
> > $vdpa dev config show vdpa0
> > vdpa0: mac 00:e8:ca:11:be:05 link up link_announce false max_vq_pairs 0
> > mtu 1500
> >
> The fix belongs to user space.
> When a feature bit _MQ is not negotiated, vdpa kernel space will not add attribute VDPA_ATTR_DEV_NET_CFG_MAX_VQP.
> When such attribute is not returned by kernel, max_vq_pairs should not be shown by the iproute2.
> 
> We have many config space fields that depend on the feature bits and some of them do not have any defaults.
> To keep consistency of existence of config space fields among all, we don't want to show default like below.
> 
> Please fix the iproute2 to not print max_vq_pairs when it is not returned by the kernel.

Parav I read the discussion and don't get your argument. From driver's POV
_MQ with 1 VQ pair and !_MQ are exactly functionally equivalent.

It's true that iproute probably needs to be fixed too, to handle old
kernels. But iproute is not the only userspace, why not make its life
easier by fixing the kernel?
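
For reference, a standalone sketch of the kernel-side behaviour being argued for, written as plain userspace C rather than the actual vdpa.c code. The helper names le16_decode() and reported_max_vq_pairs() are made up for illustration; VIRTIO_NET_F_MQ is bit 22 as in the virtio-net spec, and the config field is treated as two raw little-endian bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VIRTIO_NET_F_MQ 22	/* feature bit number from the virtio-net spec */

/* Decode a little-endian 16-bit field regardless of host byte order. */
static uint16_t le16_decode(const uint8_t b[2])
{
	return (uint16_t)(b[0] | ((uint16_t)b[1] << 8));
}

/*
 * Value to report for max_vq_pairs: 1 when the device does not offer _MQ,
 * otherwise the little-endian max_virtqueue_pairs field from config space.
 */
static uint16_t reported_max_vq_pairs(uint64_t device_features,
				      const uint8_t raw_max_vqp[2])
{
	assert(raw_max_vqp != NULL);
	if ((device_features & (1ULL << VIRTIO_NET_F_MQ)) == 0)
		return 1;
	return le16_decode(raw_max_vqp);
}
```

Note that this keys off the device features, not the negotiated driver features, which matches the second hunk of the patch.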

-- 
MST


^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [PATCH V3 1/6] vDPA/ifcvf: get_config_size should return a value no greater than dev implementation
  2022-07-01 13:28 ` [PATCH V3 1/6] vDPA/ifcvf: get_config_size should return a value no greater than dev implementation Zhu Lingshan
  2022-07-04  4:39   ` Jason Wang
@ 2022-07-13  5:31   ` Michael S. Tsirkin
  2022-07-13  7:48     ` Zhu, Lingshan
  1 sibling, 1 reply; 113+ messages in thread
From: Michael S. Tsirkin @ 2022-07-13  5:31 UTC (permalink / raw)
  To: Zhu Lingshan
  Cc: jasowang, virtualization, netdev, parav, xieyongji, gautam.dawar

On Fri, Jul 01, 2022 at 09:28:21PM +0800, Zhu Lingshan wrote:
> ifcvf_get_config_size() should return a virtio device type specific value,
> however the ret_value should not be greater than the onboard size of
> the device implementation. E.g., for virtio_net, config_size should be
> the minimum value of sizeof(struct virtio_net_config) and the onboard
> cap size.
> 
> Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
> ---
>  drivers/vdpa/ifcvf/ifcvf_base.c | 13 +++++++++++--
>  drivers/vdpa/ifcvf/ifcvf_base.h |  2 ++
>  2 files changed, 13 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/vdpa/ifcvf/ifcvf_base.c b/drivers/vdpa/ifcvf/ifcvf_base.c
> index 48c4dadb0c7c..fb957b57941e 100644
> --- a/drivers/vdpa/ifcvf/ifcvf_base.c
> +++ b/drivers/vdpa/ifcvf/ifcvf_base.c
> @@ -128,6 +128,7 @@ int ifcvf_init_hw(struct ifcvf_hw *hw, struct pci_dev *pdev)
>  			break;
>  		case VIRTIO_PCI_CAP_DEVICE_CFG:
>  			hw->dev_cfg = get_cap_addr(hw, &cap);
> +			hw->cap_dev_config_size = le32_to_cpu(cap.length);
>  			IFCVF_DBG(pdev, "hw->dev_cfg = %p\n", hw->dev_cfg);
>  			break;
>  		}
> @@ -233,15 +234,23 @@ int ifcvf_verify_min_features(struct ifcvf_hw *hw, u64 features)
>  u32 ifcvf_get_config_size(struct ifcvf_hw *hw)
>  {
>  	struct ifcvf_adapter *adapter;
> +	u32 net_config_size = sizeof(struct virtio_net_config);
> +	u32 blk_config_size = sizeof(struct virtio_blk_config);
> +	u32 cap_size = hw->cap_dev_config_size;
>  	u32 config_size;
>  
>  	adapter = vf_to_adapter(hw);
> +	/* If the onboard device config space size is greater than
> +	 * the size of struct virtio_net/blk_config, only the spec
> +	 * implementing contents size is returned, this is very
> +	 * unlikely, defensive programming.
> +	 */
>  	switch (hw->dev_type) {
>  	case VIRTIO_ID_NET:
> -		config_size = sizeof(struct virtio_net_config);
> +		config_size = cap_size >= net_config_size ? net_config_size : cap_size;
>  		break;
>  	case VIRTIO_ID_BLOCK:
> -		config_size = sizeof(struct virtio_blk_config);
> +		config_size = cap_size >= blk_config_size ? blk_config_size : cap_size;
>  		break;
>  	default:
>  		config_size = 0;

There's a min macro for this.
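
A rough illustration of the simplification, as standalone C. In the driver this would presumably be the kernel's own min()/min_t() helper; min_u32() and clamp_config_size() below are hypothetical stand-ins so the snippet compiles outside the kernel tree.

```c
#include <stdint.h>

/* Userspace stand-in for the kernel's min() helper, for u32 only. */
static inline uint32_t min_u32(uint32_t a, uint32_t b)
{
	return a < b ? a : b;
}

/*
 * Config size to expose: never more than the spec-defined struct, and
 * never more than what the device capability actually covers.
 */
static uint32_t clamp_config_size(uint32_t cap_size, uint32_t struct_size)
{
	return min_u32(cap_size, struct_size);
}
```

This replaces both open-coded "cap_size >= x ? x : cap_size" expressions with a single call per device type.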

> diff --git a/drivers/vdpa/ifcvf/ifcvf_base.h b/drivers/vdpa/ifcvf/ifcvf_base.h
> index 115b61f4924b..f5563f665cc6 100644
> --- a/drivers/vdpa/ifcvf/ifcvf_base.h
> +++ b/drivers/vdpa/ifcvf/ifcvf_base.h
> @@ -87,6 +87,8 @@ struct ifcvf_hw {
>  	int config_irq;
>  	int vqs_reused_irq;
>  	u16 nr_vring;
> +	/* VIRTIO_PCI_CAP_DEVICE_CFG size */
> +	u32 cap_dev_config_size;
>  };
>  
>  struct ifcvf_adapter {
> -- 
> 2.31.1


^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [PATCH V3 1/6] vDPA/ifcvf: get_config_size should return a value no greater than dev implementation
  2022-07-08  6:44     ` Zhu, Lingshan
@ 2022-07-13  5:44       ` Michael S. Tsirkin
  2022-07-13  7:52         ` Zhu, Lingshan
  0 siblings, 1 reply; 113+ messages in thread
From: Michael S. Tsirkin @ 2022-07-13  5:44 UTC (permalink / raw)
  To: Zhu, Lingshan
  Cc: Jason Wang, virtualization, netdev, Parav Pandit, Yongji Xie,
	Dawar, Gautam

On Fri, Jul 08, 2022 at 02:44:08PM +0800, Zhu, Lingshan wrote:
> 
> 
> On 7/4/2022 12:39 PM, Jason Wang wrote:
> > On Fri, Jul 1, 2022 at 9:36 PM Zhu Lingshan <lingshan.zhu@intel.com> wrote:
> > > ifcvf_get_config_size() should return a virtio device type specific value,
> > > however the ret_value should not be greater than the onboard size of
> > > the device implementation. E.g., for virtio_net, config_size should be
> > > the minimum value of sizeof(struct virtio_net_config) and the onboard
> > > cap size.
> > Rethink of this, I wonder what's the value of exposing device
> > implementation details to users? Anyhow the parent is in charge of
> > "emulating" config space accessing.
> This will not be exposed to the users, it is a ifcvf internal helper,
> to get the actual device config space size.
> 
> For example, if ifcvf drives an Intel virtio-net device,
> if the device config space size is greater than sizeof(struct
> virtio_net_cfg),
> this means the device has something more than the spec, some private fields,
> we don't want to expose these extra private fields to the users, so in this
> case,
> we only return what the spec defines.

This is kind of already the case.

> If the device config space size is less than sizeof(struct virtio_net_cfg),
> means the device didn't implement all fields the spec defined, like no RSS.
> In such cases, we only return what the device implemented.
> So these are defensive programming.

I think the issue you are describing is simply this.


The driver must not access the BAR outside the capability length. The current
code does not verify this. That is not a problem for the current
devices, but it's best to be safe against the case where a
device does not implement some of the capability.


From that POV I think the patch is good, just fix the log.
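
To make the constraint concrete, a small sketch of the kind of check meant here, in standalone C. The cfg_access_ok() helper is hypothetical and not part of ifcvf; it only shows how an access of len bytes at offset can be validated against the capability length without overflowing.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Return true when [offset, offset + len) lies entirely inside a device
 * config capability of cap_len bytes. The comparison is arranged so that
 * offset + len is never computed and therefore cannot wrap around.
 */
static bool cfg_access_ok(uint32_t cap_len, uint32_t offset, uint32_t len)
{
	return offset <= cap_len && len <= cap_len - offset;
}
```

Clamping the reported config size to the capability length achieves the same effect for callers that only ever read within the reported size.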



> > 
> > If we do this, it's probably a blocker for cross vendor stuff.
> > 
> > Thanks
> > 
> > > Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
> > > ---
> > >   drivers/vdpa/ifcvf/ifcvf_base.c | 13 +++++++++++--
> > >   drivers/vdpa/ifcvf/ifcvf_base.h |  2 ++
> > >   2 files changed, 13 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/drivers/vdpa/ifcvf/ifcvf_base.c b/drivers/vdpa/ifcvf/ifcvf_base.c
> > > index 48c4dadb0c7c..fb957b57941e 100644
> > > --- a/drivers/vdpa/ifcvf/ifcvf_base.c
> > > +++ b/drivers/vdpa/ifcvf/ifcvf_base.c
> > > @@ -128,6 +128,7 @@ int ifcvf_init_hw(struct ifcvf_hw *hw, struct pci_dev *pdev)
> > >                          break;
> > >                  case VIRTIO_PCI_CAP_DEVICE_CFG:
> > >                          hw->dev_cfg = get_cap_addr(hw, &cap);
> > > +                       hw->cap_dev_config_size = le32_to_cpu(cap.length);
> > >                          IFCVF_DBG(pdev, "hw->dev_cfg = %p\n", hw->dev_cfg);
> > >                          break;
> > >                  }
> > > @@ -233,15 +234,23 @@ int ifcvf_verify_min_features(struct ifcvf_hw *hw, u64 features)
> > >   u32 ifcvf_get_config_size(struct ifcvf_hw *hw)
> > >   {
> > >          struct ifcvf_adapter *adapter;
> > > +       u32 net_config_size = sizeof(struct virtio_net_config);
> > > +       u32 blk_config_size = sizeof(struct virtio_blk_config);
> > > +       u32 cap_size = hw->cap_dev_config_size;
> > >          u32 config_size;
> > > 
> > >          adapter = vf_to_adapter(hw);
> > > +       /* If the onboard device config space size is greater than
> > > +        * the size of struct virtio_net/blk_config, only the spec
> > > +        * implementing contents size is returned, this is very
> > > +        * unlikely, defensive programming.
> > > +        */
> > >          switch (hw->dev_type) {
> > >          case VIRTIO_ID_NET:
> > > -               config_size = sizeof(struct virtio_net_config);
> > > +               config_size = cap_size >= net_config_size ? net_config_size : cap_size;
> > >                  break;
> > >          case VIRTIO_ID_BLOCK:
> > > -               config_size = sizeof(struct virtio_blk_config);
> > > +               config_size = cap_size >= blk_config_size ? blk_config_size : cap_size;
> > >                  break;
> > >          default:
> > >                  config_size = 0;
> > > diff --git a/drivers/vdpa/ifcvf/ifcvf_base.h b/drivers/vdpa/ifcvf/ifcvf_base.h
> > > index 115b61f4924b..f5563f665cc6 100644
> > > --- a/drivers/vdpa/ifcvf/ifcvf_base.h
> > > +++ b/drivers/vdpa/ifcvf/ifcvf_base.h
> > > @@ -87,6 +87,8 @@ struct ifcvf_hw {
> > >          int config_irq;
> > >          int vqs_reused_irq;
> > >          u16 nr_vring;
> > > +       /* VIRTIO_PCI_CAP_DEVICE_CFG size */
> > > +       u32 cap_dev_config_size;
> > >   };
> > > 
> > >   struct ifcvf_adapter {
> > > --
> > > 2.31.1
> > > 


^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [PATCH V3 4/6] vDPA: !FEATURES_OK should not block querying device config space
  2022-07-13  5:23     ` Michael S. Tsirkin
@ 2022-07-13  7:46       ` Zhu, Lingshan
  0 siblings, 0 replies; 113+ messages in thread
From: Zhu, Lingshan @ 2022-07-13  7:46 UTC (permalink / raw)
  To: Michael S. Tsirkin, Parav Pandit
  Cc: jasowang, virtualization, netdev, xieyongji, gautam.dawar



On 7/13/2022 1:23 PM, Michael S. Tsirkin wrote:
> On Fri, Jul 01, 2022 at 10:12:49PM +0000, Parav Pandit wrote:
>>
>>> From: Zhu Lingshan <lingshan.zhu@intel.com>
>>> Sent: Friday, July 1, 2022 9:28 AM
>>>
>>> Users may want to query the config space of a vDPA device, to choose a
>>> appropriate one for a certain guest. This means the users need to read the
>>> config space before FEATURES_OK, and the existence of config space
>>> contents does not depend on FEATURES_OK.
>>>
>>> The spec says:
>>> The device MUST allow reading of any device-specific configuration field
>>> before FEATURES_OK is set by the driver. This includes fields which are
>>> conditional on feature bits, as long as those feature bits are offered by the
>>> device.
yes
>>>
>>> Fixes: 30ef7a8ac8a07 (vdpa: Read device configuration only if FEATURES_OK)
>> Fix is fine, but fixes tag needs correction described below.
>>
>> Above commit id is 13 letters should be 12.
>> And
>> It should be in format
>> Fixes: 30ef7a8ac8a0 ("vdpa: Read device configuration only if FEATURES_OK")
> Yea you normally use
>
> --format='Fixes: %h (\"%s\")'
Thanks, but I will drop this Fixes tag, since Parav suggested dropping the
Fixes tag of the 3/6 patch, which reports device feature bits to userspace
(this fix is composed of several patches).
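
For reference, the tag layout discussed above (a 12-character abbreviated
hash followed by the quoted subject in parentheses) can be sketched in C as
below. The helper name format_fixes_tag is hypothetical and exists only for
illustration; in practice git generates the tag via its --format option.

```c
#include <stdio.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of git or the kernel tree): build a
 * Fixes: tag from a commit hash and subject, keeping only the first
 * 12 characters of the hash as the kernel convention expects.
 * Returns the snprintf() result: the length the full tag needs,
 * or a negative value on error.
 */
static int format_fixes_tag(char *buf, size_t len,
			    const char *hash, const char *subject)
{
	return snprintf(buf, len, "Fixes: %.12s (\"%s\")", hash, subject);
}
```

Passing the 13-character id from the commit message would still yield the
12-character form, since the precision in %.12s truncates the hash.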
>
>
>> Please use checkpatch.pl script before posting the patches to catch these errors.
>> There is a bot that looks at the fixes tag and identifies the right kernel version to apply this fix.
>
> I don't think checkpatch complains about this if for no other reason
> that sometimes the 6 byte hash is not enough.
>
>>> Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
>>> ---
>>>   drivers/vdpa/vdpa.c | 8 --------
>>>   1 file changed, 8 deletions(-)
>>>
>>> diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c
>>> index 9b0e39b2f022..d76b22b2f7ae 100644
>>> --- a/drivers/vdpa/vdpa.c
>>> +++ b/drivers/vdpa/vdpa.c
>>> @@ -851,17 +851,9 @@ vdpa_dev_config_fill(struct vdpa_device *vdev, struct sk_buff *msg, u32 portid,
>>>  {
>>>   	u32 device_id;
>>>   	void *hdr;
>>> -	u8 status;
>>>   	int err;
>>>
>>>   	down_read(&vdev->cf_lock);
>>> -	status = vdev->config->get_status(vdev);
>>> -	if (!(status & VIRTIO_CONFIG_S_FEATURES_OK)) {
>>> -		NL_SET_ERR_MSG_MOD(extack, "Features negotiation not completed");
>>> -		err = -EAGAIN;
>>> -		goto out;
>>> -	}
>>> -
>>>   	hdr = genlmsg_put(msg, portid, seq, &vdpa_nl_family, flags,
>>>   			  VDPA_CMD_DEV_CONFIG_GET);
>>>   	if (!hdr) {
>>> --
>>> 2.31.1


^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [PATCH V3 5/6] vDPA: answer num of queue pairs = 1 to userspace when VIRTIO_NET_F_MQ == 0
  2022-07-13  5:26     ` Michael S. Tsirkin
@ 2022-07-13  7:47       ` Zhu, Lingshan
  2022-07-26 15:54       ` Parav Pandit
  1 sibling, 0 replies; 113+ messages in thread
From: Zhu, Lingshan @ 2022-07-13  7:47 UTC (permalink / raw)
  To: Michael S. Tsirkin, Parav Pandit
  Cc: jasowang, virtualization, netdev, xieyongji, gautam.dawar



On 7/13/2022 1:26 PM, Michael S. Tsirkin wrote:
> On Fri, Jul 01, 2022 at 10:07:59PM +0000, Parav Pandit wrote:
>>
>>> From: Zhu Lingshan <lingshan.zhu@intel.com>
>>> Sent: Friday, July 1, 2022 9:28 AM
>>> If VIRTIO_NET_F_MQ == 0, the virtio device should have one queue pair, so
>>> when userspace querying queue pair numbers, it should return mq=1 than
>>> zero.
>>>
>>> Function vdpa_dev_net_config_fill() fills the attributions of the vDPA
>>> devices, so that it should call vdpa_dev_net_mq_config_fill() so the
>>> parameter in vdpa_dev_net_mq_config_fill() should be feature_device than
>>> feature_driver for the vDPA devices themselves
>>>
>>> Before this change, when MQ = 0, iproute2 output:
>>> $vdpa dev config show vdpa0
>>> vdpa0: mac 00:e8:ca:11:be:05 link up link_announce false max_vq_pairs 0
>>> mtu 1500
>>>
>> The fix belongs to user space.
>> When a feature bit _MQ is not negotiated, vdpa kernel space will not add attribute VDPA_ATTR_DEV_NET_CFG_MAX_VQP.
>> When such attribute is not returned by kernel, max_vq_pairs should not be shown by the iproute2.
>>
>> We have many config space fields that depend on the feature bits and some of them do not have any defaults.
>> To keep consistency of existence of config space fields among all, we don't want to show default like below.
>>
>> Please fix the iproute2 to not print max_vq_pairs when it is not returned by the kernel.
> Parav I read the discussion and don't get your argument. From driver's POV
> _MQ with 1 VQ pair and !_MQ are exactly functionally equivalent.
>
> It's true that iproute probably needs to be fixed too, to handle old
> kernels. But iproute is not the only userspace, why not make it's life
> easier by fixing the kernel?
I will fix iproute2 once this series settles down

Thanks,
Zhu Lingshan


^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [PATCH V3 1/6] vDPA/ifcvf: get_config_size should return a value no greater than dev implementation
  2022-07-13  5:31   ` Michael S. Tsirkin
@ 2022-07-13  7:48     ` Zhu, Lingshan
  0 siblings, 0 replies; 113+ messages in thread
From: Zhu, Lingshan @ 2022-07-13  7:48 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: jasowang, virtualization, netdev, parav, xieyongji, gautam.dawar



On 7/13/2022 1:31 PM, Michael S. Tsirkin wrote:
> On Fri, Jul 01, 2022 at 09:28:21PM +0800, Zhu Lingshan wrote:
>> ifcvf_get_config_size() should return a virtio device type specific value,
>> however the ret_value should not be greater than the onboard size of
>> the device implementation. E.g., for virtio_net, config_size should be
>> the minimum value of sizeof(struct virtio_net_config) and the onboard
>> cap size.
>>
>> Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
>> ---
>>   drivers/vdpa/ifcvf/ifcvf_base.c | 13 +++++++++++--
>>   drivers/vdpa/ifcvf/ifcvf_base.h |  2 ++
>>   2 files changed, 13 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/vdpa/ifcvf/ifcvf_base.c b/drivers/vdpa/ifcvf/ifcvf_base.c
>> index 48c4dadb0c7c..fb957b57941e 100644
>> --- a/drivers/vdpa/ifcvf/ifcvf_base.c
>> +++ b/drivers/vdpa/ifcvf/ifcvf_base.c
>> @@ -128,6 +128,7 @@ int ifcvf_init_hw(struct ifcvf_hw *hw, struct pci_dev *pdev)
>>   			break;
>>   		case VIRTIO_PCI_CAP_DEVICE_CFG:
>>   			hw->dev_cfg = get_cap_addr(hw, &cap);
>> +			hw->cap_dev_config_size = le32_to_cpu(cap.length);
>>   			IFCVF_DBG(pdev, "hw->dev_cfg = %p\n", hw->dev_cfg);
>>   			break;
>>   		}
>> @@ -233,15 +234,23 @@ int ifcvf_verify_min_features(struct ifcvf_hw *hw, u64 features)
>>   u32 ifcvf_get_config_size(struct ifcvf_hw *hw)
>>   {
>>   	struct ifcvf_adapter *adapter;
>> +	u32 net_config_size = sizeof(struct virtio_net_config);
>> +	u32 blk_config_size = sizeof(struct virtio_blk_config);
>> +	u32 cap_size = hw->cap_dev_config_size;
>>   	u32 config_size;
>>   
>>   	adapter = vf_to_adapter(hw);
>> +	/* If the onboard device config space size is greater than
>> +	 * the size of struct virtio_net/blk_config, only the spec
>> +	 * implementing contents size is returned, this is very
>> +	 * unlikely, defensive programming.
>> +	 */
>>   	switch (hw->dev_type) {
>>   	case VIRTIO_ID_NET:
>> -		config_size = sizeof(struct virtio_net_config);
>> +		config_size = cap_size >= net_config_size ? net_config_size : cap_size;
>>   		break;
>>   	case VIRTIO_ID_BLOCK:
>> -		config_size = sizeof(struct virtio_blk_config);
>> +		config_size = cap_size >= blk_config_size ? blk_config_size : cap_size;
>>   		break;
>>   	default:
>>   		config_size = 0;
> There's a min macro for this.
yes, a min macro is better.
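
A minimal userspace sketch of that simplification, assuming the driver would
use the kernel's min_t() in place of the open-coded ternary. The names
min_u32 and clamp_config_size are hypothetical stand-ins for illustration,
not the code of the next revision:

```c
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's min_t(u32, a, b). */
static inline uint32_t min_u32(uint32_t a, uint32_t b)
{
	return a < b ? a : b;
}

/*
 * Hypothetical helper: the reported config size is the smaller of the
 * spec-defined struct size and the onboard capability length.
 */
static uint32_t clamp_config_size(uint32_t cap_size, size_t spec_size)
{
	return min_u32(cap_size, (uint32_t)spec_size);
}
```

In the driver itself this would presumably collapse to a single line per
case, along the lines of
config_size = min_t(u32, cap_size, sizeof(struct virtio_net_config)).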

Thanks,
Zhu Lingshan
>
>> diff --git a/drivers/vdpa/ifcvf/ifcvf_base.h b/drivers/vdpa/ifcvf/ifcvf_base.h
>> index 115b61f4924b..f5563f665cc6 100644
>> --- a/drivers/vdpa/ifcvf/ifcvf_base.h
>> +++ b/drivers/vdpa/ifcvf/ifcvf_base.h
>> @@ -87,6 +87,8 @@ struct ifcvf_hw {
>>   	int config_irq;
>>   	int vqs_reused_irq;
>>   	u16 nr_vring;
>> +	/* VIRTIO_PCI_CAP_DEVICE_CFG size */
>> +	u32 cap_dev_config_size;
>>   };
>>   
>>   struct ifcvf_adapter {
>> -- 
>> 2.31.1


^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [PATCH V3 1/6] vDPA/ifcvf: get_config_size should return a value no greater than dev implementation
  2022-07-13  5:44       ` Michael S. Tsirkin
@ 2022-07-13  7:52         ` Zhu, Lingshan
  0 siblings, 0 replies; 113+ messages in thread
From: Zhu, Lingshan @ 2022-07-13  7:52 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Jason Wang, virtualization, netdev, Parav Pandit, Yongji Xie,
	Dawar, Gautam



On 7/13/2022 1:44 PM, Michael S. Tsirkin wrote:
> On Fri, Jul 08, 2022 at 02:44:08PM +0800, Zhu, Lingshan wrote:
>>
>> On 7/4/2022 12:39 PM, Jason Wang wrote:
>>> On Fri, Jul 1, 2022 at 9:36 PM Zhu Lingshan <lingshan.zhu@intel.com> wrote:
>>>> ifcvf_get_config_size() should return a virtio device type specific value,
>>>> however the ret_value should not be greater than the onboard size of
>>>> the device implementation. E.g., for virtio_net, config_size should be
>>>> the minimum value of sizeof(struct virtio_net_config) and the onboard
>>>> cap size.
>>> Rethink of this, I wonder what's the value of exposing device
>>> implementation details to users? Anyhow the parent is in charge of
>>> "emulating" config space accessing.
>> This will not be exposed to the users, it is a ifcvf internal helper,
>> to get the actual device config space size.
>>
>> For example, if ifcvf drives an Intel virtio-net device,
>> if the device config space size is greater than sizeof(struct
>> virtio_net_cfg),
>> this means the device has something more than the spec, some private fields,
>> we don't want to expose these extra private fields to the users, so in this
>> case,
>> we only return what the spec defines.
> This is kind of already the case.
>
>> If the device config space size is less than sizeof(struct virtio_net_cfg),
>> means the device didn't implement all fields the spec defined, like no RSS.
>> In such cases, we only return what the device implemented.
>> So these are defensive programming.
> I think the issue you are describing is simply this.
>
>
> Driver must not access BAR outside capability length. Current code
> does not verify that it does not. Not the case for the current
> devices but it's best to be safe against the case where
> device does not implement some of the capability.
>
>
>  From that POV I think the patch is good, just fix the log.
Sure, I will do that.
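
To make the point about not touching the BAR outside the capability length
concrete, here is a small sketch of the kind of bounds check involved. The
function name config_access_in_cap is an assumption for illustration and
does not exist in ifcvf:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical check (illustration only): true when a config space
 * access of 'len' bytes at 'offset' stays inside a device-config
 * capability of 'cap_len' bytes. Comparing against cap_len - offset
 * avoids overflow in offset + len.
 */
static bool config_access_in_cap(uint32_t offset, uint32_t len,
				 uint32_t cap_len)
{
	return offset <= cap_len && len <= cap_len - offset;
}
```

Clamping the size returned by get_config_size to the capability length gives
callers the same guarantee without a per-access check.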

Thanks,
Zhu Lingshan
>
>
>
>>> If we do this, it's probably a blocker for cross vendor stuff.
>>>
>>> Thanks
>>>
>>>> Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
>>>> ---
>>>>    drivers/vdpa/ifcvf/ifcvf_base.c | 13 +++++++++++--
>>>>    drivers/vdpa/ifcvf/ifcvf_base.h |  2 ++
>>>>    2 files changed, 13 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/drivers/vdpa/ifcvf/ifcvf_base.c b/drivers/vdpa/ifcvf/ifcvf_base.c
>>>> index 48c4dadb0c7c..fb957b57941e 100644
>>>> --- a/drivers/vdpa/ifcvf/ifcvf_base.c
>>>> +++ b/drivers/vdpa/ifcvf/ifcvf_base.c
>>>> @@ -128,6 +128,7 @@ int ifcvf_init_hw(struct ifcvf_hw *hw, struct pci_dev *pdev)
>>>>                           break;
>>>>                   case VIRTIO_PCI_CAP_DEVICE_CFG:
>>>>                           hw->dev_cfg = get_cap_addr(hw, &cap);
>>>> +                       hw->cap_dev_config_size = le32_to_cpu(cap.length);
>>>>                           IFCVF_DBG(pdev, "hw->dev_cfg = %p\n", hw->dev_cfg);
>>>>                           break;
>>>>                   }
>>>> @@ -233,15 +234,23 @@ int ifcvf_verify_min_features(struct ifcvf_hw *hw, u64 features)
>>>>    u32 ifcvf_get_config_size(struct ifcvf_hw *hw)
>>>>    {
>>>>           struct ifcvf_adapter *adapter;
>>>> +       u32 net_config_size = sizeof(struct virtio_net_config);
>>>> +       u32 blk_config_size = sizeof(struct virtio_blk_config);
>>>> +       u32 cap_size = hw->cap_dev_config_size;
>>>>           u32 config_size;
>>>>
>>>>           adapter = vf_to_adapter(hw);
>>>> +       /* If the onboard device config space size is greater than
>>>> +        * the size of struct virtio_net/blk_config, only the spec
>>>> +        * implementing contents size is returned, this is very
>>>> +        * unlikely, defensive programming.
>>>> +        */
>>>>           switch (hw->dev_type) {
>>>>           case VIRTIO_ID_NET:
>>>> -               config_size = sizeof(struct virtio_net_config);
>>>> +               config_size = cap_size >= net_config_size ? net_config_size : cap_size;
>>>>                   break;
>>>>           case VIRTIO_ID_BLOCK:
>>>> -               config_size = sizeof(struct virtio_blk_config);
>>>> +               config_size = cap_size >= blk_config_size ? blk_config_size : cap_size;
>>>>                   break;
>>>>           default:
>>>>                   config_size = 0;
>>>> diff --git a/drivers/vdpa/ifcvf/ifcvf_base.h b/drivers/vdpa/ifcvf/ifcvf_base.h
>>>> index 115b61f4924b..f5563f665cc6 100644
>>>> --- a/drivers/vdpa/ifcvf/ifcvf_base.h
>>>> +++ b/drivers/vdpa/ifcvf/ifcvf_base.h
>>>> @@ -87,6 +87,8 @@ struct ifcvf_hw {
>>>>           int config_irq;
>>>>           int vqs_reused_irq;
>>>>           u16 nr_vring;
>>>> +       /* VIRTIO_PCI_CAP_DEVICE_CFG size */
>>>> +       u32 cap_dev_config_size;
>>>>    };
>>>>
>>>>    struct ifcvf_adapter {
>>>> --
>>>> 2.31.1
>>>>


^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [PATCH V3 3/6] vDPA: allow userspace to query features of a vDPA device
  2022-07-06  2:25                 ` Zhu, Lingshan
  2022-07-06  2:28                   ` Parav Pandit
@ 2022-07-23 11:27                   ` Zhu, Lingshan
  2022-07-24 15:23                     ` Parav Pandit
  1 sibling, 1 reply; 113+ messages in thread
From: Zhu, Lingshan @ 2022-07-23 11:27 UTC (permalink / raw)
  To: Parav Pandit, Jason Wang, mst
  Cc: virtualization, netdev, xieyongji, gautam.dawar



On 7/6/2022 10:25 AM, Zhu, Lingshan wrote:
>
>
> On 7/6/2022 1:01 AM, Parav Pandit wrote:
>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
>>> Sent: Tuesday, July 5, 2022 12:56 PM
>>>> Both can be queried simultaneously. Each will return their own 
>>>> feature bits
>>> using same attribute.
>>>> It wont lead to the race.
>>> How? It is just a piece of memory, xxxx[attr], do you see locks in
>>> nla_put_u64_64bit()? It is a typical race condition, data accessed 
>>> by multiple
>>> producers / consumers.
>> No. There is no race condition in here.
>> And new attribute enum by no means avoid any race.
>>
>> Data put using nla_put cannot be accessed until they are transferred.
> How this is guaranteed? Do you see errors when calling nla_put_xxx() 
> twice?
Parav, did you miss this?
>>
>>> And re-use a netlink attr is really confusing.
>> Please put comment for this variable explaining why it is shared for 
>> the exception.
>>
>> Before that lets start, can you share a real world example of when 
>> this feature bitmap will have different value than the mgmt. device 
>> bitmap value?
> For example,
> 1. When migrate the VM to a node which has a more resourceful device. 
> If the source side device does not have MQ, RSS or TSO feature, the 
> vDPA device assigned to the VM does not
> have MQ, RSS or TSO as well. When migrating to a node which has a 
> device with MQ, RSS or TSO, to provide a consistent network device to 
> the guest, to be transparent to the guest,
> we need to mask out MQ, RSS or TSO in the vDPA device when 
> provisioning. This is an example that management device may have 
> different feature bits than the vDPA device.
>
> 2.SIOV, if a virtio device is capable of managing SIOV devices, and it 
> exposes this capability by a feature bit(Like what I am doing in the 
> "transport virtqueue"),
> we don't want the SIOV ADIs have SIOV features, so the ADIs don't have 
> SIOV feature bit.
>
> Thanks
>>
>>>>> IMHO, I don't see any advantages of re-using this attr.
>>>> We don’t want to continue this mess of VDPA_DEV prefix for new
>>> attributes due to previous wrong naming.
>>> as you point out before, is is a wrong naming, we can't re-nmme it 
>>> because
>>> we don't want to break uAPI, so there needs a new attr, if you don't 
>>> like the
>>> name VDPA_ATTR_VDPA_DEV_SUPPORTED_FEATURES, it is more than
>>> welcome to suggest a new one
>>>
>>> Thanks
>


^ permalink raw reply	[flat|nested] 113+ messages in thread

* RE: [PATCH V3 3/6] vDPA: allow userspace to query features of a vDPA device
  2022-07-23 11:27                   ` Zhu, Lingshan
@ 2022-07-24 15:23                     ` Parav Pandit
  0 siblings, 0 replies; 113+ messages in thread
From: Parav Pandit @ 2022-07-24 15:23 UTC (permalink / raw)
  To: Zhu, Lingshan, Jason Wang, mst
  Cc: virtualization, netdev, xieyongji, gautam.dawar


> From: Zhu, Lingshan <lingshan.zhu@intel.com>
> Sent: Saturday, July 23, 2022 7:27 AM
> 
> On 7/6/2022 10:25 AM, Zhu, Lingshan wrote:
> >
> >
> > On 7/6/2022 1:01 AM, Parav Pandit wrote:
> >>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
> >>> Sent: Tuesday, July 5, 2022 12:56 PM
> >>>> Both can be queried simultaneously. Each will return their own
> >>>> feature bits
> >>> using same attribute.
> >>>> It wont lead to the race.
> >>> How? It is just a piece of memory, xxxx[attr], do you see locks in
> >>> nla_put_u64_64bit()? It is a typical race condition, data accessed
> >>> by multiple producers / consumers.
> >> No. There is no race condition in here.
> >> And new attribute enum by no means avoid any race.
> >>
> >> Data put using nla_put cannot be accessed until they are transferred.
> > How this is guaranteed? Do you see errors when calling nla_put_xxx()
> > twice?
> Parav, did you miss this?
It is not called twice, and reading an attribute and packing it into an nla message is not a race condition.

^ permalink raw reply	[flat|nested] 113+ messages in thread

* RE: [PATCH V3 5/6] vDPA: answer num of queue pairs = 1 to userspace when VIRTIO_NET_F_MQ == 0
  2022-07-13  5:26     ` Michael S. Tsirkin
  2022-07-13  7:47       ` Zhu, Lingshan
@ 2022-07-26 15:54       ` Parav Pandit
  2022-07-26 19:48         ` Michael S. Tsirkin
  1 sibling, 1 reply; 113+ messages in thread
From: Parav Pandit @ 2022-07-26 15:54 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Zhu Lingshan, jasowang, virtualization, netdev, xieyongji, gautam.dawar


> From: Michael S. Tsirkin <mst@redhat.com>
> Sent: Wednesday, July 13, 2022 1:27 AM
> 
> On Fri, Jul 01, 2022 at 10:07:59PM +0000, Parav Pandit wrote:
> >
> >
> > > From: Zhu Lingshan <lingshan.zhu@intel.com>
> > > Sent: Friday, July 1, 2022 9:28 AM
> > > If VIRTIO_NET_F_MQ == 0, the virtio device should have one queue
> > > pair, so when userspace querying queue pair numbers, it should
> > > return mq=1 than zero.
> > >
> > > Function vdpa_dev_net_config_fill() fills the attributions of the
> > > vDPA devices, so that it should call vdpa_dev_net_mq_config_fill()
> > > so the parameter in vdpa_dev_net_mq_config_fill() should be
> > > feature_device than feature_driver for the vDPA devices themselves
> > >
> > > Before this change, when MQ = 0, iproute2 output:
> > > $vdpa dev config show vdpa0
> > > vdpa0: mac 00:e8:ca:11:be:05 link up link_announce false
> > > max_vq_pairs 0 mtu 1500
> > >
> > The fix belongs to user space.
> > When a feature bit _MQ is not negotiated, vdpa kernel space will not add
> attribute VDPA_ATTR_DEV_NET_CFG_MAX_VQP.
> > When such attribute is not returned by kernel, max_vq_pairs should not be
> shown by the iproute2.
> >
> > We have many config space fields that depend on the feature bits and
> some of them do not have any defaults.
> > To keep consistency of existence of config space fields among all, we don't
> want to show default like below.
> >
> > Please fix the iproute2 to not print max_vq_pairs when it is not returned by
> the kernel.
> 
> Parav I read the discussion and don't get your argument. From driver's POV
> _MQ with 1 VQ pair and !_MQ are exactly functionally equivalent.
But we are talking from user POV here.

> 
> It's true that iproute probably needs to be fixed too, to handle old kernels.
> But iproute is not the only userspace, why not make it's life easier by fixing
> the kernel?
Because it cannot be fixed for other config space fields that are controlled by feature bits and do not have any defaults.
So it is better to treat all of them the same way from the user POV.

^ permalink raw reply	[flat|nested] 113+ messages in thread

* RE: [PATCH V3 5/6] vDPA: answer num of queue pairs = 1 to userspace when VIRTIO_NET_F_MQ == 0
  2022-07-13  3:45                 ` Zhu, Lingshan
@ 2022-07-26 15:56                   ` Parav Pandit
  2022-07-26 19:52                     ` Michael S. Tsirkin
  2022-07-27  2:14                     ` Zhu, Lingshan
  0 siblings, 2 replies; 113+ messages in thread
From: Parav Pandit @ 2022-07-26 15:56 UTC (permalink / raw)
  To: Zhu, Lingshan, jasowang, mst
  Cc: virtualization, netdev, xieyongji, gautam.dawar


> From: Zhu, Lingshan <lingshan.zhu@intel.com>
> Sent: Tuesday, July 12, 2022 11:46 PM
> > When the user space which invokes netlink commands, detects that _MQ
> is not supported, hence it takes max_queue_pair = 1 by itself.
> I think the kernel module have all necessary information and it is the only
> one which have precise information of a device, so it should answer precisely
> than let the user space guess. The kernel module should be reliable than stay
> silent, leave the question to the user space tool.
Kernel is reliable. It doesn’t expose a config space field if the field doesn’t exist, regardless of whether the field has a default.
User space should not guess either. User space can see whether _MQ is present. If _MQ is present, then it gets reliable data from the kernel.
If _MQ is not present, it means this device has one VQ pair.
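
A rough sketch of that rule as a userspace tool might apply it. The helper
effective_vq_pairs and its parameters are hypothetical; only the feature bit
number and the attribute name come from the existing headers:

```c
#include <stdbool.h>
#include <stdint.h>

#define VIRTIO_NET_F_MQ 22 /* feature bit number from the virtio spec */

/*
 * Hypothetical userspace helper: pick the number of queue pairs to
 * report. 'has_max_vqp' says whether the kernel returned the
 * VDPA_ATTR_DEV_NET_CFG_MAX_VQP attribute, 'max_vqp' is its value.
 */
static uint16_t effective_vq_pairs(uint64_t features, bool has_max_vqp,
				   uint16_t max_vqp)
{
	if ((features & (1ULL << VIRTIO_NET_F_MQ)) && has_max_vqp)
		return max_vqp;

	/* Without _MQ the device has exactly one queue pair. */
	return 1;
}
```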

^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [PATCH V3 5/6] vDPA: answer num of queue pairs = 1 to userspace when VIRTIO_NET_F_MQ == 0
  2022-07-26 15:54       ` Parav Pandit
@ 2022-07-26 19:48         ` Michael S. Tsirkin
  2022-07-26 20:53           ` Parav Pandit
  0 siblings, 1 reply; 113+ messages in thread
From: Michael S. Tsirkin @ 2022-07-26 19:48 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Zhu Lingshan, jasowang, virtualization, netdev, xieyongji, gautam.dawar

On Tue, Jul 26, 2022 at 03:54:06PM +0000, Parav Pandit wrote:
> 
> > From: Michael S. Tsirkin <mst@redhat.com>
> > Sent: Wednesday, July 13, 2022 1:27 AM
> > 
> > On Fri, Jul 01, 2022 at 10:07:59PM +0000, Parav Pandit wrote:
> > >
> > >
> > > > From: Zhu Lingshan <lingshan.zhu@intel.com>
> > > > Sent: Friday, July 1, 2022 9:28 AM
> > > > If VIRTIO_NET_F_MQ == 0, the virtio device should have one queue
> > > > pair, so when userspace querying queue pair numbers, it should
> > > > return mq=1 than zero.
> > > >
> > > > Function vdpa_dev_net_config_fill() fills the attributions of the
> > > > vDPA devices, so that it should call vdpa_dev_net_mq_config_fill()
> > > > so the parameter in vdpa_dev_net_mq_config_fill() should be
> > > > feature_device than feature_driver for the vDPA devices themselves
> > > >
> > > > Before this change, when MQ = 0, iproute2 output:
> > > > $vdpa dev config show vdpa0
> > > > vdpa0: mac 00:e8:ca:11:be:05 link up link_announce false
> > > > max_vq_pairs 0 mtu 1500
> > > >
> > > The fix belongs to user space.
> > > When a feature bit _MQ is not negotiated, vdpa kernel space will not add
> > attribute VDPA_ATTR_DEV_NET_CFG_MAX_VQP.
> > > When such attribute is not returned by kernel, max_vq_pairs should not be
> > shown by the iproute2.
> > >
> > > We have many config space fields that depend on the feature bits and
> > some of them do not have any defaults.
> > > To keep consistency of existence of config space fields among all, we don't
> > want to show default like below.
> > >
> > > Please fix the iproute2 to not print max_vq_pairs when it is not returned by
> > the kernel.
> > 
> > Parav I read the discussion and don't get your argument. From driver's POV
> > _MQ with 1 VQ pair and !_MQ are exactly functionally equivalent.
> But we are talking from user POV here.

From spec POV there's just driver and device, user would be part of
driver here.

> > 
> > It's true that iproute probably needs to be fixed too, to handle old kernels.
> > But iproute is not the only userspace, why not make it's life easier by fixing
> > the kernel?
> Because it cannot be fixed for other config space fields which are control by feature bits those do not have any defaults.
> So better to treat all in same way from user POV.

Consistency is good for sure. What are these other fields though?
Can you give examples so I understand please?

-- 
MST


^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [PATCH V3 5/6] vDPA: answer num of queue pairs = 1 to userspace when VIRTIO_NET_F_MQ == 0
  2022-07-26 15:56                   ` Parav Pandit
@ 2022-07-26 19:52                     ` Michael S. Tsirkin
  2022-07-26 20:49                       ` Parav Pandit
  2022-07-27  2:14                     ` Zhu, Lingshan
  1 sibling, 1 reply; 113+ messages in thread
From: Michael S. Tsirkin @ 2022-07-26 19:52 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Zhu, Lingshan, jasowang, virtualization, netdev, xieyongji, gautam.dawar

On Tue, Jul 26, 2022 at 03:56:32PM +0000, Parav Pandit wrote:
> 
> > From: Zhu, Lingshan <lingshan.zhu@intel.com>
> > Sent: Tuesday, July 12, 2022 11:46 PM
> > > When the user space which invokes netlink commands, detects that _MQ
> > is not supported, hence it takes max_queue_pair = 1 by itself.
> > I think the kernel module have all necessary information and it is the only
> > one which have precise information of a device, so it should answer precisely
> > than let the user space guess. The kernel module should be reliable than stay
> > silent, leave the question to the user space tool.
> Kernel is reliable. It doesn’t expose a config space field if the field doesn’t exist regardless of field should have default or no default.
> User space should not guess either. User space gets to see if _MQ present/not present. If _MQ present than get reliable data from kernel.
> If _MQ not present, it means this device has one VQ pair.

Yes that's fine. And if we just didn't return anything without MQ that
would be fine.  But IIUC netlink reports the # of pairs regardless, it
just puts 0 there.

-- 
MST


^ permalink raw reply	[flat|nested] 113+ messages in thread

* RE: [PATCH V3 5/6] vDPA: answer num of queue pairs = 1 to userspace when VIRTIO_NET_F_MQ == 0
  2022-07-26 19:52                     ` Michael S. Tsirkin
@ 2022-07-26 20:49                       ` Parav Pandit
  0 siblings, 0 replies; 113+ messages in thread
From: Parav Pandit @ 2022-07-26 20:49 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Zhu, Lingshan, jasowang, virtualization, netdev, xieyongji, gautam.dawar


> From: Michael S. Tsirkin <mst@redhat.com>
> Sent: Tuesday, July 26, 2022 3:52 PM
> 
> On Tue, Jul 26, 2022 at 03:56:32PM +0000, Parav Pandit wrote:
> >
> > > From: Zhu, Lingshan <lingshan.zhu@intel.com>
> > > Sent: Tuesday, July 12, 2022 11:46 PM
> > > > When the user space which invokes netlink commands, detects that
> > > > _MQ
> > > is not supported, hence it takes max_queue_pair = 1 by itself.
> > > I think the kernel module have all necessary information and it is
> > > the only one which have precise information of a device, so it
> > > should answer precisely than let the user space guess. The kernel
> > > module should be reliable than stay silent, leave the question to the user
> space tool.
> > Kernel is reliable. It doesn’t expose a config space field if the field doesn’t exist
> regardless of field should have default or no default.
> > User space should not guess either. User space gets to see if _MQ present/not
> present. If _MQ present than get reliable data from kernel.
> > If _MQ not present, it means this device has one VQ pair.
> 
> Yes that's fine. And if we just didn't return anything without MQ that would be
> fine.  But IIUC netlink reports the # of pairs regardless, it just puts 0 there.
I read it differently at [1] which checks for the MQ feature bit.

[1] https://elixir.bootlin.com/linux/latest/source/drivers/vdpa/vdpa.c#L825

> 
> --
> MST


^ permalink raw reply	[flat|nested] 113+ messages in thread

* RE: [PATCH V3 5/6] vDPA: answer num of queue pairs = 1 to userspace when VIRTIO_NET_F_MQ == 0
  2022-07-26 19:48         ` Michael S. Tsirkin
@ 2022-07-26 20:53           ` Parav Pandit
  2022-07-27  1:56             ` Zhu, Lingshan
  2022-07-27  2:11             ` Zhu, Lingshan
  0 siblings, 2 replies; 113+ messages in thread
From: Parav Pandit @ 2022-07-26 20:53 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Zhu Lingshan, jasowang, virtualization, netdev, xieyongji, gautam.dawar


> From: Michael S. Tsirkin <mst@redhat.com>
> Sent: Tuesday, July 26, 2022 3:49 PM
> 
> On Tue, Jul 26, 2022 at 03:54:06PM +0000, Parav Pandit wrote:
> >
> > > From: Michael S. Tsirkin <mst@redhat.com>
> > > Sent: Wednesday, July 13, 2022 1:27 AM
> > >
> > > On Fri, Jul 01, 2022 at 10:07:59PM +0000, Parav Pandit wrote:
> > > >
> > > >
> > > > > From: Zhu Lingshan <lingshan.zhu@intel.com>
> > > > > Sent: Friday, July 1, 2022 9:28 AM If VIRTIO_NET_F_MQ == 0, the
> > > > > virtio device should have one queue pair, so when userspace
> > > > > querying queue pair numbers, it should return mq=1 than zero.
> > > > >
> > > > > Function vdpa_dev_net_config_fill() fills the attributions of
> > > > > the vDPA devices, so that it should call
> > > > > vdpa_dev_net_mq_config_fill() so the parameter in
> > > > > vdpa_dev_net_mq_config_fill() should be feature_device than
> > > > > feature_driver for the vDPA devices themselves
> > > > >
> > > > > Before this change, when MQ = 0, iproute2 output:
> > > > > $vdpa dev config show vdpa0
> > > > > vdpa0: mac 00:e8:ca:11:be:05 link up link_announce false
> > > > > max_vq_pairs 0 mtu 1500
> > > > >
> > > > The fix belongs to user space.
> > > > When a feature bit _MQ is not negotiated, vdpa kernel space will
> > > > not add
> > > attribute VDPA_ATTR_DEV_NET_CFG_MAX_VQP.
> > > > When such attribute is not returned by kernel, max_vq_pairs should
> > > > not be
> > > shown by the iproute2.
> > > >
> > > > We have many config space fields that depend on the feature bits
> > > > and
> > > some of them do not have any defaults.
> > > > To keep consistency of existence of config space fields among all,
> > > > we don't
> > > want to show default like below.
> > > >
> > > > Please fix the iproute2 to not print max_vq_pairs when it is not
> > > > returned by
> > > the kernel.
> > >
> > > Parav I read the discussion and don't get your argument. From
> > > driver's POV _MQ with 1 VQ pair and !_MQ are exactly functionally
> equivalent.
> > But we are talking from user POV here.
> 
> From spec POV there's just driver and device, user would be part of driver here.
User space application still need to inspect the _MQ bit to
> 
> > >
> > > It's true that iproute probably needs to be fixed too, to handle old kernels.
> > > But iproute is not the only userspace, why not make it's life easier
> > > by fixing the kernel?
> > Because it cannot be fixed for other config space fields which are control by
> feature bits those do not have any defaults.
> > So better to treat all in same way from user POV.
> 
> Consistency is good for sure. What are these other fields though?

> Can you give examples so I understand please?

speed exists only if VIRTIO_NET_F_SPEED_DUPLEX exists.
rss_max_key_size exists only if VIRTIO_NET_F_RSS exists.
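
As a sketch of what "exists only if" means for a consumer of the config
space, assuming the feature bit numbers from the virtio-net header. The enum
and the function net_cfg_field_exists are hypothetical names used only for
this example:

```c
#include <stdbool.h>
#include <stdint.h>

#define VIRTIO_NET_F_RSS          60
#define VIRTIO_NET_F_SPEED_DUPLEX 63

/* Hypothetical identifiers for the two fields mentioned above. */
enum net_cfg_field {
	NET_CFG_SPEED,
	NET_CFG_RSS_MAX_KEY_SIZE,
};

/* True when the feature bit that gates 'field' is set in 'features'. */
static bool net_cfg_field_exists(uint64_t features, enum net_cfg_field field)
{
	switch (field) {
	case NET_CFG_SPEED:
		return (features & (1ULL << VIRTIO_NET_F_SPEED_DUPLEX)) != 0;
	case NET_CFG_RSS_MAX_KEY_SIZE:
		return (features & (1ULL << VIRTIO_NET_F_RSS)) != 0;
	}
	return false;
}
```

Neither field has a fallback value when its bit is clear, which is the
difference from max_vq_pairs being debated here.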


^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [PATCH V3 5/6] vDPA: answer num of queue pairs = 1 to userspace when VIRTIO_NET_F_MQ == 0
  2022-07-26 20:53           ` Parav Pandit
@ 2022-07-27  1:56             ` Zhu, Lingshan
  2022-07-27  2:11             ` Zhu, Lingshan
  1 sibling, 0 replies; 113+ messages in thread
From: Zhu, Lingshan @ 2022-07-27  1:56 UTC (permalink / raw)
  To: Parav Pandit, Michael S. Tsirkin
  Cc: jasowang, virtualization, netdev, xieyongji, gautam.dawar



On 7/27/2022 4:53 AM, Parav Pandit wrote:
>> From: Michael S. Tsirkin <mst@redhat.com>
>> Sent: Tuesday, July 26, 2022 3:49 PM
>>
>> On Tue, Jul 26, 2022 at 03:54:06PM +0000, Parav Pandit wrote:
>>>> From: Michael S. Tsirkin <mst@redhat.com>
>>>> Sent: Wednesday, July 13, 2022 1:27 AM
>>>>
>>>> On Fri, Jul 01, 2022 at 10:07:59PM +0000, Parav Pandit wrote:
>>>>>
>>>>>> From: Zhu Lingshan <lingshan.zhu@intel.com>
>>>>>> Sent: Friday, July 1, 2022 9:28 AM If VIRTIO_NET_F_MQ == 0, the
>>>>>> virtio device should have one queue pair, so when userspace
>>>>>> querying queue pair numbers, it should return mq=1 than zero.
>>>>>>
>>>>>> Function vdpa_dev_net_config_fill() fills the attributions of
>>>>>> the vDPA devices, so that it should call
>>>>>> vdpa_dev_net_mq_config_fill() so the parameter in
>>>>>> vdpa_dev_net_mq_config_fill() should be feature_device than
>>>>>> feature_driver for the vDPA devices themselves
>>>>>>
>>>>>> Before this change, when MQ = 0, iproute2 output:
>>>>>> $vdpa dev config show vdpa0
>>>>>> vdpa0: mac 00:e8:ca:11:be:05 link up link_announce false
>>>>>> max_vq_pairs 0 mtu 1500
>>>>>>
>>>>> The fix belongs to user space.
>>>>> When a feature bit _MQ is not negotiated, vdpa kernel space will
>>>>> not add
>>>> attribute VDPA_ATTR_DEV_NET_CFG_MAX_VQP.
>>>>> When such attribute is not returned by kernel, max_vq_pairs should
>>>>> not be
>>>> shown by the iproute2.
>>>>> We have many config space fields that depend on the feature bits
>>>>> and
>>>> some of them do not have any defaults.
>>>>> To keep consistency of existence of config space fields among all,
>>>>> we don't
>>>> want to show default like below.
>>>>> Please fix the iproute2 to not print max_vq_pairs when it is not
>>>>> returned by
>>>> the kernel.
>>>>
>>>> Parav I read the discussion and don't get your argument. From
>>>> driver's POV _MQ with 1 VQ pair and !_MQ are exactly functionally
>> equivalent.
>>> But we are talking from user POV here.
>>  From spec POV there's just driver and device, user would be part of driver here.
> User space application still need to inspect the _MQ bit to

>>>> It's true that iproute probably needs to be fixed too, to handle old kernels.
>>>> But iproute is not the only userspace, why not make it's life easier
>>>> by fixing the kernel?
>>> Because it cannot be fixed for other config space fields which are control by
>> feature bits those do not have any defaults.
>>> So better to treat all in same way from user POV.
>> Consistency is good for sure. What are these other fields though?
>> Can you give examples so I understand please?
> speed only exists if VIRTIO_NET_F_SPEED_DUPLEX.
> rss_max_key_size exists only if VIRTIO_NET_F_RSS exists.
Those are different from the MQ case.

There are no default values for speed and rss_max_key_size, and
reporting speed without VIRTIO_NET_F_SPEED_DUPLEX, or rss_max_key_size
without VIRTIO_NET_F_RSS, is meaningless.
But without MQ, we know the device has to have exactly one queue pair to
be a functional virtio-net device. Reporting that is meaningful.
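
The distinction above can be sketched in plain C. This is only an
illustration under stated assumptions, not code from the patch:
effective_max_vq_pairs() is a hypothetical helper, and only the
VIRTIO_NET_F_MQ bit number (22) is taken from the virtio spec.

```c
#include <stdint.h>

#define VIRTIO_NET_F_MQ 22 /* feature bit number from the virtio spec */

/*
 * Hypothetical helper (not from the series): the number of queue pairs
 * a virtio-net device effectively has. Without _MQ the max_virtqueue_pairs
 * config field does not exist, but the device still has exactly one pair.
 */
static uint16_t effective_max_vq_pairs(uint64_t device_features,
				       uint16_t cfg_max_vq_pairs)
{
	if (device_features & (1ULL << VIRTIO_NET_F_MQ))
		return cfg_max_vq_pairs;
	return 1;
}
```

There is no analogous helper for speed or rss_max_key_size, because no
value can be derived when their feature bits are absent.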


^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [PATCH V3 5/6] vDPA: answer num of queue pairs = 1 to userspace when VIRTIO_NET_F_MQ == 0
  2022-07-26 20:53           ` Parav Pandit
  2022-07-27  1:56             ` Zhu, Lingshan
@ 2022-07-27  2:11             ` Zhu, Lingshan
  1 sibling, 0 replies; 113+ messages in thread
From: Zhu, Lingshan @ 2022-07-27  2:11 UTC (permalink / raw)
  To: Parav Pandit, Michael S. Tsirkin
  Cc: jasowang, virtualization, netdev, xieyongji, gautam.dawar



On 7/27/2022 4:53 AM, Parav Pandit wrote:
>> From: Michael S. Tsirkin<mst@redhat.com>
>> Sent: Tuesday, July 26, 2022 3:49 PM
>>
>> On Tue, Jul 26, 2022 at 03:54:06PM +0000, Parav Pandit wrote:
>>>> From: Michael S. Tsirkin<mst@redhat.com>
>>>> Sent: Wednesday, July 13, 2022 1:27 AM
>>>>
>>>> On Fri, Jul 01, 2022 at 10:07:59PM +0000, Parav Pandit wrote:
>>>>>> From: Zhu Lingshan<lingshan.zhu@intel.com>
>>>>>> Sent: Friday, July 1, 2022 9:28 AM If VIRTIO_NET_F_MQ == 0, the
>>>>>> virtio device should have one queue pair, so when userspace
>>>>>> querying queue pair numbers, it should return mq=1 than zero.
>>>>>>
>>>>>> Function vdpa_dev_net_config_fill() fills the attributions of
>>>>>> the vDPA devices, so that it should call
>>>>>> vdpa_dev_net_mq_config_fill() so the parameter in
>>>>>> vdpa_dev_net_mq_config_fill() should be feature_device than
>>>>>> feature_driver for the vDPA devices themselves
>>>>>>
>>>>>> Before this change, when MQ = 0, iproute2 output:
>>>>>> $vdpa dev config show vdpa0
>>>>>> vdpa0: mac 00:e8:ca:11:be:05 link up link_announce false
>>>>>> max_vq_pairs 0 mtu 1500
>>>>>>
>>>>> The fix belongs to user space.
>>>>> When a feature bit _MQ is not negotiated, vdpa kernel space will
>>>>> not add
>>>> attribute VDPA_ATTR_DEV_NET_CFG_MAX_VQP.
>>>>> When such attribute is not returned by kernel, max_vq_pairs should
>>>>> not be
>>>> shown by the iproute2.
>>>>> We have many config space fields that depend on the feature bits
>>>>> and
>>>> some of them do not have any defaults.
>>>>> To keep consistency of existence of config space fields among all,
>>>>> we don't
>>>> want to show default like below.
>>>>> Please fix the iproute2 to not print max_vq_pairs when it is not
>>>>> returned by
>>>> the kernel.
>>>>
>>>> Parav I read the discussion and don't get your argument. From
>>>> driver's POV _MQ with 1 VQ pair and !_MQ are exactly functionally
>> equivalent.
>>> But we are talking from user POV here.
>>  From spec POV there's just driver and device, user would be part of driver here.
> User space application still need to inspect the _MQ bit to

>>>> It's true that iproute probably needs to be fixed too, to handle old kernels.
>>>> But iproute is not the only userspace, why not make it's life easier
>>>> by fixing the kernel?
>>> Because it cannot be fixed for other config space fields which are control by
>> feature bits those do not have any defaults.
>>> So better to treat all in same way from user POV.
>> Consistency is good for sure. What are these other fields though?
>> Can you give examples so I understand please?
> speed only exists if VIRTIO_NET_F_SPEED_DUPLEX.
> rss_max_key_size exists only if VIRTIO_NET_F_RSS exists.
Those are different from the MQ case.

There are no default values for speed and rss_max_key_size, and talking
about speed without VIRTIO_NET_F_SPEED_DUPLEX, or rss_max_key_size without
VIRTIO_NET_F_RSS, is meaningless.
But without MQ, we know the device has to have 1 queue pair to be a
functional virtio-net device, and this is meaningful.


^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [PATCH V3 5/6] vDPA: answer num of queue pairs = 1 to userspace when VIRTIO_NET_F_MQ == 0
  2022-07-26 15:56                   ` Parav Pandit
  2022-07-26 19:52                     ` Michael S. Tsirkin
@ 2022-07-27  2:14                     ` Zhu, Lingshan
  2022-07-27  2:17                       ` Parav Pandit
  1 sibling, 1 reply; 113+ messages in thread
From: Zhu, Lingshan @ 2022-07-27  2:14 UTC (permalink / raw)
  To: Parav Pandit, jasowang, mst
  Cc: virtualization, netdev, xieyongji, gautam.dawar



On 7/26/2022 11:56 PM, Parav Pandit wrote:
>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
>> Sent: Tuesday, July 12, 2022 11:46 PM
>>> When the user space which invokes netlink commands, detects that _MQ
>> is not supported, hence it takes max_queue_pair = 1 by itself.
>> I think the kernel module have all necessary information and it is the only
>> one which have precise information of a device, so it should answer precisely
>> than let the user space guess. The kernel module should be reliable than stay
>> silent, leave the question to the user space tool.
> Kernel is reliable. It doesn’t expose a config space field if the field doesn’t exist regardless of field should have default or no default.
so when you know it is one queue pair, you should answer one, not try to 
guess.
> User space should not guess either. User space gets to see if _MQ present/not present. If _MQ present than get reliable data from kernel.
> If _MQ not present, it means this device has one VQ pair.
It is still a guess, right? And every user space tool that implements
this feature needs to guess.


^ permalink raw reply	[flat|nested] 113+ messages in thread

* RE: [PATCH V3 5/6] vDPA: answer num of queue pairs = 1 to userspace when VIRTIO_NET_F_MQ == 0
  2022-07-27  2:14                     ` Zhu, Lingshan
@ 2022-07-27  2:17                       ` Parav Pandit
  2022-07-27  2:53                         ` Zhu, Lingshan
  0 siblings, 1 reply; 113+ messages in thread
From: Parav Pandit @ 2022-07-27  2:17 UTC (permalink / raw)
  To: Zhu, Lingshan, jasowang, mst
  Cc: virtualization, netdev, xieyongji, gautam.dawar


> From: Zhu, Lingshan <lingshan.zhu@intel.com>
> Sent: Tuesday, July 26, 2022 10:15 PM
> 
> On 7/26/2022 11:56 PM, Parav Pandit wrote:
> >> From: Zhu, Lingshan <lingshan.zhu@intel.com>
> >> Sent: Tuesday, July 12, 2022 11:46 PM
> >>> When the user space which invokes netlink commands, detects that
> _MQ
> >> is not supported, hence it takes max_queue_pair = 1 by itself.
> >> I think the kernel module have all necessary information and it is
> >> the only one which have precise information of a device, so it should
> >> answer precisely than let the user space guess. The kernel module
> >> should be reliable than stay silent, leave the question to the user space
> tool.
> > Kernel is reliable. It doesn’t expose a config space field if the field doesn’t
> exist regardless of field should have default or no default.
> so when you know it is one queue pair, you should answer one, not try to
> guess.
> > User space should not guess either. User space gets to see if _MQ
> present/not present. If _MQ present than get reliable data from kernel.
> > If _MQ not present, it means this device has one VQ pair.
> it is still a guess, right? And all user space tools implemented this feature
> need to guess
No, it is not a guess.
It is explicitly checking the _MQ feature and deriving the value.
The code you proposed will be present in the user space.
It will be uniform for _MQ and 10 other features that are present now and in the future.

For feature X, the kernel reports a default, and for feature Y, the kernel skips reporting it because there is no default. <- This is what we are trying to avoid here.
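
The user space derivation described above might look roughly like the
following. This is a minimal sketch, assuming a hypothetical
struct net_cfg_reply that records which attributes the kernel returned;
none of these names come from iproute2.

```c
#include <stdbool.h>
#include <stdint.h>

#define VIRTIO_NET_F_MQ 22 /* feature bit number from the virtio spec */

/* Hypothetical, simplified view of a parsed netlink reply. */
struct net_cfg_reply {
	uint64_t features;	/* feature bits reported by the kernel */
	bool has_max_vqp;	/* was VDPA_ATTR_DEV_NET_CFG_MAX_VQP present? */
	uint16_t max_vqp;	/* valid only if has_max_vqp is true */
};

/*
 * Returns the number of queue pairs user space should assume, or 0 if
 * _MQ is set but the kernel did not return the attribute.
 */
static uint16_t derive_max_vq_pairs(const struct net_cfg_reply *r)
{
	if (!(r->features & (1ULL << VIRTIO_NET_F_MQ)))
		return 1;	/* derived from the missing feature bit */
	return r->has_max_vqp ? r->max_vqp : 0;
}
```

The same check-the-bit-then-read pattern would apply to any other
feature-dependent field, which is the uniformity being argued for.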


^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [PATCH V3 5/6] vDPA: answer num of queue pairs = 1 to userspace when VIRTIO_NET_F_MQ == 0
  2022-07-27  2:17                       ` Parav Pandit
@ 2022-07-27  2:53                         ` Zhu, Lingshan
  2022-07-27  3:47                           ` Parav Pandit
  0 siblings, 1 reply; 113+ messages in thread
From: Zhu, Lingshan @ 2022-07-27  2:53 UTC (permalink / raw)
  To: Parav Pandit, jasowang, mst
  Cc: virtualization, netdev, xieyongji, gautam.dawar



On 7/27/2022 10:17 AM, Parav Pandit wrote:
>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
>> Sent: Tuesday, July 26, 2022 10:15 PM
>>
>> On 7/26/2022 11:56 PM, Parav Pandit wrote:
>>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
>>>> Sent: Tuesday, July 12, 2022 11:46 PM
>>>>> When the user space which invokes netlink commands, detects that
>> _MQ
>>>> is not supported, hence it takes max_queue_pair = 1 by itself.
>>>> I think the kernel module have all necessary information and it is
>>>> the only one which have precise information of a device, so it should
>>>> answer precisely than let the user space guess. The kernel module
>>>> should be reliable than stay silent, leave the question to the user space
>> tool.
>>> Kernel is reliable. It doesn’t expose a config space field if the field doesn’t
>> exist regardless of field should have default or no default.
>> so when you know it is one queue pair, you should answer one, not try to
>> guess.
>>> User space should not guess either. User space gets to see if _MQ
>> present/not present. If _MQ present than get reliable data from kernel.
>>> If _MQ not present, it means this device has one VQ pair.
>> it is still a guess, right? And all user space tools implemented this feature
>> need to guess
> No. it is not a guess.
> It is explicitly checking the _MQ feature and deriving the value.
> The code you proposed will be present in the user space.
> It will be uniform for _MQ and 10 other features that are present now and in the future.
MQ and other features like RSS are different. If there is no _RSS_XX,
there are no attributes like max_rss_key_size, and there is no
default value.
But for MQ, we know it has to be 1 without _MQ.
> For feature X, kernel reports default and for feature Y, kernel skip reporting it, because there is no default. <- This is what we are trying to avoid here.
Kernel reports one queue pair because there is actually one.
>


^ permalink raw reply	[flat|nested] 113+ messages in thread

* RE: [PATCH V3 5/6] vDPA: answer num of queue pairs = 1 to userspace when VIRTIO_NET_F_MQ == 0
  2022-07-27  2:53                         ` Zhu, Lingshan
@ 2022-07-27  3:47                           ` Parav Pandit
  2022-07-27  4:24                             ` Zhu, Lingshan
  2022-07-27  6:01                             ` Michael S. Tsirkin
  0 siblings, 2 replies; 113+ messages in thread
From: Parav Pandit @ 2022-07-27  3:47 UTC (permalink / raw)
  To: Zhu, Lingshan, jasowang, mst
  Cc: virtualization, netdev, xieyongji, gautam.dawar


> From: Zhu, Lingshan <lingshan.zhu@intel.com>
> Sent: Tuesday, July 26, 2022 10:53 PM
> 
> On 7/27/2022 10:17 AM, Parav Pandit wrote:
> >> From: Zhu, Lingshan <lingshan.zhu@intel.com>
> >> Sent: Tuesday, July 26, 2022 10:15 PM
> >>
> >> On 7/26/2022 11:56 PM, Parav Pandit wrote:
> >>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
> >>>> Sent: Tuesday, July 12, 2022 11:46 PM
> >>>>> When the user space which invokes netlink commands, detects that
> >> _MQ
> >>>> is not supported, hence it takes max_queue_pair = 1 by itself.
> >>>> I think the kernel module have all necessary information and it is
> >>>> the only one which have precise information of a device, so it
> >>>> should answer precisely than let the user space guess. The kernel
> >>>> module should be reliable than stay silent, leave the question to
> >>>> the user space
> >> tool.
> >>> Kernel is reliable. It doesn’t expose a config space field if the
> >>> field doesn’t
> >> exist regardless of field should have default or no default.
> >> so when you know it is one queue pair, you should answer one, not try
> >> to guess.
> >>> User space should not guess either. User space gets to see if _MQ
> >> present/not present. If _MQ present than get reliable data from kernel.
> >>> If _MQ not present, it means this device has one VQ pair.
> >> it is still a guess, right? And all user space tools implemented this
> >> feature need to guess
> > No. it is not a guess.
> > It is explicitly checking the _MQ feature and deriving the value.
> > The code you proposed will be present in the user space.
> > It will be uniform for _MQ and 10 other features that are present now and
> in the future.
> MQ and other features like RSS are different. If there is no _RSS_XX, there
> are no attributes like max_rss_key_size, and there is not a default value.
> But for MQ, we know it has to be 1 wihtout _MQ.
"we" = user space.
To keep the consistency among all the config space fields.

> > For feature X, kernel reports default and for feature Y, kernel skip
> reporting it, because there is no default. <- This is what we are trying to
> avoid here.
> Kernel reports one queue pair because there is actually one.
> >


^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [PATCH V3 5/6] vDPA: answer num of queue pairs = 1 to userspace when VIRTIO_NET_F_MQ == 0
  2022-07-27  3:47                           ` Parav Pandit
@ 2022-07-27  4:24                             ` Zhu, Lingshan
  2022-07-27  6:01                             ` Michael S. Tsirkin
  1 sibling, 0 replies; 113+ messages in thread
From: Zhu, Lingshan @ 2022-07-27  4:24 UTC (permalink / raw)
  To: Parav Pandit, jasowang, mst
  Cc: virtualization, netdev, xieyongji, gautam.dawar



On 7/27/2022 11:47 AM, Parav Pandit wrote:
>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
>> Sent: Tuesday, July 26, 2022 10:53 PM
>>
>> On 7/27/2022 10:17 AM, Parav Pandit wrote:
>>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
>>>> Sent: Tuesday, July 26, 2022 10:15 PM
>>>>
>>>> On 7/26/2022 11:56 PM, Parav Pandit wrote:
>>>>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
>>>>>> Sent: Tuesday, July 12, 2022 11:46 PM
>>>>>>> When the user space which invokes netlink commands, detects that
>>>> _MQ
>>>>>> is not supported, hence it takes max_queue_pair = 1 by itself.
>>>>>> I think the kernel module have all necessary information and it is
>>>>>> the only one which have precise information of a device, so it
>>>>>> should answer precisely than let the user space guess. The kernel
>>>>>> module should be reliable than stay silent, leave the question to
>>>>>> the user space
>>>> tool.
>>>>> Kernel is reliable. It doesn’t expose a config space field if the
>>>>> field doesn’t
>>>> exist regardless of field should have default or no default.
>>>> so when you know it is one queue pair, you should answer one, not try
>>>> to guess.
>>>>> User space should not guess either. User space gets to see if _MQ
>>>> present/not present. If _MQ present than get reliable data from kernel.
>>>>> If _MQ not present, it means this device has one VQ pair.
>>>> it is still a guess, right? And all user space tools implemented this
>>>> feature need to guess
>>> No. it is not a guess.
>>> It is explicitly checking the _MQ feature and deriving the value.
>>> The code you proposed will be present in the user space.
>>> It will be uniform for _MQ and 10 other features that are present now and
>> in the future.
>> MQ and other features like RSS are different. If there is no _RSS_XX, there
>> are no attributes like max_rss_key_size, and there is not a default value.
>> But for MQ, we know it has to be 1 wihtout _MQ.
> "we" = user space.
> To keep the consistency among all the config space fields.
The user space tool asks for the number of vq pairs, not whether the
device has _MQ.
_MQ and _RSS are not the same kind of concept, as we have discussed above.
You have pointed out the logic yourself: if there is _MQ, the kernel
answers max_vq_pairs; if there is no _MQ, num_vq_pairs = 1.

So, as MST pointed out, implementing this in kernel space can make our
life easier, once and for all.
>
>>> For feature X, kernel reports default and for feature Y, kernel skip
>> reporting it, because there is no default. <- This is what we are trying to
>> avoid here.
>> Kernel reports one queue pair because there is actually one.


^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [PATCH V3 5/6] vDPA: answer num of queue pairs = 1 to userspace when VIRTIO_NET_F_MQ == 0
  2022-07-27  3:47                           ` Parav Pandit
  2022-07-27  4:24                             ` Zhu, Lingshan
@ 2022-07-27  6:01                             ` Michael S. Tsirkin
  2022-07-27  6:25                               ` Zhu, Lingshan
                                                 ` (2 more replies)
  1 sibling, 3 replies; 113+ messages in thread
From: Michael S. Tsirkin @ 2022-07-27  6:01 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Zhu, Lingshan, jasowang, virtualization, netdev, xieyongji, gautam.dawar

On Wed, Jul 27, 2022 at 03:47:35AM +0000, Parav Pandit wrote:
> 
> > From: Zhu, Lingshan <lingshan.zhu@intel.com>
> > Sent: Tuesday, July 26, 2022 10:53 PM
> > 
> > On 7/27/2022 10:17 AM, Parav Pandit wrote:
> > >> From: Zhu, Lingshan <lingshan.zhu@intel.com>
> > >> Sent: Tuesday, July 26, 2022 10:15 PM
> > >>
> > >> On 7/26/2022 11:56 PM, Parav Pandit wrote:
> > >>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
> > >>>> Sent: Tuesday, July 12, 2022 11:46 PM
> > >>>>> When the user space which invokes netlink commands, detects that
> > >> _MQ
> > >>>> is not supported, hence it takes max_queue_pair = 1 by itself.
> > >>>> I think the kernel module have all necessary information and it is
> > >>>> the only one which have precise information of a device, so it
> > >>>> should answer precisely than let the user space guess. The kernel
> > >>>> module should be reliable than stay silent, leave the question to
> > >>>> the user space
> > >> tool.
> > >>> Kernel is reliable. It doesn’t expose a config space field if the
> > >>> field doesn’t
> > >> exist regardless of field should have default or no default.
> > >> so when you know it is one queue pair, you should answer one, not try
> > >> to guess.
> > >>> User space should not guess either. User space gets to see if _MQ
> > >> present/not present. If _MQ present than get reliable data from kernel.
> > >>> If _MQ not present, it means this device has one VQ pair.
> > >> it is still a guess, right? And all user space tools implemented this
> > >> feature need to guess
> > > No. it is not a guess.
> > > It is explicitly checking the _MQ feature and deriving the value.
> > > The code you proposed will be present in the user space.
> > > It will be uniform for _MQ and 10 other features that are present now and
> > in the future.
> > MQ and other features like RSS are different. If there is no _RSS_XX, there
> > are no attributes like max_rss_key_size, and there is not a default value.
> > But for MQ, we know it has to be 1 wihtout _MQ.
> "we" = user space.
> To keep the consistency among all the config space fields.

Actually I looked at the code some more and I'm puzzled:


	struct virtio_net_config config = {};
	u64 features;
	u16 val_u16;

	vdpa_get_config_unlocked(vdev, 0, &config, sizeof(config));

	if (nla_put(msg, VDPA_ATTR_DEV_NET_CFG_MACADDR, sizeof(config.mac),
		    config.mac))
		return -EMSGSIZE;


Mac returned even without VIRTIO_NET_F_MAC


	val_u16 = le16_to_cpu(config.status);
	if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_STATUS, val_u16))
		return -EMSGSIZE;


status returned even without VIRTIO_NET_F_STATUS

	val_u16 = le16_to_cpu(config.mtu);
	if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MTU, val_u16))
		return -EMSGSIZE;


MTU returned even without VIRTIO_NET_F_MTU


What's going on here?


-- 
MST


^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [PATCH V3 5/6] vDPA: answer num of queue pairs = 1 to userspace when VIRTIO_NET_F_MQ == 0
  2022-07-27  6:01                             ` Michael S. Tsirkin
@ 2022-07-27  6:25                               ` Zhu, Lingshan
  2022-07-27  6:56                                 ` Jason Wang
  2022-07-27  6:54                               ` Jason Wang
  2022-07-27  7:50                               ` Si-Wei Liu
  2 siblings, 1 reply; 113+ messages in thread
From: Zhu, Lingshan @ 2022-07-27  6:25 UTC (permalink / raw)
  To: Michael S. Tsirkin, Parav Pandit
  Cc: jasowang, virtualization, netdev, xieyongji, gautam.dawar



On 7/27/2022 2:01 PM, Michael S. Tsirkin wrote:
> On Wed, Jul 27, 2022 at 03:47:35AM +0000, Parav Pandit wrote:
>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
>>> Sent: Tuesday, July 26, 2022 10:53 PM
>>>
>>> On 7/27/2022 10:17 AM, Parav Pandit wrote:
>>>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
>>>>> Sent: Tuesday, July 26, 2022 10:15 PM
>>>>>
>>>>> On 7/26/2022 11:56 PM, Parav Pandit wrote:
>>>>>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
>>>>>>> Sent: Tuesday, July 12, 2022 11:46 PM
>>>>>>>> When the user space which invokes netlink commands, detects that
>>>>> _MQ
>>>>>>> is not supported, hence it takes max_queue_pair = 1 by itself.
>>>>>>> I think the kernel module have all necessary information and it is
>>>>>>> the only one which have precise information of a device, so it
>>>>>>> should answer precisely than let the user space guess. The kernel
>>>>>>> module should be reliable than stay silent, leave the question to
>>>>>>> the user space
>>>>> tool.
>>>>>> Kernel is reliable. It doesn’t expose a config space field if the
>>>>>> field doesn’t
>>>>> exist regardless of field should have default or no default.
>>>>> so when you know it is one queue pair, you should answer one, not try
>>>>> to guess.
>>>>>> User space should not guess either. User space gets to see if _MQ
>>>>> present/not present. If _MQ present than get reliable data from kernel.
>>>>>> If _MQ not present, it means this device has one VQ pair.
>>>>> it is still a guess, right? And all user space tools implemented this
>>>>> feature need to guess
>>>> No. it is not a guess.
>>>> It is explicitly checking the _MQ feature and deriving the value.
>>>> The code you proposed will be present in the user space.
>>>> It will be uniform for _MQ and 10 other features that are present now and
>>> in the future.
>>> MQ and other features like RSS are different. If there is no _RSS_XX, there
>>> are no attributes like max_rss_key_size, and there is not a default value.
>>> But for MQ, we know it has to be 1 wihtout _MQ.
>> "we" = user space.
>> To keep the consistency among all the config space fields.
> Actually I looked and the code some more and I'm puzzled:
I can submit a fix for these issues in the next version of my patch.
>
>
> 	struct virtio_net_config config = {};
> 	u64 features;
> 	u16 val_u16;
>
> 	vdpa_get_config_unlocked(vdev, 0, &config, sizeof(config));
>
> 	if (nla_put(msg, VDPA_ATTR_DEV_NET_CFG_MACADDR, sizeof(config.mac),
> 		    config.mac))
> 		return -EMSGSIZE;
>
>
> Mac returned even without VIRTIO_NET_F_MAC
If there is no VIRTIO_NET_F_MAC, we should not nla_put
VDPA_ATTR_DEV_NET_CFG_MACADDR; the spec says the driver should generate
a random MAC.
>
>
> 	val_u16 = le16_to_cpu(config.status);
> 	if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_STATUS, val_u16))
> 		return -EMSGSIZE;
>
>
> status returned even without VIRTIO_NET_F_STATUS
if no VIRTIO_NET_F_STATUS, we should not nla_put 
VDPA_ATTR_DEV_NET_STATUS, the spec says the driver should assume the 
link is active.
>
> 	val_u16 = le16_to_cpu(config.mtu);
> 	if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MTU, val_u16))
> 		return -EMSGSIZE;
>
>
> MTU returned even without VIRTIO_NET_F_MTU
same as above, the spec says config.mtu depends on VIRTIO_NET_F_MTU, so 
without this feature bit, we should not return MTU to the userspace.

Do these fixes look good to you?
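
The three fixes proposed above share one pattern: emit an attribute only
when its feature bit is present. A minimal sketch in plain C, assuming
hypothetical REPORT_* flags that stand in for the real nla_put() calls;
only the feature bit numbers come from the virtio spec.

```c
#include <stdint.h>

/* Feature bit numbers from the virtio spec. */
#define VIRTIO_NET_F_MTU	3
#define VIRTIO_NET_F_MAC	5
#define VIRTIO_NET_F_STATUS	16

/* Hypothetical flags, one per netlink attribute that would be emitted. */
enum {
	REPORT_MAC	= 1u << 0,
	REPORT_STATUS	= 1u << 1,
	REPORT_MTU	= 1u << 2,
};

/* Decide which config attributes to report for a given feature set. */
static unsigned int attrs_to_report(uint64_t features)
{
	unsigned int mask = 0;

	if (features & (1ULL << VIRTIO_NET_F_MAC))
		mask |= REPORT_MAC;
	if (features & (1ULL << VIRTIO_NET_F_STATUS))
		mask |= REPORT_STATUS;
	if (features & (1ULL << VIRTIO_NET_F_MTU))
		mask |= REPORT_MTU;
	return mask;
}
```

In the kernel each set flag would correspond to one guarded nla_put()
or nla_put_u16() call, with the unguarded calls removed.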

And I think we may need your adjudication on two issues:
(1) Shall we answer max_vq_pairs = 1 when _MQ does not exist? I know you
agreed to this in a previous thread, but it would be nice to clarify.
(2) I think we should not re-use the netlink attr to report feature bits
of both the management device and the vDPA device.
This can lead to a new race condition: there are no locks (especially
distributed locks for kernel_space and user_space) in the nla_put
functions. Re-using the attr in some way breaks the netlink
lockless design.

Thanks,
Zhu Lingshan
>
>
> What's going on here?
>
>


^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [PATCH V3 5/6] vDPA: answer num of queue pairs = 1 to userspace when VIRTIO_NET_F_MQ == 0
  2022-07-27  6:01                             ` Michael S. Tsirkin
  2022-07-27  6:25                               ` Zhu, Lingshan
@ 2022-07-27  6:54                               ` Jason Wang
  2022-07-27  9:02                                 ` Michael S. Tsirkin
  2022-07-27  7:50                               ` Si-Wei Liu
  2 siblings, 1 reply; 113+ messages in thread
From: Jason Wang @ 2022-07-27  6:54 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Parav Pandit, Zhu, Lingshan, virtualization, netdev, xieyongji,
	gautam.dawar

On Wed, Jul 27, 2022 at 2:01 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Wed, Jul 27, 2022 at 03:47:35AM +0000, Parav Pandit wrote:
> >
> > > From: Zhu, Lingshan <lingshan.zhu@intel.com>
> > > Sent: Tuesday, July 26, 2022 10:53 PM
> > >
> > > On 7/27/2022 10:17 AM, Parav Pandit wrote:
> > > >> From: Zhu, Lingshan <lingshan.zhu@intel.com>
> > > >> Sent: Tuesday, July 26, 2022 10:15 PM
> > > >>
> > > >> On 7/26/2022 11:56 PM, Parav Pandit wrote:
> > > >>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
> > > >>>> Sent: Tuesday, July 12, 2022 11:46 PM
> > > >>>>> When the user space which invokes netlink commands, detects that
> > > >> _MQ
> > > >>>> is not supported, hence it takes max_queue_pair = 1 by itself.
> > > >>>> I think the kernel module have all necessary information and it is
> > > >>>> the only one which have precise information of a device, so it
> > > >>>> should answer precisely than let the user space guess. The kernel
> > > >>>> module should be reliable than stay silent, leave the question to
> > > >>>> the user space
> > > >> tool.
> > > >>> Kernel is reliable. It doesn’t expose a config space field if the
> > > >>> field doesn’t
> > > >> exist regardless of field should have default or no default.
> > > >> so when you know it is one queue pair, you should answer one, not try
> > > >> to guess.
> > > >>> User space should not guess either. User space gets to see if _MQ
> > > >> present/not present. If _MQ present than get reliable data from kernel.
> > > >>> If _MQ not present, it means this device has one VQ pair.
> > > >> it is still a guess, right? And all user space tools implemented this
> > > >> feature need to guess
> > > > No. it is not a guess.
> > > > It is explicitly checking the _MQ feature and deriving the value.
> > > > The code you proposed will be present in the user space.
> > > > It will be uniform for _MQ and 10 other features that are present now and
> > > in the future.
> > > MQ and other features like RSS are different. If there is no _RSS_XX, there
> > > are no attributes like max_rss_key_size, and there is not a default value.
> > > But for MQ, we know it has to be 1 wihtout _MQ.
> > "we" = user space.
> > To keep the consistency among all the config space fields.
>
> Actually I looked and the code some more and I'm puzzled:
>
>
>         struct virtio_net_config config = {};
>         u64 features;
>         u16 val_u16;
>
>         vdpa_get_config_unlocked(vdev, 0, &config, sizeof(config));
>
>         if (nla_put(msg, VDPA_ATTR_DEV_NET_CFG_MACADDR, sizeof(config.mac),
>                     config.mac))
>                 return -EMSGSIZE;
>
>
> Mac returned even without VIRTIO_NET_F_MAC
>
>
>         val_u16 = le16_to_cpu(config.status);
>         if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_STATUS, val_u16))
>                 return -EMSGSIZE;
>
>
> status returned even without VIRTIO_NET_F_STATUS
>
>         val_u16 = le16_to_cpu(config.mtu);
>         if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MTU, val_u16))
>                 return -EMSGSIZE;
>
>
> MTU returned even without VIRTIO_NET_F_MTU
>
>
> What's going on here?

Probably too late to fix, but this should be fine as long as all
parents support STATUS/MTU/MAC.

I wonder if we can add a check in the core and fail the device
registration in this case.

Thanks

>
>
> --
> MST
>


^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [PATCH V3 5/6] vDPA: answer num of queue pairs = 1 to userspace when VIRTIO_NET_F_MQ == 0
  2022-07-27  6:25                               ` Zhu, Lingshan
@ 2022-07-27  6:56                                 ` Jason Wang
  2022-07-27  9:05                                   ` Michael S. Tsirkin
  0 siblings, 1 reply; 113+ messages in thread
From: Jason Wang @ 2022-07-27  6:56 UTC (permalink / raw)
  To: Zhu, Lingshan
  Cc: Michael S. Tsirkin, Parav Pandit, virtualization, netdev,
	xieyongji, gautam.dawar

On Wed, Jul 27, 2022 at 2:26 PM Zhu, Lingshan <lingshan.zhu@intel.com> wrote:
>
>
>
> On 7/27/2022 2:01 PM, Michael S. Tsirkin wrote:
> > On Wed, Jul 27, 2022 at 03:47:35AM +0000, Parav Pandit wrote:
> >>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
> >>> Sent: Tuesday, July 26, 2022 10:53 PM
> >>>
> >>> On 7/27/2022 10:17 AM, Parav Pandit wrote:
> >>>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
> >>>>> Sent: Tuesday, July 26, 2022 10:15 PM
> >>>>>
> >>>>> On 7/26/2022 11:56 PM, Parav Pandit wrote:
> >>>>>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
> >>>>>>> Sent: Tuesday, July 12, 2022 11:46 PM
> >>>>>>>> When the user space which invokes netlink commands, detects that
> >>>>> _MQ
> >>>>>>> is not supported, hence it takes max_queue_pair = 1 by itself.
> >>>>>>> I think the kernel module have all necessary information and it is
> >>>>>>> the only one which have precise information of a device, so it
> >>>>>>> should answer precisely than let the user space guess. The kernel
> >>>>>>> module should be reliable than stay silent, leave the question to
> >>>>>>> the user space
> >>>>> tool.
> >>>>>> Kernel is reliable. It doesn’t expose a config space field if the
> >>>>>> field doesn’t
> >>>>> exist regardless of field should have default or no default.
> >>>>> so when you know it is one queue pair, you should answer one, not try
> >>>>> to guess.
> >>>>>> User space should not guess either. User space gets to see if _MQ
> >>>>> present/not present. If _MQ present than get reliable data from kernel.
> >>>>>> If _MQ not present, it means this device has one VQ pair.
> >>>>> it is still a guess, right? And all user space tools implemented this
> >>>>> feature need to guess
> >>>> No. it is not a guess.
> >>>> It is explicitly checking the _MQ feature and deriving the value.
> >>>> The code you proposed will be present in the user space.
> >>>> It will be uniform for _MQ and 10 other features that are present now and
> >>> in the future.
> >>> MQ and other features like RSS are different. If there is no _RSS_XX, there
> >>> are no attributes like max_rss_key_size, and there is not a default value.
> >>> But for MQ, we know it has to be 1 without _MQ.
> >> "we" = user space.
> >> To keep the consistency among all the config space fields.
> > Actually I looked at the code some more and I'm puzzled:
> I can submit a fix in my next version patch for these issues.
> >
> >
> >       struct virtio_net_config config = {};
> >       u64 features;
> >       u16 val_u16;
> >
> >       vdpa_get_config_unlocked(vdev, 0, &config, sizeof(config));
> >
> >       if (nla_put(msg, VDPA_ATTR_DEV_NET_CFG_MACADDR, sizeof(config.mac),
> >                   config.mac))
> >               return -EMSGSIZE;
> >
> >
> > Mac returned even without VIRTIO_NET_F_MAC
> if no VIRTIO_NET_F_MAC, we should not nla_put
> VDPA_ATTR_DEV_NET_CFG_MACADDR, the spec says the driver should generate
> a random mac.

It's probably too late to do this. Most of the parents have this
feature support, so probably not a real issue.

> >
> >
> >       val_u16 = le16_to_cpu(config.status);
> >       if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_STATUS, val_u16))
> >               return -EMSGSIZE;
> >
> >
> > status returned even without VIRTIO_NET_F_STATUS
> if no VIRTIO_NET_F_STATUS, we should not nla_put
> VDPA_ATTR_DEV_NET_STATUS, the spec says the driver should assume the
> link is active.

Somehow similar to F_MAC. But we can report if F_MAC is not negotiated.


> >
> >       val_u16 = le16_to_cpu(config.mtu);
> >       if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MTU, val_u16))
> >               return -EMSGSIZE;
> >
> >
> > MTU returned even without VIRTIO_NET_F_MTU
> same as above, the spec says config.mtu depends on VIRTIO_NET_F_MTU, so
> without this feature bit, we should not return MTU to the userspace.

Not a big issue, we just need to make sure the parent can report a
correct MTU here.

Thanks

>
> Do these fixes look good to you?
>
> And I think we may need your adjudication for the two issues:
> (1) Shall we answer max_vq_pairs = 1 when _MQ does not exist? I know you have
> agreed on this in a previous thread, but it's nice to clarify.
> (2) I think we should not re-use the netlink attr to report feature bits
> of both the management device and the vDPA device,
> this can lead to a new race condition, there are no locks(especially
> distributed locks for kernel_space and user_space) in the nla_put
> functions. Re-using the attr is some kind of breaking the netlink
> lockless design.
>
> Thanks,
> Zhu Lingshan
> >
> >
> > What's going on here?
> >
> >
>


^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [PATCH V3 5/6] vDPA: answer num of queue pairs = 1 to userspace when VIRTIO_NET_F_MQ == 0
  2022-07-27  6:01                             ` Michael S. Tsirkin
  2022-07-27  6:25                               ` Zhu, Lingshan
  2022-07-27  6:54                               ` Jason Wang
@ 2022-07-27  7:50                               ` Si-Wei Liu
  2022-07-27  9:01                                 ` Michael S. Tsirkin
  2 siblings, 1 reply; 113+ messages in thread
From: Si-Wei Liu @ 2022-07-27  7:50 UTC (permalink / raw)
  To: Michael S. Tsirkin, Parav Pandit
  Cc: netdev, virtualization, xieyongji, gautam.dawar, Zhu, Lingshan



On 7/26/2022 11:01 PM, Michael S. Tsirkin wrote:
> On Wed, Jul 27, 2022 at 03:47:35AM +0000, Parav Pandit wrote:
>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
>>> Sent: Tuesday, July 26, 2022 10:53 PM
>>>
>>> On 7/27/2022 10:17 AM, Parav Pandit wrote:
>>>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
>>>>> Sent: Tuesday, July 26, 2022 10:15 PM
>>>>>
>>>>> On 7/26/2022 11:56 PM, Parav Pandit wrote:
>>>>>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
>>>>>>> Sent: Tuesday, July 12, 2022 11:46 PM
>>>>>>>> When the user space which invokes netlink commands, detects that
>>>>> _MQ
>>>>>>> is not supported, hence it takes max_queue_pair = 1 by itself.
>>>>>>> I think the kernel module have all necessary information and it is
>>>>>>> the only one which have precise information of a device, so it
>>>>>>> should answer precisely than let the user space guess. The kernel
>>>>>>> module should be reliable than stay silent, leave the question to
>>>>>>> the user space
>>>>> tool.
>>>>>> Kernel is reliable. It doesn’t expose a config space field if the
>>>>>> field doesn’t
>>>>> exist regardless of field should have default or no default.
>>>>> so when you know it is one queue pair, you should answer one, not try
>>>>> to guess.
>>>>>> User space should not guess either. User space gets to see if _MQ
>>>>> present/not present. If _MQ present than get reliable data from kernel.
>>>>>> If _MQ not present, it means this device has one VQ pair.
>>>>> it is still a guess, right? And all user space tools implemented this
>>>>> feature need to guess
>>>> No. it is not a guess.
>>>> It is explicitly checking the _MQ feature and deriving the value.
>>>> The code you proposed will be present in the user space.
>>>> It will be uniform for _MQ and 10 other features that are present now and
>>> in the future.
>>> MQ and other features like RSS are different. If there is no _RSS_XX, there
>>> are no attributes like max_rss_key_size, and there is not a default value.
>>> But for MQ, we know it has to be 1 without _MQ.
>> "we" = user space.
>> To keep the consistency among all the config space fields.
> Actually I looked at the code some more and I'm puzzled:
>
>
> 	struct virtio_net_config config = {};
> 	u64 features;
> 	u16 val_u16;
>
> 	vdpa_get_config_unlocked(vdev, 0, &config, sizeof(config));
>
> 	if (nla_put(msg, VDPA_ATTR_DEV_NET_CFG_MACADDR, sizeof(config.mac),
> 		    config.mac))
> 		return -EMSGSIZE;
>
>
> Mac returned even without VIRTIO_NET_F_MAC
>
>
> 	val_u16 = le16_to_cpu(config.status);
> 	if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_STATUS, val_u16))
> 		return -EMSGSIZE;
>
>
> status returned even without VIRTIO_NET_F_STATUS
>
> 	val_u16 = le16_to_cpu(config.mtu);
> 	if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MTU, val_u16))
> 		return -EMSGSIZE;
>
>
> MTU returned even without VIRTIO_NET_F_MTU
>
>
> What's going on here?
>
>
I guess this is a spec thing (historical debt); I vaguely recall these
fields are always present in config space regardless of the existence of
the corresponding feature bit.

-Siwei

^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [PATCH V3 3/6] vDPA: allow userspace to query features of a vDPA device
  2022-07-05 11:56           ` Parav Pandit
  2022-07-05 16:56             ` Zhu, Lingshan
@ 2022-07-27  8:15             ` Si-Wei Liu
  2022-07-27 11:38               ` Zhu, Lingshan
  1 sibling, 1 reply; 113+ messages in thread
From: Si-Wei Liu @ 2022-07-27  8:15 UTC (permalink / raw)
  To: Parav Pandit, Zhu, Lingshan, Jason Wang, mst
  Cc: netdev, xieyongji, gautam.dawar, virtualization



On 7/5/2022 4:56 AM, Parav Pandit via Virtualization wrote:
>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
>> Sent: Tuesday, July 5, 2022 3:59 AM
>>
>>
>> On 7/4/2022 8:53 PM, Parav Pandit wrote:
>>>> From: Jason Wang <jasowang@redhat.com>
>>>> Sent: Monday, July 4, 2022 12:47 AM
>>>>
>>>>
>>>> 在 2022/7/2 06:02, Parav Pandit 写道:
>>>>>> From: Zhu Lingshan <lingshan.zhu@intel.com>
>>>>>> Sent: Friday, July 1, 2022 9:28 AM
>>>>>>
>>>>>> This commit adds a new vDPA netlink attribution
>>>>>> VDPA_ATTR_VDPA_DEV_SUPPORTED_FEATURES. Userspace can query
>>>> features
>>>>>> of vDPA devices through this new attr.
>>>>>>
>>>>>> Fixes: a64917bc2e9b vdpa: (Provide interface to read driver
>>>>>> feature)
>>>>> Missing the "" in the line.
>>>>> I reviewed the patches again.
>>>>>
>>>>> However, this is not the fix.
>>>>> A fix cannot add a new UAPI.
>>>>>
>>>>> Code is already considering negotiated driver features to return the
>>>>> device
>>>> config space.
>>>>> Hence it is fine.
>>>>>
>>>>> This patch intents to provide device features to user space.
>>>>> First what vdpa device are capable of, are already returned by
>>>>> features
>>>> attribute on the management device.
>>>>> This is done in commit [1].
>>>>>
>>>>> The only reason to have it is, when one management device indicates
>>>>> that
>>>> feature is supported, but device may end up not supporting this
>>>> feature if such feature is shared with other devices on same physical device.
>>>>> For example all VFs may not be symmetric after large number of them
>>>>> are
>>>> in use. In such case features bit of management device can differ
>>>> (more
>>>> features) than the vdpa device of this VF.
>>>>> Hence, showing on the device is useful.
>>>>>
>>>>> As mentioned before in V2, commit [1] has wrongly named the
>>>>> attribute to
>>>> VDPA_ATTR_DEV_SUPPORTED_FEATURES.
>>>>> It should have been,
>>>> VDPA_ATTR_DEV_MGMTDEV_SUPPORTED_FEATURES.
>>>>> Because it is in UAPI, and since we don't want to break compilation
>>>>> of iproute2, It cannot be renamed anymore.
>>>>>
>>>>> Given that, we do not want to start trend of naming device
>>>>> attributes with
>>>> additional _VDPA_ to it as done in this patch.
>>>>> Error in commit [1] was exception.
>>>>>
>>>>> Hence, please reuse VDPA_ATTR_DEV_SUPPORTED_FEATURES to return
>>>> for device features too.
>>>>
>>>>
>>>> This will probably break or confuse the existing userspace?
>>>>
>>> It shouldn't break, because its new attribute on the device.
>>> All attributes are per command, so old one will not be confused either.
>> A netlink attr should has its own and unique purpose, that's why we don't need
>> locks for the attrs, only one consumer and only one producer.
>> I am afraid re-using (for both management device and the vDPA device) the attr
>> VDPA_ATTR_DEV_SUPPORTED_FEATURES would lead to new race condition.
>> E.g., There are possibilities of querying FEATURES of a management device and
>> a vDPA device simultaneously, or can there be a syncing issue in a tick?
> Both can be queried simultaneously. Each will return their own feature bits using same attribute.
> It wont lead to the race.
Agreed. Multiple userspace callers would do recv() calls on different
netlink sockets. It looks to me like this shouldn't involve any race.
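
To illustrate the point above, here is a minimal userspace model (not kernel code) of why one attribute ID can carry different meanings per command. Each reply is built into its own message, and the consumer picks the meaning from the command. The command and attribute numbers, `struct reply`, `build_reply()` and `meaning_of()` are all hypothetical names invented for this sketch; they are not the real vDPA UAPI.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical IDs for illustration only; not the real UAPI values. */
enum { CMD_MGMTDEV_GET = 1, CMD_DEV_CONFIG_GET = 2 };
enum { ATTR_SUPPORTED_FEATURES = 7 };

enum meaning { MEANS_UNKNOWN, MEANS_MGMTDEV_FEATURES, MEANS_DEVICE_FEATURES };

/* One reply message: owned by exactly one request/response exchange. */
struct reply {
	uint8_t  cmd;
	uint16_t attr_type;
	uint64_t attr_value;
};

/* Fill a reply that belongs to a single command; no shared state is touched. */
static void build_reply(struct reply *r, uint8_t cmd, uint64_t features)
{
	assert(r != NULL);
	r->cmd = cmd;
	r->attr_type = ATTR_SUPPORTED_FEATURES;
	r->attr_value = features;
}

/* The consumer derives the attribute's meaning from the command it sent. */
static enum meaning meaning_of(const struct reply *r)
{
	assert(r != NULL);
	if (r->attr_type != ATTR_SUPPORTED_FEATURES)
		return MEANS_UNKNOWN;
	if (r->cmd == CMD_MGMTDEV_GET)
		return MEANS_MGMTDEV_FEATURES;
	if (r->cmd == CMD_DEV_CONFIG_GET)
		return MEANS_DEVICE_FEATURES;
	return MEANS_UNKNOWN;
}
```

Two replies built this way never share storage, so querying both at once only yields two independent messages that happen to use the same attribute number.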
>
>> IMHO, I don't see any advantages of re-using this attr.
> We don’t want to continue this mess of VDPA_DEV prefix for new attributes due to previous wrong naming.
Well, you can say it's a mess, but since the attr name can be reused for
different commands, I didn't care that much while reviewing this.
Actually, it was initially named this way to show the device features in
"vdpa dev config ..." output, but later on it was moved to mgmtdev
to show the parent's capability.

-Siwei


^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [PATCH V3 5/6] vDPA: answer num of queue pairs = 1 to userspace when VIRTIO_NET_F_MQ == 0
  2022-07-27  7:50                               ` Si-Wei Liu
@ 2022-07-27  9:01                                 ` Michael S. Tsirkin
  2022-07-27 10:09                                   ` Si-Wei Liu
  0 siblings, 1 reply; 113+ messages in thread
From: Michael S. Tsirkin @ 2022-07-27  9:01 UTC (permalink / raw)
  To: Si-Wei Liu
  Cc: Parav Pandit, netdev, virtualization, xieyongji, gautam.dawar,
	Zhu, Lingshan

On Wed, Jul 27, 2022 at 12:50:33AM -0700, Si-Wei Liu wrote:
> 
> 
> On 7/26/2022 11:01 PM, Michael S. Tsirkin wrote:
> > On Wed, Jul 27, 2022 at 03:47:35AM +0000, Parav Pandit wrote:
> > > > From: Zhu, Lingshan <lingshan.zhu@intel.com>
> > > > Sent: Tuesday, July 26, 2022 10:53 PM
> > > > 
> > > > On 7/27/2022 10:17 AM, Parav Pandit wrote:
> > > > > > From: Zhu, Lingshan <lingshan.zhu@intel.com>
> > > > > > Sent: Tuesday, July 26, 2022 10:15 PM
> > > > > > 
> > > > > > On 7/26/2022 11:56 PM, Parav Pandit wrote:
> > > > > > > > From: Zhu, Lingshan <lingshan.zhu@intel.com>
> > > > > > > > Sent: Tuesday, July 12, 2022 11:46 PM
> > > > > > > > > When the user space which invokes netlink commands, detects that
> > > > > > _MQ
> > > > > > > > is not supported, hence it takes max_queue_pair = 1 by itself.
> > > > > > > > I think the kernel module have all necessary information and it is
> > > > > > > > the only one which have precise information of a device, so it
> > > > > > > > should answer precisely than let the user space guess. The kernel
> > > > > > > > module should be reliable than stay silent, leave the question to
> > > > > > > > the user space
> > > > > > tool.
> > > > > > > Kernel is reliable. It doesn’t expose a config space field if the
> > > > > > > field doesn’t
> > > > > > exist regardless of field should have default or no default.
> > > > > > so when you know it is one queue pair, you should answer one, not try
> > > > > > to guess.
> > > > > > > User space should not guess either. User space gets to see if _MQ
> > > > > > present/not present. If _MQ present than get reliable data from kernel.
> > > > > > > If _MQ not present, it means this device has one VQ pair.
> > > > > > it is still a guess, right? And all user space tools implemented this
> > > > > > feature need to guess
> > > > > No. it is not a guess.
> > > > > It is explicitly checking the _MQ feature and deriving the value.
> > > > > The code you proposed will be present in the user space.
> > > > > It will be uniform for _MQ and 10 other features that are present now and
> > > > in the future.
> > > > MQ and other features like RSS are different. If there is no _RSS_XX, there
> > > > are no attributes like max_rss_key_size, and there is not a default value.
> > > > But for MQ, we know it has to be 1 without _MQ.
> > > "we" = user space.
> > > To keep the consistency among all the config space fields.
> > Actually I looked at the code some more and I'm puzzled:
> > 
> > 
> > 	struct virtio_net_config config = {};
> > 	u64 features;
> > 	u16 val_u16;
> > 
> > 	vdpa_get_config_unlocked(vdev, 0, &config, sizeof(config));
> > 
> > 	if (nla_put(msg, VDPA_ATTR_DEV_NET_CFG_MACADDR, sizeof(config.mac),
> > 		    config.mac))
> > 		return -EMSGSIZE;
> > 
> > 
> > Mac returned even without VIRTIO_NET_F_MAC
> > 
> > 
> > 	val_u16 = le16_to_cpu(config.status);
> > 	if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_STATUS, val_u16))
> > 		return -EMSGSIZE;
> > 
> > 
> > status returned even without VIRTIO_NET_F_STATUS
> > 
> > 	val_u16 = le16_to_cpu(config.mtu);
> > 	if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MTU, val_u16))
> > 		return -EMSGSIZE;
> > 
> > 
> > MTU returned even without VIRTIO_NET_F_MTU
> > 
> > 
> > What's going on here?
> > 
> > 
> I guess this is a spec thing (historical debt); I vaguely recall these fields
> are always present in config space regardless of the existence of the
> corresponding feature bit.
> 
> -Siwei

Nope:

2.5.1  Driver Requirements: Device Configuration Space

...

For optional configuration space fields, the driver MUST check that the corresponding feature is offered
before accessing that part of the configuration space.
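
A minimal userspace sketch of what following this rule could look like when building a report: each optional field is copied only if its feature bit is offered. The feature bit numbers are the ones the virtio-net spec assigns; `struct net_config`, `struct net_report`, `feature_offered()` and `fill_report()` are hypothetical names for this illustration, not the code in vdpa.c. The queue-pair line reflects the behaviour proposed in this patch (report 1 without _MQ), which is still under discussion in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Feature bit numbers from the virtio-net specification. */
#define VIRTIO_NET_F_MTU     3
#define VIRTIO_NET_F_MAC     5
#define VIRTIO_NET_F_STATUS 16
#define VIRTIO_NET_F_MQ     22

/* Hypothetical, simplified config space in host byte order. */
struct net_config {
	uint8_t  mac[6];
	uint16_t status;
	uint16_t max_virtqueue_pairs;
	uint16_t mtu;
};

/* Hypothetical report: a field is valid only when its has_* flag is set. */
struct net_report {
	bool     has_mac, has_status, has_mtu;
	uint8_t  mac[6];
	uint16_t status;
	uint16_t mtu;
	uint16_t max_vq_pairs;
};

static bool feature_offered(uint64_t features, unsigned int bit)
{
	return (features >> bit) & 1;
}

/* Copy each optional field only when its feature bit is offered. */
static void fill_report(const struct net_config *cfg, uint64_t features,
			struct net_report *out)
{
	assert(cfg != NULL && out != NULL);
	memset(out, 0, sizeof(*out));

	if (feature_offered(features, VIRTIO_NET_F_MAC)) {
		memcpy(out->mac, cfg->mac, sizeof(out->mac));
		out->has_mac = true;
	}
	if (feature_offered(features, VIRTIO_NET_F_STATUS)) {
		out->status = cfg->status;
		out->has_status = true;
	}
	if (feature_offered(features, VIRTIO_NET_F_MTU)) {
		out->mtu = cfg->mtu;
		out->has_mtu = true;
	}
	/* Proposed behaviour: without _MQ, answer one queue pair. */
	out->max_vq_pairs = feature_offered(features, VIRTIO_NET_F_MQ) ?
			    cfg->max_virtqueue_pairs : 1;
}
```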


-- 
MST


^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [PATCH V3 5/6] vDPA: answer num of queue pairs = 1 to userspace when VIRTIO_NET_F_MQ == 0
  2022-07-27  6:54                               ` Jason Wang
@ 2022-07-27  9:02                                 ` Michael S. Tsirkin
  2022-07-27  9:50                                   ` Jason Wang
  0 siblings, 1 reply; 113+ messages in thread
From: Michael S. Tsirkin @ 2022-07-27  9:02 UTC (permalink / raw)
  To: Jason Wang
  Cc: Parav Pandit, Zhu, Lingshan, virtualization, netdev, xieyongji,
	gautam.dawar

On Wed, Jul 27, 2022 at 02:54:13PM +0800, Jason Wang wrote:
> On Wed, Jul 27, 2022 at 2:01 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > On Wed, Jul 27, 2022 at 03:47:35AM +0000, Parav Pandit wrote:
> > >
> > > > From: Zhu, Lingshan <lingshan.zhu@intel.com>
> > > > Sent: Tuesday, July 26, 2022 10:53 PM
> > > >
> > > > On 7/27/2022 10:17 AM, Parav Pandit wrote:
> > > > >> From: Zhu, Lingshan <lingshan.zhu@intel.com>
> > > > >> Sent: Tuesday, July 26, 2022 10:15 PM
> > > > >>
> > > > >> On 7/26/2022 11:56 PM, Parav Pandit wrote:
> > > > >>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
> > > > >>>> Sent: Tuesday, July 12, 2022 11:46 PM
> > > > >>>>> When the user space which invokes netlink commands, detects that
> > > > >> _MQ
> > > > >>>> is not supported, hence it takes max_queue_pair = 1 by itself.
> > > > >>>> I think the kernel module have all necessary information and it is
> > > > >>>> the only one which have precise information of a device, so it
> > > > >>>> should answer precisely than let the user space guess. The kernel
> > > > >>>> module should be reliable than stay silent, leave the question to
> > > > >>>> the user space
> > > > >> tool.
> > > > >>> Kernel is reliable. It doesn’t expose a config space field if the
> > > > >>> field doesn’t
> > > > >> exist regardless of field should have default or no default.
> > > > >> so when you know it is one queue pair, you should answer one, not try
> > > > >> to guess.
> > > > >>> User space should not guess either. User space gets to see if _MQ
> > > > >> present/not present. If _MQ present than get reliable data from kernel.
> > > > >>> If _MQ not present, it means this device has one VQ pair.
> > > > >> it is still a guess, right? And all user space tools implemented this
> > > > >> feature need to guess
> > > > > No. it is not a guess.
> > > > > It is explicitly checking the _MQ feature and deriving the value.
> > > > > The code you proposed will be present in the user space.
> > > > > It will be uniform for _MQ and 10 other features that are present now and
> > > > in the future.
> > > > MQ and other features like RSS are different. If there is no _RSS_XX, there
> > > > are no attributes like max_rss_key_size, and there is not a default value.
> > > > But for MQ, we know it has to be 1 without _MQ.
> > > "we" = user space.
> > > To keep the consistency among all the config space fields.
> >
> > Actually I looked at the code some more and I'm puzzled:
> >
> >
> >         struct virtio_net_config config = {};
> >         u64 features;
> >         u16 val_u16;
> >
> >         vdpa_get_config_unlocked(vdev, 0, &config, sizeof(config));
> >
> >         if (nla_put(msg, VDPA_ATTR_DEV_NET_CFG_MACADDR, sizeof(config.mac),
> >                     config.mac))
> >                 return -EMSGSIZE;
> >
> >
> > Mac returned even without VIRTIO_NET_F_MAC
> >
> >
> >         val_u16 = le16_to_cpu(config.status);
> >         if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_STATUS, val_u16))
> >                 return -EMSGSIZE;
> >
> >
> > status returned even without VIRTIO_NET_F_STATUS
> >
> >         val_u16 = le16_to_cpu(config.mtu);
> >         if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MTU, val_u16))
> >                 return -EMSGSIZE;
> >
> >
> > MTU returned even without VIRTIO_NET_F_MTU
> >
> >
> > What's going on here?
> 
> Probably too late to fix, but this should be fine as long as all
> parents support STATUS/MTU/MAC.

Why is this too late to fix?

> I wonder if we can add a check in the core and fail the device
> registration in this case.
> 
> Thanks
> 
> >
> >
> > --
> > MST
> >


^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [PATCH V3 5/6] vDPA: answer num of queue pairs = 1 to userspace when VIRTIO_NET_F_MQ == 0
  2022-07-27  6:56                                 ` Jason Wang
@ 2022-07-27  9:05                                   ` Michael S. Tsirkin
  0 siblings, 0 replies; 113+ messages in thread
From: Michael S. Tsirkin @ 2022-07-27  9:05 UTC (permalink / raw)
  To: Jason Wang
  Cc: Zhu, Lingshan, Parav Pandit, virtualization, netdev, xieyongji,
	gautam.dawar

On Wed, Jul 27, 2022 at 02:56:20PM +0800, Jason Wang wrote:
> On Wed, Jul 27, 2022 at 2:26 PM Zhu, Lingshan <lingshan.zhu@intel.com> wrote:
> >
> >
> >
> > On 7/27/2022 2:01 PM, Michael S. Tsirkin wrote:
> > > On Wed, Jul 27, 2022 at 03:47:35AM +0000, Parav Pandit wrote:
> > >>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
> > >>> Sent: Tuesday, July 26, 2022 10:53 PM
> > >>>
> > >>> On 7/27/2022 10:17 AM, Parav Pandit wrote:
> > >>>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
> > >>>>> Sent: Tuesday, July 26, 2022 10:15 PM
> > >>>>>
> > >>>>> On 7/26/2022 11:56 PM, Parav Pandit wrote:
> > >>>>>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
> > >>>>>>> Sent: Tuesday, July 12, 2022 11:46 PM
> > >>>>>>>> When the user space which invokes netlink commands, detects that
> > >>>>> _MQ
> > >>>>>>> is not supported, hence it takes max_queue_pair = 1 by itself.
> > >>>>>>> I think the kernel module have all necessary information and it is
> > >>>>>>> the only one which have precise information of a device, so it
> > >>>>>>> should answer precisely than let the user space guess. The kernel
> > >>>>>>> module should be reliable than stay silent, leave the question to
> > >>>>>>> the user space
> > >>>>> tool.
> > >>>>>> Kernel is reliable. It doesn’t expose a config space field if the
> > >>>>>> field doesn’t
> > >>>>> exist regardless of field should have default or no default.
> > >>>>> so when you know it is one queue pair, you should answer one, not try
> > >>>>> to guess.
> > >>>>>> User space should not guess either. User space gets to see if _MQ
> > >>>>> present/not present. If _MQ present than get reliable data from kernel.
> > >>>>>> If _MQ not present, it means this device has one VQ pair.
> > >>>>> it is still a guess, right? And all user space tools implemented this
> > >>>>> feature need to guess
> > >>>> No. it is not a guess.
> > >>>> It is explicitly checking the _MQ feature and deriving the value.
> > >>>> The code you proposed will be present in the user space.
> > >>>> It will be uniform for _MQ and 10 other features that are present now and
> > >>> in the future.
> > >>> MQ and other features like RSS are different. If there is no _RSS_XX, there
> > >>> are no attributes like max_rss_key_size, and there is not a default value.
> > >>> But for MQ, we know it has to be 1 without _MQ.
> > >> "we" = user space.
> > >> To keep the consistency among all the config space fields.
> > > Actually I looked at the code some more and I'm puzzled:
> > I can submit a fix in my next version patch for these issues.
> > >
> > >
> > >       struct virtio_net_config config = {};
> > >       u64 features;
> > >       u16 val_u16;
> > >
> > >       vdpa_get_config_unlocked(vdev, 0, &config, sizeof(config));
> > >
> > >       if (nla_put(msg, VDPA_ATTR_DEV_NET_CFG_MACADDR, sizeof(config.mac),
> > >                   config.mac))
> > >               return -EMSGSIZE;
> > >
> > >
> > > Mac returned even without VIRTIO_NET_F_MAC
> > if no VIRTIO_NET_F_MAC, we should not nla_put
> > VDPA_ATTR_DEV_NET_CFG_MACADDR, the spec says the driver should generate
> > a random mac.
> 
> It's probably too late to do this.

Not sure why.

> Most of the parents have this
> feature support, so probably not a real issue.

I guess not reporting MTU is not worse than failing initialization.

> > >
> > >
> > >       val_u16 = le16_to_cpu(config.status);
> > >       if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_STATUS, val_u16))
> > >               return -EMSGSIZE;
> > >
> > >
> > > status returned even without VIRTIO_NET_F_STATUS
> > if no VIRTIO_NET_F_STATUS, we should not nla_put
> > VDPA_ATTR_DEV_NET_STATUS, the spec says the driver should assume the
> > link is active.
> 
> Somehow similar to F_MAC. But we can report if F_MAC is not negotiated.
> 
> 
> > >
> > >       val_u16 = le16_to_cpu(config.mtu);
> > >       if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MTU, val_u16))
> > >               return -EMSGSIZE;
> > >
> > >
> > > MTU returned even without VIRTIO_NET_F_MTU
> > same as above, the spec says config.mtu depends on VIRTIO_NET_F_MTU, so
> > without this feature bit, we should not return MTU to the userspace.
> 
> Not a big issue, we just need to make sure the parent can report a
> correct MTU here.
> 
> Thanks
> 
> >
> > Do these fixes look good to you?
> >
> > And I think we may need your adjudication for the two issues:
> > (1) Shall we answer max_vq_pairs = 1 when _MQ does not exist? I know you have
> > agreed on this in a previous thread, but it's nice to clarify.
> > (2) I think we should not re-use the netlink attr to report feature bits
> > of both the management device and the vDPA device,
> > this can lead to a new race condition, there are no locks(especially
> > distributed locks for kernel_space and user_space) in the nla_put
> > functions. Re-using the attr is some kind of breaking the netlink
> > lockless design.
> >
> > Thanks,
> > Zhu Lingshan
> > >
> > >
> > > What's going on here?
> > >
> > >
> >


^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [PATCH V3 5/6] vDPA: answer num of queue pairs = 1 to userspace when VIRTIO_NET_F_MQ == 0
  2022-07-27  9:02                                 ` Michael S. Tsirkin
@ 2022-07-27  9:50                                   ` Jason Wang
  2022-07-27 15:45                                     ` Michael S. Tsirkin
  0 siblings, 1 reply; 113+ messages in thread
From: Jason Wang @ 2022-07-27  9:50 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Parav Pandit, Zhu, Lingshan, virtualization, netdev, xieyongji,
	gautam.dawar

On Wed, Jul 27, 2022 at 5:03 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Wed, Jul 27, 2022 at 02:54:13PM +0800, Jason Wang wrote:
> > On Wed, Jul 27, 2022 at 2:01 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > >
> > > On Wed, Jul 27, 2022 at 03:47:35AM +0000, Parav Pandit wrote:
> > > >
> > > > > From: Zhu, Lingshan <lingshan.zhu@intel.com>
> > > > > Sent: Tuesday, July 26, 2022 10:53 PM
> > > > >
> > > > > On 7/27/2022 10:17 AM, Parav Pandit wrote:
> > > > > >> From: Zhu, Lingshan <lingshan.zhu@intel.com>
> > > > > >> Sent: Tuesday, July 26, 2022 10:15 PM
> > > > > >>
> > > > > >> On 7/26/2022 11:56 PM, Parav Pandit wrote:
> > > > > >>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
> > > > > >>>> Sent: Tuesday, July 12, 2022 11:46 PM
> > > > > >>>>> When the user space which invokes netlink commands, detects that
> > > > > >> _MQ
> > > > > >>>> is not supported, hence it takes max_queue_pair = 1 by itself.
> > > > > >>>> I think the kernel module have all necessary information and it is
> > > > > >>>> the only one which have precise information of a device, so it
> > > > > >>>> should answer precisely than let the user space guess. The kernel
> > > > > >>>> module should be reliable than stay silent, leave the question to
> > > > > >>>> the user space
> > > > > >> tool.
> > > > > >>> Kernel is reliable. It doesn’t expose a config space field if the
> > > > > >>> field doesn’t
> > > > > >> exist regardless of field should have default or no default.
> > > > > >> so when you know it is one queue pair, you should answer one, not try
> > > > > >> to guess.
> > > > > >>> User space should not guess either. User space gets to see if _MQ
> > > > > >> present/not present. If _MQ present than get reliable data from kernel.
> > > > > >>> If _MQ not present, it means this device has one VQ pair.
> > > > > >> it is still a guess, right? And all user space tools implemented this
> > > > > >> feature need to guess
> > > > > > No. it is not a guess.
> > > > > > It is explicitly checking the _MQ feature and deriving the value.
> > > > > > The code you proposed will be present in the user space.
> > > > > > It will be uniform for _MQ and 10 other features that are present now and
> > > > > in the future.
> > > > > MQ and other features like RSS are different. If there is no _RSS_XX, there
> > > > > are no attributes like max_rss_key_size, and there is not a default value.
> > > > > But for MQ, we know it has to be 1 without _MQ.
> > > > "we" = user space.
> > > > To keep the consistency among all the config space fields.
> > >
> > > Actually I looked at the code some more and I'm puzzled:
> > >
> > >
> > >         struct virtio_net_config config = {};
> > >         u64 features;
> > >         u16 val_u16;
> > >
> > >         vdpa_get_config_unlocked(vdev, 0, &config, sizeof(config));
> > >
> > >         if (nla_put(msg, VDPA_ATTR_DEV_NET_CFG_MACADDR, sizeof(config.mac),
> > >                     config.mac))
> > >                 return -EMSGSIZE;
> > >
> > >
> > > Mac returned even without VIRTIO_NET_F_MAC
> > >
> > >
> > >         val_u16 = le16_to_cpu(config.status);
> > >         if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_STATUS, val_u16))
> > >                 return -EMSGSIZE;
> > >
> > >
> > > status returned even without VIRTIO_NET_F_STATUS
> > >
> > >         val_u16 = le16_to_cpu(config.mtu);
> > >         if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MTU, val_u16))
> > >                 return -EMSGSIZE;
> > >
> > >
> > > MTU returned even without VIRTIO_NET_F_MTU
> > >
> > >
> > > What's going on here?
> >
> > Probably too late to fix, but this should be fine as long as all
> > parents support STATUS/MTU/MAC.
>
> Why is this too late to fix?

If we make this conditional on the features, it may break
userspace that always expects VDPA_ATTR_DEV_NET_CFG_MTU?
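
For illustration, a userspace consumer that tolerates a missing attribute could look roughly like the sketch below: it looks the attribute up and falls back to a caller-chosen value when it is absent, instead of assuming it is always there. The attribute IDs, `struct attr`, `find_u16()` and `mtu_or_fallback()` are hypothetical names for this example and do not reflect how iproute2 or any existing tool parses the reply.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical attribute IDs for illustration; not the real UAPI values. */
enum { ATTR_MTU = 1, ATTR_STATUS = 2 };

/* Hypothetical decoded attribute: a type and a 16-bit payload. */
struct attr {
	uint16_t type;
	uint16_t value;
};

/* Return true and store the payload if an attribute of this type exists. */
static bool find_u16(const struct attr *attrs, size_t n, uint16_t type,
		     uint16_t *value)
{
	assert(value != NULL);
	for (size_t i = 0; i < n; i++) {
		if (attrs[i].type == type) {
			*value = attrs[i].value;
			return true;
		}
	}
	return false;
}

/* Tolerant consumer: a missing MTU attribute yields the caller's fallback. */
static uint16_t mtu_or_fallback(const struct attr *attrs, size_t n,
				uint16_t fallback)
{
	uint16_t mtu;

	return find_u16(attrs, n, ATTR_MTU, &mtu) ? mtu : fallback;
}
```

A consumer written the other way, reading the MTU attribute unconditionally, is the kind of userspace that making the attribute conditional could break.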

Thanks

>
> > I wonder if we can add a check in the core and fail the device
> > registration in this case.
> >
> > Thanks
> >
> > >
> > >
> > > --
> > > MST
> > >
>


^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [PATCH V3 5/6] vDPA: answer num of queue pairs = 1 to userspace when VIRTIO_NET_F_MQ == 0
  2022-07-27  9:01                                 ` Michael S. Tsirkin
@ 2022-07-27 10:09                                   ` Si-Wei Liu
  2022-07-27 11:54                                     ` Zhu, Lingshan
  2022-07-27 15:48                                     ` Michael S. Tsirkin
  0 siblings, 2 replies; 113+ messages in thread
From: Si-Wei Liu @ 2022-07-27 10:09 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Parav Pandit, netdev, virtualization, xieyongji, gautam.dawar,
	Zhu, Lingshan



On 7/27/2022 2:01 AM, Michael S. Tsirkin wrote:
> On Wed, Jul 27, 2022 at 12:50:33AM -0700, Si-Wei Liu wrote:
>>
>> On 7/26/2022 11:01 PM, Michael S. Tsirkin wrote:
>>> On Wed, Jul 27, 2022 at 03:47:35AM +0000, Parav Pandit wrote:
>>>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
>>>>> Sent: Tuesday, July 26, 2022 10:53 PM
>>>>>
>>>>> On 7/27/2022 10:17 AM, Parav Pandit wrote:
>>>>>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
>>>>>>> Sent: Tuesday, July 26, 2022 10:15 PM
>>>>>>>
>>>>>>> On 7/26/2022 11:56 PM, Parav Pandit wrote:
>>>>>>>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
>>>>>>>>> Sent: Tuesday, July 12, 2022 11:46 PM
>>>>>>>>>> When the user space which invokes netlink commands, detects that
>>>>>>> _MQ
>>>>>>>>> is not supported, hence it takes max_queue_pair = 1 by itself.
>>>>>>>>> I think the kernel module have all necessary information and it is
>>>>>>>>> the only one which have precise information of a device, so it
>>>>>>>>> should answer precisely than let the user space guess. The kernel
>>>>>>>>> module should be reliable than stay silent, leave the question to
>>>>>>>>> the user space
>>>>>>> tool.
>>>>>>>> Kernel is reliable. It doesn’t expose a config space field if the
>>>>>>>> field doesn’t
>>>>>>> exist regardless of field should have default or no default.
>>>>>>> so when you know it is one queue pair, you should answer one, not try
>>>>>>> to guess.
>>>>>>>> User space should not guess either. User space gets to see if _MQ
>>>>>>> present/not present. If _MQ present than get reliable data from kernel.
>>>>>>>> If _MQ not present, it means this device has one VQ pair.
>>>>>>> it is still a guess, right? And all user space tools implemented this
>>>>>>> feature need to guess
>>>>>> No. it is not a guess.
>>>>>> It is explicitly checking the _MQ feature and deriving the value.
>>>>>> The code you proposed will be present in the user space.
>>>>>> It will be uniform for _MQ and 10 other features that are present now and
>>>>> in the future.
>>>>> MQ and other features like RSS are different. If there is no _RSS_XX, there
>>>>> are no attributes like max_rss_key_size, and there is not a default value.
>>>>> But for MQ, we know it has to be 1 wihtout _MQ.
>>>> "we" = user space.
>>>> To keep the consistency among all the config space fields.
>>> Actually I looked and the code some more and I'm puzzled:
>>>
>>>
>>> 	struct virtio_net_config config = {};
>>> 	u64 features;
>>> 	u16 val_u16;
>>>
>>> 	vdpa_get_config_unlocked(vdev, 0, &config, sizeof(config));
>>>
>>> 	if (nla_put(msg, VDPA_ATTR_DEV_NET_CFG_MACADDR, sizeof(config.mac),
>>> 		    config.mac))
>>> 		return -EMSGSIZE;
>>>
>>>
>>> Mac returned even without VIRTIO_NET_F_MAC
>>>
>>>
>>> 	val_u16 = le16_to_cpu(config.status);
>>> 	if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_STATUS, val_u16))
>>> 		return -EMSGSIZE;
>>>
>>>
>>> status returned even without VIRTIO_NET_F_STATUS
>>>
>>> 	val_u16 = le16_to_cpu(config.mtu);
>>> 	if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MTU, val_u16))
>>> 		return -EMSGSIZE;
>>>
>>>
>>> MTU returned even without VIRTIO_NET_F_MTU
>>>
>>>
>>> What's going on here?
>>>
>>>
>> I guess this is spec thing (historical debt), I vaguely recall these fields
>> are always present in config space regardless the existence of corresponding
>> feature bit.
>>
>> -Siwei
> Nope:
>
> 2.5.1  Driver Requirements: Device Configuration Space
>
> ...
>
> For optional configuration space fields, the driver MUST check that the corresponding feature is offered
> before accessing that part of the configuration space.
Well, this is a driver-side requirement. As this interface is for a host
admin tool to query or configure a vdpa device, we don't have to wait
until feature negotiation is done by the guest driver to extract vdpa
attributes/parameters, say if we want to replicate another vdpa device
with the same config on the migration destination. I think what may need
to be fixed is to move off from using vdpa_get_config_unlocked(), which
depends on feature negotiation, and/or to expose config space register
values through another set of attributes.

-Siwei





^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [PATCH V3 3/6] vDPA: allow userspace to query features of a vDPA device
  2022-07-27  8:15             ` Si-Wei Liu
@ 2022-07-27 11:38               ` Zhu, Lingshan
  0 siblings, 0 replies; 113+ messages in thread
From: Zhu, Lingshan @ 2022-07-27 11:38 UTC (permalink / raw)
  To: Si-Wei Liu, Parav Pandit, Jason Wang, mst
  Cc: netdev, xieyongji, gautam.dawar, virtualization



On 7/27/2022 4:15 PM, Si-Wei Liu wrote:
>
>
> On 7/5/2022 4:56 AM, Parav Pandit via Virtualization wrote:
>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
>>> Sent: Tuesday, July 5, 2022 3:59 AM
>>>
>>>
>>> On 7/4/2022 8:53 PM, Parav Pandit wrote:
>>>>> From: Jason Wang <jasowang@redhat.com>
>>>>> Sent: Monday, July 4, 2022 12:47 AM
>>>>>
>>>>>
>>>>> 在 2022/7/2 06:02, Parav Pandit 写道:
>>>>>>> From: Zhu Lingshan <lingshan.zhu@intel.com>
>>>>>>> Sent: Friday, July 1, 2022 9:28 AM
>>>>>>>
>>>>>>> This commit adds a new vDPA netlink attribution
>>>>>>> VDPA_ATTR_VDPA_DEV_SUPPORTED_FEATURES. Userspace can query
>>>>> features
>>>>>>> of vDPA devices through this new attr.
>>>>>>>
>>>>>>> Fixes: a64917bc2e9b vdpa: (Provide interface to read driver
>>>>>>> feature)
>>>>>> Missing the "" in the line.
>>>>>> I reviewed the patches again.
>>>>>>
>>>>>> However, this is not the fix.
>>>>>> A fix cannot add a new UAPI.
>>>>>>
>>>>>> Code is already considering negotiated driver features to return the
>>>>>> device
>>>>> config space.
>>>>>> Hence it is fine.
>>>>>>
>>>>>> This patch intents to provide device features to user space.
>>>>>> First what vdpa device are capable of, are already returned by
>>>>>> features
>>>>> attribute on the management device.
>>>>>> This is done in commit [1].
>>>>>>
>>>>>> The only reason to have it is, when one management device indicates
>>>>>> that
>>>>> feature is supported, but device may end up not supporting this
>>>>> feature if such feature is shared with other devices on same 
>>>>> physical device.
>>>>>> For example all VFs may not be symmetric after large number of them
>>>>>> are
>>>>> in use. In such case features bit of management device can differ
>>>>> (more
>>>>> features) than the vdpa device of this VF.
>>>>>> Hence, showing on the device is useful.
>>>>>>
>>>>>> As mentioned before in V2, commit [1] has wrongly named the
>>>>>> attribute to
>>>>> VDPA_ATTR_DEV_SUPPORTED_FEATURES.
>>>>>> It should have been,
>>>>> VDPA_ATTR_DEV_MGMTDEV_SUPPORTED_FEATURES.
>>>>>> Because it is in UAPI, and since we don't want to break compilation
>>>>>> of iproute2, It cannot be renamed anymore.
>>>>>>
>>>>>> Given that, we do not want to start trend of naming device
>>>>>> attributes with
>>>>> additional _VDPA_ to it as done in this patch.
>>>>>> Error in commit [1] was exception.
>>>>>>
>>>>>> Hence, please reuse VDPA_ATTR_DEV_SUPPORTED_FEATURES to return
>>>>> for device features too.
>>>>>
>>>>>
>>>>> This will probably break or confuse the existing userspace?
>>>>>
>>>> It shouldn't break, because its new attribute on the device.
>>>> All attributes are per command, so old one will not be confused 
>>>> either.
>>> A netlink attr should has its own and unique purpose, that's why we 
>>> don't need
>>> locks for the attrs, only one consumer and only one producer.
>>> I am afraid re-using (for both management device and the vDPA 
>>> device) the attr
>>> VDPA_ATTR_DEV_SUPPORTED_FEATURES would lead to new race condition.
>>> E.g., There are possibilities of querying FEATURES of a management 
>>> device and
>>> a vDPA device simultaneously, or can there be a syncing issue in a 
>>> tick?
>> Both can be queried simultaneously. Each will return their own 
>> feature bits using same attribute.
>> It wont lead to the race.
> Agreed. Multiple userspace callers would do recv() calls on different 
> netlink sockets. Looks to me shouldn't involve any race.
Oh yes, thanks for pointing this out: they are on different sockets
belonging to different userspace programs.
>>
>>> IMHO, I don't see any advantages of re-using this attr.
>> We don’t want to continue this mess of VDPA_DEV prefix for new 
>> attributes due to previous wrong naming.
> Well, you can say it's a mess but since the attr name can be reused 
> for different command,  I didn't care that much while reviewing this. 
> Actually, it was initially named this way to show the device features 
> in "vdpa dev config ..." output, but later on it had been moved to 
> mgmtdev to show parent's capability.
Yes, there is a buggy commit, but we cannot change it now because we
must not break the current uAPI. So I think it is better to add a new
attr; I see no benefit in reusing the existing one.
>
> -Siwei
>> _______________________________________________
>> Virtualization mailing list
>> Virtualization@lists.linux-foundation.org
>> https://lists.linuxfoundation.org/mailman/listinfo/virtualization
>


^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [PATCH V3 5/6] vDPA: answer num of queue pairs = 1 to userspace when VIRTIO_NET_F_MQ == 0
  2022-07-27 10:09                                   ` Si-Wei Liu
@ 2022-07-27 11:54                                     ` Zhu, Lingshan
  2022-07-28  1:41                                       ` Si-Wei Liu
  2022-07-27 15:48                                     ` Michael S. Tsirkin
  1 sibling, 1 reply; 113+ messages in thread
From: Zhu, Lingshan @ 2022-07-27 11:54 UTC (permalink / raw)
  To: Si-Wei Liu, Michael S. Tsirkin
  Cc: Parav Pandit, netdev, virtualization, xieyongji, gautam.dawar



On 7/27/2022 6:09 PM, Si-Wei Liu wrote:
>
>
> On 7/27/2022 2:01 AM, Michael S. Tsirkin wrote:
>> On Wed, Jul 27, 2022 at 12:50:33AM -0700, Si-Wei Liu wrote:
>>>
>>> On 7/26/2022 11:01 PM, Michael S. Tsirkin wrote:
>>>> On Wed, Jul 27, 2022 at 03:47:35AM +0000, Parav Pandit wrote:
>>>>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
>>>>>> Sent: Tuesday, July 26, 2022 10:53 PM
>>>>>>
>>>>>> On 7/27/2022 10:17 AM, Parav Pandit wrote:
>>>>>>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
>>>>>>>> Sent: Tuesday, July 26, 2022 10:15 PM
>>>>>>>>
>>>>>>>> On 7/26/2022 11:56 PM, Parav Pandit wrote:
>>>>>>>>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
>>>>>>>>>> Sent: Tuesday, July 12, 2022 11:46 PM
>>>>>>>>>>> When the user space which invokes netlink commands, detects 
>>>>>>>>>>> that
>>>>>>>> _MQ
>>>>>>>>>> is not supported, hence it takes max_queue_pair = 1 by itself.
>>>>>>>>>> I think the kernel module have all necessary information and 
>>>>>>>>>> it is
>>>>>>>>>> the only one which have precise information of a device, so it
>>>>>>>>>> should answer precisely than let the user space guess. The 
>>>>>>>>>> kernel
>>>>>>>>>> module should be reliable than stay silent, leave the 
>>>>>>>>>> question to
>>>>>>>>>> the user space
>>>>>>>> tool.
>>>>>>>>> Kernel is reliable. It doesn’t expose a config space field if the
>>>>>>>>> field doesn’t
>>>>>>>> exist regardless of field should have default or no default.
>>>>>>>> so when you know it is one queue pair, you should answer one, 
>>>>>>>> not try
>>>>>>>> to guess.
>>>>>>>>> User space should not guess either. User space gets to see if _MQ
>>>>>>>> present/not present. If _MQ present than get reliable data from 
>>>>>>>> kernel.
>>>>>>>>> If _MQ not present, it means this device has one VQ pair.
>>>>>>>> it is still a guess, right? And all user space tools 
>>>>>>>> implemented this
>>>>>>>> feature need to guess
>>>>>>> No. it is not a guess.
>>>>>>> It is explicitly checking the _MQ feature and deriving the value.
>>>>>>> The code you proposed will be present in the user space.
>>>>>>> It will be uniform for _MQ and 10 other features that are 
>>>>>>> present now and
>>>>>> in the future.
>>>>>> MQ and other features like RSS are different. If there is no 
>>>>>> _RSS_XX, there
>>>>>> are no attributes like max_rss_key_size, and there is not a 
>>>>>> default value.
>>>>>> But for MQ, we know it has to be 1 wihtout _MQ.
>>>>> "we" = user space.
>>>>> To keep the consistency among all the config space fields.
>>>> Actually I looked and the code some more and I'm puzzled:
>>>>
>>>>
>>>>     struct virtio_net_config config = {};
>>>>     u64 features;
>>>>     u16 val_u16;
>>>>
>>>>     vdpa_get_config_unlocked(vdev, 0, &config, sizeof(config));
>>>>
>>>>     if (nla_put(msg, VDPA_ATTR_DEV_NET_CFG_MACADDR, 
>>>> sizeof(config.mac),
>>>>             config.mac))
>>>>         return -EMSGSIZE;
>>>>
>>>>
>>>> Mac returned even without VIRTIO_NET_F_MAC
>>>>
>>>>
>>>>     val_u16 = le16_to_cpu(config.status);
>>>>     if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_STATUS, val_u16))
>>>>         return -EMSGSIZE;
>>>>
>>>>
>>>> status returned even without VIRTIO_NET_F_STATUS
>>>>
>>>>     val_u16 = le16_to_cpu(config.mtu);
>>>>     if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MTU, val_u16))
>>>>         return -EMSGSIZE;
>>>>
>>>>
>>>> MTU returned even without VIRTIO_NET_F_MTU
>>>>
>>>>
>>>> What's going on here?
>>>>
>>>>
>>> I guess this is spec thing (historical debt), I vaguely recall these 
>>> fields
>>> are always present in config space regardless the existence of 
>>> corresponding
>>> feature bit.
>>>
>>> -Siwei
>> Nope:
>>
>> 2.5.1  Driver Requirements: Device Configuration Space
>>
>> ...
>>
>> For optional configuration space fields, the driver MUST check that 
>> the corresponding feature is offered
>> before accessing that part of the configuration space.
> Well, this is driver side of requirement. As this interface is for 
> host admin tool to query or configure vdpa device, we don't have to 
> wait until feature negotiation is done on guest driver to extract vdpa 
> attributes/parameters, say if we want to replicate another vdpa device 
> with the same config on migration destination. I think what may need 
> to be fix is to move off from using .vdpa_get_config_unlocked() which 
> depends on feature negotiation. And/or expose config space register 
> values through another set of attributes.
Yes, we don't have to wait for FEATURES_OK. In another patch in this
series, I have added a new netlink attr to report the device features
and removed the blocker. So the live migration (LM) orchestration
software can query the features of the devices at the destination
cluster and pick a proper one, or even mask out some features to meet
the LM requirements.
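
A minimal sketch of that selection step, under stated assumptions:
struct candidate, pick_destination() and mask_features() are
hypothetical names for illustration only (they are not part of the
kernel or iproute2), and the feature words are assumed to have already
been read from the devices through netlink.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record for one candidate vDPA device at the destination. */
struct candidate {
	const char *name;
	uint64_t device_features;	/* feature bits the device offers */
};

/*
 * Return the first candidate whose offered features are a superset of
 * the features the migrating guest requires, or NULL if none qualifies.
 */
static const struct candidate *
pick_destination(const struct candidate *c, size_t n, uint64_t required)
{
	for (size_t i = 0; i < n; i++) {
		if ((c[i].device_features & required) == required)
			return &c[i];
	}
	return NULL;
}

/*
 * Mask the destination's features down to what the source device had,
 * so the guest never sees a feature that was absent before migration.
 */
static uint64_t mask_features(uint64_t dest_features, uint64_t source_features)
{
	return dest_features & source_features;
}
```

The superset test is the conservative choice here: a destination that
lacks any required bit is skipped rather than partially matched.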

Thanks,
Zhu Lingshan
> -Siwei
>
>
>
>


^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [PATCH V3 5/6] vDPA: answer num of queue pairs = 1 to userspace when VIRTIO_NET_F_MQ == 0
  2022-07-27  9:50                                   ` Jason Wang
@ 2022-07-27 15:45                                     ` Michael S. Tsirkin
  2022-07-28  1:21                                       ` Jason Wang
  0 siblings, 1 reply; 113+ messages in thread
From: Michael S. Tsirkin @ 2022-07-27 15:45 UTC (permalink / raw)
  To: Jason Wang
  Cc: Parav Pandit, Zhu, Lingshan, virtualization, netdev, xieyongji,
	gautam.dawar

On Wed, Jul 27, 2022 at 05:50:59PM +0800, Jason Wang wrote:
> On Wed, Jul 27, 2022 at 5:03 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > On Wed, Jul 27, 2022 at 02:54:13PM +0800, Jason Wang wrote:
> > > On Wed, Jul 27, 2022 at 2:01 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > >
> > > > On Wed, Jul 27, 2022 at 03:47:35AM +0000, Parav Pandit wrote:
> > > > >
> > > > > > From: Zhu, Lingshan <lingshan.zhu@intel.com>
> > > > > > Sent: Tuesday, July 26, 2022 10:53 PM
> > > > > >
> > > > > > On 7/27/2022 10:17 AM, Parav Pandit wrote:
> > > > > > >> From: Zhu, Lingshan <lingshan.zhu@intel.com>
> > > > > > >> Sent: Tuesday, July 26, 2022 10:15 PM
> > > > > > >>
> > > > > > >> On 7/26/2022 11:56 PM, Parav Pandit wrote:
> > > > > > >>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
> > > > > > >>>> Sent: Tuesday, July 12, 2022 11:46 PM
> > > > > > >>>>> When the user space which invokes netlink commands, detects that
> > > > > > >> _MQ
> > > > > > >>>> is not supported, hence it takes max_queue_pair = 1 by itself.
> > > > > > >>>> I think the kernel module have all necessary information and it is
> > > > > > >>>> the only one which have precise information of a device, so it
> > > > > > >>>> should answer precisely than let the user space guess. The kernel
> > > > > > >>>> module should be reliable than stay silent, leave the question to
> > > > > > >>>> the user space
> > > > > > >> tool.
> > > > > > >>> Kernel is reliable. It doesn’t expose a config space field if the
> > > > > > >>> field doesn’t
> > > > > > >> exist regardless of field should have default or no default.
> > > > > > >> so when you know it is one queue pair, you should answer one, not try
> > > > > > >> to guess.
> > > > > > >>> User space should not guess either. User space gets to see if _MQ
> > > > > > >> present/not present. If _MQ present than get reliable data from kernel.
> > > > > > >>> If _MQ not present, it means this device has one VQ pair.
> > > > > > >> it is still a guess, right? And all user space tools implemented this
> > > > > > >> feature need to guess
> > > > > > > No. it is not a guess.
> > > > > > > It is explicitly checking the _MQ feature and deriving the value.
> > > > > > > The code you proposed will be present in the user space.
> > > > > > > It will be uniform for _MQ and 10 other features that are present now and
> > > > > > in the future.
> > > > > > MQ and other features like RSS are different. If there is no _RSS_XX, there
> > > > > > are no attributes like max_rss_key_size, and there is not a default value.
> > > > > > But for MQ, we know it has to be 1 wihtout _MQ.
> > > > > "we" = user space.
> > > > > To keep the consistency among all the config space fields.
> > > >
> > > > Actually I looked and the code some more and I'm puzzled:
> > > >
> > > >
> > > >         struct virtio_net_config config = {};
> > > >         u64 features;
> > > >         u16 val_u16;
> > > >
> > > >         vdpa_get_config_unlocked(vdev, 0, &config, sizeof(config));
> > > >
> > > >         if (nla_put(msg, VDPA_ATTR_DEV_NET_CFG_MACADDR, sizeof(config.mac),
> > > >                     config.mac))
> > > >                 return -EMSGSIZE;
> > > >
> > > >
> > > > Mac returned even without VIRTIO_NET_F_MAC
> > > >
> > > >
> > > >         val_u16 = le16_to_cpu(config.status);
> > > >         if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_STATUS, val_u16))
> > > >                 return -EMSGSIZE;
> > > >
> > > >
> > > > status returned even without VIRTIO_NET_F_STATUS
> > > >
> > > >         val_u16 = le16_to_cpu(config.mtu);
> > > >         if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MTU, val_u16))
> > > >                 return -EMSGSIZE;
> > > >
> > > >
> > > > MTU returned even without VIRTIO_NET_F_MTU
> > > >
> > > >
> > > > What's going on here?
> > >
> > > Probably too late to fix, but this should be fine as long as all
> > > parents support STATUS/MTU/MAC.
> >
> > Why is this too late to fix.
> 
> If we make this conditional on the features. This may break the
> userspace that always expects VDPA_ATTR_DEV_NET_CFG_MTU?
> 
> Thanks

Well, only on devices without MTU. I'm saying said userspace
was reading trash on such devices anyway.
We don't generally maintain bug-for-bug compatibility on a whim,
only if userspace is actually known to break if we fix a bug.


> >
> > > I wonder if we can add a check in the core and fail the device
> > > registration in this case.
> > >
> > > Thanks
> > >
> > > >
> > > >
> > > > --
> > > > MST
> > > >
> >


^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [PATCH V3 5/6] vDPA: answer num of queue pairs = 1 to userspace when VIRTIO_NET_F_MQ == 0
  2022-07-27 10:09                                   ` Si-Wei Liu
  2022-07-27 11:54                                     ` Zhu, Lingshan
@ 2022-07-27 15:48                                     ` Michael S. Tsirkin
  1 sibling, 0 replies; 113+ messages in thread
From: Michael S. Tsirkin @ 2022-07-27 15:48 UTC (permalink / raw)
  To: Si-Wei Liu
  Cc: Parav Pandit, netdev, virtualization, xieyongji, gautam.dawar,
	Zhu, Lingshan

On Wed, Jul 27, 2022 at 03:09:43AM -0700, Si-Wei Liu wrote:
> 
> 
> On 7/27/2022 2:01 AM, Michael S. Tsirkin wrote:
> > On Wed, Jul 27, 2022 at 12:50:33AM -0700, Si-Wei Liu wrote:
> > > 
> > > On 7/26/2022 11:01 PM, Michael S. Tsirkin wrote:
> > > > On Wed, Jul 27, 2022 at 03:47:35AM +0000, Parav Pandit wrote:
> > > > > > From: Zhu, Lingshan <lingshan.zhu@intel.com>
> > > > > > Sent: Tuesday, July 26, 2022 10:53 PM
> > > > > > 
> > > > > > On 7/27/2022 10:17 AM, Parav Pandit wrote:
> > > > > > > > From: Zhu, Lingshan <lingshan.zhu@intel.com>
> > > > > > > > Sent: Tuesday, July 26, 2022 10:15 PM
> > > > > > > > 
> > > > > > > > On 7/26/2022 11:56 PM, Parav Pandit wrote:
> > > > > > > > > > From: Zhu, Lingshan <lingshan.zhu@intel.com>
> > > > > > > > > > Sent: Tuesday, July 12, 2022 11:46 PM
> > > > > > > > > > > When the user space which invokes netlink commands, detects that
> > > > > > > > _MQ
> > > > > > > > > > is not supported, hence it takes max_queue_pair = 1 by itself.
> > > > > > > > > > I think the kernel module have all necessary information and it is
> > > > > > > > > > the only one which have precise information of a device, so it
> > > > > > > > > > should answer precisely than let the user space guess. The kernel
> > > > > > > > > > module should be reliable than stay silent, leave the question to
> > > > > > > > > > the user space
> > > > > > > > tool.
> > > > > > > > > Kernel is reliable. It doesn’t expose a config space field if the
> > > > > > > > > field doesn’t
> > > > > > > > exist regardless of field should have default or no default.
> > > > > > > > so when you know it is one queue pair, you should answer one, not try
> > > > > > > > to guess.
> > > > > > > > > User space should not guess either. User space gets to see if _MQ
> > > > > > > > present/not present. If _MQ present than get reliable data from kernel.
> > > > > > > > > If _MQ not present, it means this device has one VQ pair.
> > > > > > > > it is still a guess, right? And all user space tools implemented this
> > > > > > > > feature need to guess
> > > > > > > No. it is not a guess.
> > > > > > > It is explicitly checking the _MQ feature and deriving the value.
> > > > > > > The code you proposed will be present in the user space.
> > > > > > > It will be uniform for _MQ and 10 other features that are present now and
> > > > > > in the future.
> > > > > > MQ and other features like RSS are different. If there is no _RSS_XX, there
> > > > > > are no attributes like max_rss_key_size, and there is not a default value.
> > > > > > But for MQ, we know it has to be 1 wihtout _MQ.
> > > > > "we" = user space.
> > > > > To keep the consistency among all the config space fields.
> > > > Actually I looked and the code some more and I'm puzzled:
> > > > 
> > > > 
> > > > 	struct virtio_net_config config = {};
> > > > 	u64 features;
> > > > 	u16 val_u16;
> > > > 
> > > > 	vdpa_get_config_unlocked(vdev, 0, &config, sizeof(config));
> > > > 
> > > > 	if (nla_put(msg, VDPA_ATTR_DEV_NET_CFG_MACADDR, sizeof(config.mac),
> > > > 		    config.mac))
> > > > 		return -EMSGSIZE;
> > > > 
> > > > 
> > > > Mac returned even without VIRTIO_NET_F_MAC
> > > > 
> > > > 
> > > > 	val_u16 = le16_to_cpu(config.status);
> > > > 	if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_STATUS, val_u16))
> > > > 		return -EMSGSIZE;
> > > > 
> > > > 
> > > > status returned even without VIRTIO_NET_F_STATUS
> > > > 
> > > > 	val_u16 = le16_to_cpu(config.mtu);
> > > > 	if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MTU, val_u16))
> > > > 		return -EMSGSIZE;
> > > > 
> > > > 
> > > > MTU returned even without VIRTIO_NET_F_MTU
> > > > 
> > > > 
> > > > What's going on here?
> > > > 
> > > > 
> > > I guess this is spec thing (historical debt), I vaguely recall these fields
> > > are always present in config space regardless the existence of corresponding
> > > feature bit.
> > > 
> > > -Siwei
> > Nope:
> > 
> > 2.5.1  Driver Requirements: Device Configuration Space
> > 
> > ...
> > 
> > For optional configuration space fields, the driver MUST check that the corresponding feature is offered
> > before accessing that part of the configuration space.
> Well, this is driver side of requirement.


Well, the driver and the device are the only two entities in the spec.

> As this interface is for host
> admin tool to query or configure vdpa device, we don't have to wait until
> feature negotiation is done on guest driver to extract vdpa
> attributes/parameters, say if we want to replicate another vdpa device with
> the same config on migration destination. I think what may need to be fix is
> to move off from using .vdpa_get_config_unlocked() which depends on feature
> negotiation. And/or expose config space register values through another set
> of attributes.
> 
> -Siwei
> 
> 

Sounds like something that might use the proposed admin queue maybe.
Hope that makes progress ...


-- 
MST


^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [PATCH V3 5/6] vDPA: answer num of queue pairs = 1 to userspace when VIRTIO_NET_F_MQ == 0
  2022-07-27 15:45                                     ` Michael S. Tsirkin
@ 2022-07-28  1:21                                       ` Jason Wang
  2022-07-28  3:46                                         ` Zhu, Lingshan
  0 siblings, 1 reply; 113+ messages in thread
From: Jason Wang @ 2022-07-28  1:21 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Parav Pandit, Zhu, Lingshan, virtualization, netdev, xieyongji,
	gautam.dawar

On Wed, Jul 27, 2022 at 11:45 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Wed, Jul 27, 2022 at 05:50:59PM +0800, Jason Wang wrote:
> > On Wed, Jul 27, 2022 at 5:03 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > >
> > > On Wed, Jul 27, 2022 at 02:54:13PM +0800, Jason Wang wrote:
> > > > On Wed, Jul 27, 2022 at 2:01 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > >
> > > > > On Wed, Jul 27, 2022 at 03:47:35AM +0000, Parav Pandit wrote:
> > > > > >
> > > > > > > From: Zhu, Lingshan <lingshan.zhu@intel.com>
> > > > > > > Sent: Tuesday, July 26, 2022 10:53 PM
> > > > > > >
> > > > > > > On 7/27/2022 10:17 AM, Parav Pandit wrote:
> > > > > > > >> From: Zhu, Lingshan <lingshan.zhu@intel.com>
> > > > > > > >> Sent: Tuesday, July 26, 2022 10:15 PM
> > > > > > > >>
> > > > > > > >> On 7/26/2022 11:56 PM, Parav Pandit wrote:
> > > > > > > >>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
> > > > > > > >>>> Sent: Tuesday, July 12, 2022 11:46 PM
> > > > > > > >>>>> When the user space which invokes netlink commands, detects that
> > > > > > > >> _MQ
> > > > > > > >>>> is not supported, hence it takes max_queue_pair = 1 by itself.
> > > > > > > >>>> I think the kernel module have all necessary information and it is
> > > > > > > >>>> the only one which have precise information of a device, so it
> > > > > > > >>>> should answer precisely than let the user space guess. The kernel
> > > > > > > >>>> module should be reliable than stay silent, leave the question to
> > > > > > > >>>> the user space
> > > > > > > >> tool.
> > > > > > > >>> Kernel is reliable. It doesn’t expose a config space field if the
> > > > > > > >>> field doesn’t
> > > > > > > >> exist regardless of field should have default or no default.
> > > > > > > >> so when you know it is one queue pair, you should answer one, not try
> > > > > > > >> to guess.
> > > > > > > >>> User space should not guess either. User space gets to see if _MQ
> > > > > > > >> present/not present. If _MQ present than get reliable data from kernel.
> > > > > > > >>> If _MQ not present, it means this device has one VQ pair.
> > > > > > > >> it is still a guess, right? And all user space tools implemented this
> > > > > > > >> feature need to guess
> > > > > > > > No. it is not a guess.
> > > > > > > > It is explicitly checking the _MQ feature and deriving the value.
> > > > > > > > The code you proposed will be present in the user space.
> > > > > > > > It will be uniform for _MQ and 10 other features that are present now and
> > > > > > > in the future.
> > > > > > > MQ and other features like RSS are different. If there is no _RSS_XX, there
> > > > > > > are no attributes like max_rss_key_size, and there is not a default value.
> > > > > > > But for MQ, we know it has to be 1 wihtout _MQ.
> > > > > > "we" = user space.
> > > > > > To keep the consistency among all the config space fields.
> > > > >
> > > > > Actually I looked and the code some more and I'm puzzled:
> > > > >
> > > > >
> > > > >         struct virtio_net_config config = {};
> > > > >         u64 features;
> > > > >         u16 val_u16;
> > > > >
> > > > >         vdpa_get_config_unlocked(vdev, 0, &config, sizeof(config));
> > > > >
> > > > >         if (nla_put(msg, VDPA_ATTR_DEV_NET_CFG_MACADDR, sizeof(config.mac),
> > > > >                     config.mac))
> > > > >                 return -EMSGSIZE;
> > > > >
> > > > >
> > > > > Mac returned even without VIRTIO_NET_F_MAC
> > > > >
> > > > >
> > > > >         val_u16 = le16_to_cpu(config.status);
> > > > >         if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_STATUS, val_u16))
> > > > >                 return -EMSGSIZE;
> > > > >
> > > > >
> > > > > status returned even without VIRTIO_NET_F_STATUS
> > > > >
> > > > >         val_u16 = le16_to_cpu(config.mtu);
> > > > >         if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MTU, val_u16))
> > > > >                 return -EMSGSIZE;
> > > > >
> > > > >
> > > > > MTU returned even without VIRTIO_NET_F_MTU
> > > > >
> > > > >
> > > > > What's going on here?
> > > >
> > > > Probably too late to fix, but this should be fine as long as all
> > > > parents support STATUS/MTU/MAC.
> > >
> > > Why is this too late to fix.
> >
> > If we make this conditional on the features. This may break the
> > userspace that always expects VDPA_ATTR_DEV_NET_CFG_MTU?
> >
> > Thanks
>
> Well only on devices without MTU. I'm saying said userspace
> was reading trash on such devices anyway.

It depends on the parent, actually. For example, mlx5 queries the MTU
of the lower device unconditionally:

        err = query_mtu(mdev, &mtu);
        if (err)
                goto err_alloc;

        ndev->config.mtu = cpu_to_mlx5vdpa16(mvdev, mtu);

Supporting the MTU feature seems to be a must for real hardware.
Otherwise the driver may not work correctly.

> We don't generally maintain bug for bug compatiblity on a whim,
> only if userspace is actually known to break if we fix a bug.

So I think it should be fine to make this conditional; then we should
have consistent handling of other fields like MQ.
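
A rough sketch of what "conditional" could look like, assuming the
decision is keyed on the device features. The feature bit numbers are
the ones from the virtio-net spec; the REPORT_* flags and both helper
functions are hypothetical names for illustration, not existing kernel
code.

```c
#include <stdint.h>

/* Feature bit numbers as defined by the virtio-net specification. */
#define VIRTIO_NET_F_MTU	3
#define VIRTIO_NET_F_MAC	5
#define VIRTIO_NET_F_STATUS	16
#define VIRTIO_NET_F_MQ		22

#define FEATURE_BIT(n)	(1ULL << (n))

/* Hypothetical flags naming which config fields a reply would carry. */
enum {
	REPORT_MAC		= 1u << 0,
	REPORT_STATUS		= 1u << 1,
	REPORT_MTU		= 1u << 2,
	REPORT_MAX_VQ_PAIRS	= 1u << 3,
};

/* Select only the config fields whose feature bit is offered. */
static unsigned int fields_to_report(uint64_t features)
{
	unsigned int fields = 0;

	if (features & FEATURE_BIT(VIRTIO_NET_F_MAC))
		fields |= REPORT_MAC;
	if (features & FEATURE_BIT(VIRTIO_NET_F_STATUS))
		fields |= REPORT_STATUS;
	if (features & FEATURE_BIT(VIRTIO_NET_F_MTU))
		fields |= REPORT_MTU;
	if (features & FEATURE_BIT(VIRTIO_NET_F_MQ))
		fields |= REPORT_MAX_VQ_PAIRS;
	return fields;
}

/*
 * Consumer-side derivation discussed in this thread: without _MQ the
 * device has exactly one queue pair, so the config value is ignored.
 */
static uint16_t effective_max_vq_pairs(uint64_t features, uint16_t cfg_value)
{
	if (features & FEATURE_BIT(VIRTIO_NET_F_MQ))
		return cfg_value;
	return 1;
}
```

With this shape every field follows the same rule, and whether the
"1 without _MQ" answer is produced by the kernel or derived by the
tool stays a separate policy decision.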

Thanks

>
>
> > >
> > > > I wonder if we can add a check in the core and fail the device
> > > > registration in this case.
> > > >
> > > > Thanks
> > > >
> > > > >
> > > > >
> > > > > --
> > > > > MST
> > > > >
> > >
>


^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [PATCH V3 5/6] vDPA: answer num of queue pairs = 1 to userspace when VIRTIO_NET_F_MQ == 0
  2022-07-27 11:54                                     ` Zhu, Lingshan
@ 2022-07-28  1:41                                       ` Si-Wei Liu
  2022-07-28  2:44                                         ` Zhu, Lingshan
  0 siblings, 1 reply; 113+ messages in thread
From: Si-Wei Liu @ 2022-07-28  1:41 UTC (permalink / raw)
  To: Zhu, Lingshan, Michael S. Tsirkin
  Cc: Parav Pandit, netdev, virtualization, xieyongji, gautam.dawar



On 7/27/2022 4:54 AM, Zhu, Lingshan wrote:
>
>
> On 7/27/2022 6:09 PM, Si-Wei Liu wrote:
>>
>>
>> On 7/27/2022 2:01 AM, Michael S. Tsirkin wrote:
>>> On Wed, Jul 27, 2022 at 12:50:33AM -0700, Si-Wei Liu wrote:
>>>>
>>>> On 7/26/2022 11:01 PM, Michael S. Tsirkin wrote:
>>>>> On Wed, Jul 27, 2022 at 03:47:35AM +0000, Parav Pandit wrote:
>>>>>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
>>>>>>> Sent: Tuesday, July 26, 2022 10:53 PM
>>>>>>>
>>>>>>> On 7/27/2022 10:17 AM, Parav Pandit wrote:
>>>>>>>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
>>>>>>>>> Sent: Tuesday, July 26, 2022 10:15 PM
>>>>>>>>>
>>>>>>>>> On 7/26/2022 11:56 PM, Parav Pandit wrote:
>>>>>>>>>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
>>>>>>>>>>> Sent: Tuesday, July 12, 2022 11:46 PM
>>>>>>>>>>>> When the user space which invokes netlink commands, detects 
>>>>>>>>>>>> that
>>>>>>>>> _MQ
>>>>>>>>>>> is not supported, hence it takes max_queue_pair = 1 by itself.
>>>>>>>>>>> I think the kernel module have all necessary information and 
>>>>>>>>>>> it is
>>>>>>>>>>> the only one which have precise information of a device, so it
>>>>>>>>>>> should answer precisely than let the user space guess. The 
>>>>>>>>>>> kernel
>>>>>>>>>>> module should be reliable than stay silent, leave the 
>>>>>>>>>>> question to
>>>>>>>>>>> the user space
>>>>>>>>> tool.
>>>>>>>>>> Kernel is reliable. It doesn’t expose a config space field if 
>>>>>>>>>> the
>>>>>>>>>> field doesn’t
>>>>>>>>> exist regardless of field should have default or no default.
>>>>>>>>> so when you know it is one queue pair, you should answer one, 
>>>>>>>>> not try
>>>>>>>>> to guess.
>>>>>>>>>> User space should not guess either. User space gets to see if 
>>>>>>>>>> _MQ
>>>>>>>>> present/not present. If _MQ present than get reliable data 
>>>>>>>>> from kernel.
>>>>>>>>>> If _MQ not present, it means this device has one VQ pair.
>>>>>>>>> it is still a guess, right? And all user space tools 
>>>>>>>>> implemented this
>>>>>>>>> feature need to guess
>>>>>>>> No. it is not a guess.
>>>>>>>> It is explicitly checking the _MQ feature and deriving the value.
>>>>>>>> The code you proposed will be present in the user space.
>>>>>>>> It will be uniform for _MQ and 10 other features that are 
>>>>>>>> present now and
>>>>>>> in the future.
>>>>>>> MQ and other features like RSS are different. If there is no 
>>>>>>> _RSS_XX, there
>>>>>>> are no attributes like max_rss_key_size, and there is not a 
>>>>>>> default value.
>>>>>>> But for MQ, we know it has to be 1 without _MQ.
>>>>>> "we" = user space.
>>>>>> To keep the consistency among all the config space fields.
>>>>> Actually I looked at the code some more and I'm puzzled:
>>>>>
>>>>>
>>>>>     struct virtio_net_config config = {};
>>>>>     u64 features;
>>>>>     u16 val_u16;
>>>>>
>>>>>     vdpa_get_config_unlocked(vdev, 0, &config, sizeof(config));
>>>>>
>>>>>     if (nla_put(msg, VDPA_ATTR_DEV_NET_CFG_MACADDR, 
>>>>> sizeof(config.mac),
>>>>>             config.mac))
>>>>>         return -EMSGSIZE;
>>>>>
>>>>>
>>>>> Mac returned even without VIRTIO_NET_F_MAC
>>>>>
>>>>>
>>>>>     val_u16 = le16_to_cpu(config.status);
>>>>>     if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_STATUS, val_u16))
>>>>>         return -EMSGSIZE;
>>>>>
>>>>>
>>>>> status returned even without VIRTIO_NET_F_STATUS
>>>>>
>>>>>     val_u16 = le16_to_cpu(config.mtu);
>>>>>     if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MTU, val_u16))
>>>>>         return -EMSGSIZE;
>>>>>
>>>>>
>>>>> MTU returned even without VIRTIO_NET_F_MTU
>>>>>
>>>>>
>>>>> What's going on here?
>>>>>
>>>>>
>>>> I guess this is spec thing (historical debt), I vaguely recall 
>>>> these fields
>>>> are always present in config space regardless the existence of 
>>>> corresponding
>>>> feature bit.
>>>>
>>>> -Siwei
>>> Nope:
>>>
>>> 2.5.1  Driver Requirements: Device Configuration Space
>>>
>>> ...
>>>
>>> For optional configuration space fields, the driver MUST check that 
>>> the corresponding feature is offered
>>> before accessing that part of the configuration space.
>> Well, this is driver side of requirement. As this interface is for 
>> host admin tool to query or configure vdpa device, we don't have to 
>> wait until feature negotiation is done on guest driver to extract 
>> vdpa attributes/parameters, say if we want to replicate another vdpa 
>> device with the same config on migration destination. I think what 
>> may need to be fix is to move off from using 
>> .vdpa_get_config_unlocked() which depends on feature negotiation. 
>> And/or expose config space register values through another set of 
>> attributes.
> Yes, we don't have to wait for FEATURES_OK. In another patch in this 
> series, I have added a new netlink attr to report the device features, 
> and removed the blocker. So the LM orchestration SW can query the 
> device features of the devices at the destination cluster, and pick a 
> proper one, even mask out some features to meet the LM requirements.
To that end, you'd need to move off from using 
vdpa_get_config_unlocked(), which depends on feature negotiation. Since 
this would slightly change the original semantics of each field that 
"vdpa dev config" shows, it probably needs another netlink command and 
new uAPI.

-Siwei


>
> Thanks,
> Zhu Lingshan
>> -Siwei
>>
>>
>>
>>
>



* Re: [PATCH V3 4/6] vDPA: !FEATURES_OK should not block querying device config space
       [not found]         ` <8002554a-a77c-7b25-8f99-8d68248a741d@oracle.com>
@ 2022-07-28  2:06           ` Jason Wang
  2022-07-28  7:08             ` Si-Wei Liu
       [not found]           ` <00e2e07e-1a2e-7af8-a060-cc9034e0d33f@intel.com>
  1 sibling, 1 reply; 113+ messages in thread
From: Jason Wang @ 2022-07-28  2:06 UTC (permalink / raw)
  To: Si-Wei Liu, Zhu, Lingshan, Parav Pandit, mst, Eli Cohen
  Cc: netdev, xieyongji, gautam.dawar, virtualization


On 2022/7/28 08:56, Si-Wei Liu wrote:
>
>
> On 7/27/2022 4:47 AM, Zhu, Lingshan wrote:
>>
>>
>> On 7/27/2022 5:43 PM, Si-Wei Liu wrote:
>>> Sorry to chime in late in the game. For some reason I couldn't get 
>>> to most emails for this discussion (I only subscribed to the 
>>> virtualization list), while I was taking off amongst the past few 
>>> weeks.
>>>
>>> It looks to me this patch is incomplete. Noted down the way in 
>>> vdpa_dev_net_config_fill(), we have the following:
>>>          features = vdev->config->get_driver_features(vdev);
>>>          if (nla_put_u64_64bit(msg, VDPA_ATTR_DEV_NEGOTIATED_FEATURES, features,
>>>                                VDPA_ATTR_PAD))
>>>                  return -EMSGSIZE;
>>>
>>> Making call to .get_driver_features() doesn't make sense when 
>>> feature negotiation isn't complete. Neither should present 
>>> negotiated_features to userspace before negotiation is done.
>>>
>>> Similarly, max_vqp through vdpa_dev_net_mq_config_fill() probably 
>>> should not show before negotiation is done - it depends on driver 
>>> features negotiated.
>> I have another patch in this series introduces device_features and 
>> will report device_features to the userspace even features 
>> negotiation not done. Because the spec says we should allow driver 
>> access the config space before FEATURES_OK.
> That the config space can be accessed by the guest before features_ok 
> doesn't necessarily mean the value is valid. 


It's valid as long as the device offers the feature:

"The device MUST allow reading of any device-specific configuration 
field before FEATURES_OK is set by the driver. This includes fields 
which are conditional on feature bits, as long as those feature bits are 
offered by the device."
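
As a rough illustration of that rule, here is a minimal standalone C sketch. The helper name config_field_valid() is hypothetical and is not part of the vDPA core; only the bit values (VIRTIO_CONFIG_S_FEATURES_OK = 8, VIRTIO_NET_F_MTU = 3) come from the virtio definitions. It assumes validity of a conditional field depends only on whether the device offers the feature bit, not on the status register.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values taken from the virtio definitions. */
#define VIRTIO_CONFIG_S_FEATURES_OK 8
#define VIRTIO_NET_F_MTU            3

/*
 * Hypothetical helper: a conditional config field is readable when the
 * device offers the corresponding feature bit. The status argument is
 * accepted only to make explicit that FEATURES_OK does not gate reads.
 */
static bool config_field_valid(uint64_t device_features, uint8_t status,
			       unsigned int feature_bit)
{
	(void)status; /* deliberately ignored */
	return (device_features >> feature_bit) & 1;
}
```

With a device offering _F_MTU, the helper gives the same answer whether status is 0 or has FEATURES_OK set; with the bit not offered, the field is treated as invalid in both cases.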


> You may want to double check with Michael for what he quoted earlier:
>> Nope:
>>
>> 2.5.1  Driver Requirements: Device Configuration Space
>>
>> ...
>>
>> For optional configuration space fields, the driver MUST check that the corresponding feature is offered
>> before accessing that part of the configuration space.
>
> and how many driver bugs taking wrong assumption of the validity of 
> config space field without features_ok. I am not sure what use case 
> you want to expose config resister values for before features_ok, if 
> it's mostly for live migration I guess it's probably heading a wrong 
> direction.


I guess it's not for migration. For migration, provisioning the device 
with the correct features/capabilities would be sufficient.

Thanks


>
>
>>>
>>>
>>> Last but not the least, this "vdpa dev config" command was not 
>>> designed to display the real config space register values in the 
>>> first place. Quoting the vdpa-dev(8) man page:
>>>
>>>> vdpa dev config show - Show configuration of specific device or all 
>>>> devices.
>>>> DEV - specifies the vdpa device to show its configuration. If this 
>>>> argument is omitted all devices configuration is listed.
>>> It doesn't say anything about configuration space or register values 
>>> in config space. As long as it can convey the config attribute when 
>>> instantiating vDPA device instance, and more importantly, the config 
>>> can be easily imported from or exported to userspace tools when 
>>> trying to reconstruct vdpa instance intact on destination host for 
>>> live migration, IMHO in my personal interpretation it doesn't matter 
>>> what the config space may present. It may be worth while adding a 
>>> new debug command to expose the real register value, but that's 
>>> another story.
>> I am not sure getting your points. vDPA now reports device feature 
>> bits(device_features) and negotiated feature bits(driver_features), 
>> and yes, the drivers features can be a subset of the device features; 
>> and the vDPA device features can be a subset of the management device 
>> features.
> What I said is after unblocking the conditional check, you'd have to 
> handle the case for each of the vdpa attribute when feature 
> negotiation is not yet done: basically the register values you got 
> from config space via the vdpa_get_config_unlocked() call is not 
> considered to be valid before features_ok (per-spec). Although in some 
> case you may get sane value, such behavior is generally undefined. If 
> you desire to show just the device_features alone without any config 
> space field, which the device had advertised *before feature 
> negotiation is complete*, that'll be fine. But looks to me this is not 
> how patch has been implemented. Probably need some more work?
>
> Regards,
> -Siwei
>
>>>
>>> Having said, please consider to drop the Fixes tag, as appears to me 
>>> you're proposing a new feature rather than fixing a real issue.
>> it's a new feature to report the device feature bits than only 
>> negotiated features, however this patch is a must, or it will block 
>> the device feature bits reporting. but I agree, the fix tag is not a 
>> must.
>>>
>>> Thanks,
>>> -Siwei
>>>
>>> On 7/1/2022 3:12 PM, Parav Pandit via Virtualization wrote:
>>>>> From: Zhu Lingshan<lingshan.zhu@intel.com>
>>>>> Sent: Friday, July 1, 2022 9:28 AM
>>>>>
>>>>> Users may want to query the config space of a vDPA device, to choose a
>>>>> appropriate one for a certain guest. This means the users need to read the
>>>>> config space before FEATURES_OK, and the existence of config space
>>>>> contents does not depend on FEATURES_OK.
>>>>>
>>>>> The spec says:
>>>>> The device MUST allow reading of any device-specific configuration field
>>>>> before FEATURES_OK is set by the driver. This includes fields which are
>>>>> conditional on feature bits, as long as those feature bits are offered by the
>>>>> device.
>>>>>
>>>>> Fixes: 30ef7a8ac8a07 (vdpa: Read device configuration only if FEATURES_OK)
>>>> Fix is fine, but fixes tag needs correction described below.
>>>>
>>>> Above commit id is 13 letters should be 12.
>>>> And
>>>> It should be in format
>>>> Fixes: 30ef7a8ac8a0 ("vdpa: Read device configuration only if FEATURES_OK")
>>>>
>>>> Please use checkpatch.pl script before posting the patches to catch these errors.
>>>> There is a bot that looks at the fixes tag and identifies the right kernel version to apply this fix.
>>>>
>>>>> Signed-off-by: Zhu Lingshan<lingshan.zhu@intel.com>
>>>>> ---
>>>>>   drivers/vdpa/vdpa.c | 8 --------
>>>>>   1 file changed, 8 deletions(-)
>>>>>
>>>>> diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c index
>>>>> 9b0e39b2f022..d76b22b2f7ae 100644
>>>>> --- a/drivers/vdpa/vdpa.c
>>>>> +++ b/drivers/vdpa/vdpa.c
>>>>> @@ -851,17 +851,9 @@ vdpa_dev_config_fill(struct vdpa_device *vdev,
>>>>> struct sk_buff *msg, u32 portid,  {
>>>>>   	u32 device_id;
>>>>>   	void *hdr;
>>>>> -	u8 status;
>>>>>   	int err;
>>>>>
>>>>>   	down_read(&vdev->cf_lock);
>>>>> -	status = vdev->config->get_status(vdev);
>>>>> -	if (!(status & VIRTIO_CONFIG_S_FEATURES_OK)) {
>>>>> -		NL_SET_ERR_MSG_MOD(extack, "Features negotiation not
>>>>> completed");
>>>>> -		err = -EAGAIN;
>>>>> -		goto out;
>>>>> -	}
>>>>> -
>>>>>   	hdr = genlmsg_put(msg, portid, seq, &vdpa_nl_family, flags,
>>>>>   			  VDPA_CMD_DEV_CONFIG_GET);
>>>>>   	if (!hdr) {
>>>>> --
>>>>> 2.31.1
>>>
>>
>



* Re: [PATCH V3 5/6] vDPA: answer num of queue pairs = 1 to userspace when VIRTIO_NET_F_MQ == 0
  2022-07-28  1:41                                       ` Si-Wei Liu
@ 2022-07-28  2:44                                         ` Zhu, Lingshan
  2022-07-28 21:54                                           ` Si-Wei Liu
  0 siblings, 1 reply; 113+ messages in thread
From: Zhu, Lingshan @ 2022-07-28  2:44 UTC (permalink / raw)
  To: Si-Wei Liu, Michael S. Tsirkin
  Cc: Parav Pandit, netdev, virtualization, xieyongji, gautam.dawar



On 7/28/2022 9:41 AM, Si-Wei Liu wrote:
>
>
> On 7/27/2022 4:54 AM, Zhu, Lingshan wrote:
>>
>>
>> On 7/27/2022 6:09 PM, Si-Wei Liu wrote:
>>>
>>>
>>> On 7/27/2022 2:01 AM, Michael S. Tsirkin wrote:
>>>> On Wed, Jul 27, 2022 at 12:50:33AM -0700, Si-Wei Liu wrote:
>>>>>
>>>>> On 7/26/2022 11:01 PM, Michael S. Tsirkin wrote:
>>>>>> On Wed, Jul 27, 2022 at 03:47:35AM +0000, Parav Pandit wrote:
>>>>>>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
>>>>>>>> Sent: Tuesday, July 26, 2022 10:53 PM
>>>>>>>>
>>>>>>>> On 7/27/2022 10:17 AM, Parav Pandit wrote:
>>>>>>>>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
>>>>>>>>>> Sent: Tuesday, July 26, 2022 10:15 PM
>>>>>>>>>>
>>>>>>>>>> On 7/26/2022 11:56 PM, Parav Pandit wrote:
>>>>>>>>>>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
>>>>>>>>>>>> Sent: Tuesday, July 12, 2022 11:46 PM
>>>>>>>>>>>>> When the user space which invokes netlink commands, 
>>>>>>>>>>>>> detects that
>>>>>>>>>> _MQ
>>>>>>>>>>>> is not supported, hence it takes max_queue_pair = 1 by itself.
>>>>>>>>>>>> I think the kernel module have all necessary information 
>>>>>>>>>>>> and it is
>>>>>>>>>>>> the only one which have precise information of a device, so it
>>>>>>>>>>>> should answer precisely than let the user space guess. The 
>>>>>>>>>>>> kernel
>>>>>>>>>>>> module should be reliable than stay silent, leave the 
>>>>>>>>>>>> question to
>>>>>>>>>>>> the user space
>>>>>>>>>> tool.
>>>>>>>>>>> Kernel is reliable. It doesn’t expose a config space field 
>>>>>>>>>>> if the
>>>>>>>>>>> field doesn’t
>>>>>>>>>> exist regardless of field should have default or no default.
>>>>>>>>>> so when you know it is one queue pair, you should answer one, 
>>>>>>>>>> not try
>>>>>>>>>> to guess.
>>>>>>>>>>> User space should not guess either. User space gets to see 
>>>>>>>>>>> if _MQ
>>>>>>>>>> present/not present. If _MQ present than get reliable data 
>>>>>>>>>> from kernel.
>>>>>>>>>>> If _MQ not present, it means this device has one VQ pair.
>>>>>>>>>> it is still a guess, right? And all user space tools 
>>>>>>>>>> implemented this
>>>>>>>>>> feature need to guess
>>>>>>>>> No. it is not a guess.
>>>>>>>>> It is explicitly checking the _MQ feature and deriving the value.
>>>>>>>>> The code you proposed will be present in the user space.
>>>>>>>>> It will be uniform for _MQ and 10 other features that are 
>>>>>>>>> present now and
>>>>>>>> in the future.
>>>>>>>> MQ and other features like RSS are different. If there is no 
>>>>>>>> _RSS_XX, there
>>>>>>>> are no attributes like max_rss_key_size, and there is not a 
>>>>>>>> default value.
>>>>>>>> But for MQ, we know it has to be 1 without _MQ.
>>>>>>> "we" = user space.
>>>>>>> To keep the consistency among all the config space fields.
>>>>>> Actually I looked at the code some more and I'm puzzled:
>>>>>>
>>>>>>
>>>>>>     struct virtio_net_config config = {};
>>>>>>     u64 features;
>>>>>>     u16 val_u16;
>>>>>>
>>>>>>     vdpa_get_config_unlocked(vdev, 0, &config, sizeof(config));
>>>>>>
>>>>>>     if (nla_put(msg, VDPA_ATTR_DEV_NET_CFG_MACADDR, 
>>>>>> sizeof(config.mac),
>>>>>>             config.mac))
>>>>>>         return -EMSGSIZE;
>>>>>>
>>>>>>
>>>>>> Mac returned even without VIRTIO_NET_F_MAC
>>>>>>
>>>>>>
>>>>>>     val_u16 = le16_to_cpu(config.status);
>>>>>>     if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_STATUS, val_u16))
>>>>>>         return -EMSGSIZE;
>>>>>>
>>>>>>
>>>>>> status returned even without VIRTIO_NET_F_STATUS
>>>>>>
>>>>>>     val_u16 = le16_to_cpu(config.mtu);
>>>>>>     if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MTU, val_u16))
>>>>>>         return -EMSGSIZE;
>>>>>>
>>>>>>
>>>>>> MTU returned even without VIRTIO_NET_F_MTU
>>>>>>
>>>>>>
>>>>>> What's going on here?
>>>>>>
>>>>>>
>>>>> I guess this is spec thing (historical debt), I vaguely recall 
>>>>> these fields
>>>>> are always present in config space regardless the existence of 
>>>>> corresponding
>>>>> feature bit.
>>>>>
>>>>> -Siwei
>>>> Nope:
>>>>
>>>> 2.5.1  Driver Requirements: Device Configuration Space
>>>>
>>>> ...
>>>>
>>>> For optional configuration space fields, the driver MUST check that 
>>>> the corresponding feature is offered
>>>> before accessing that part of the configuration space.
>>> Well, this is driver side of requirement. As this interface is for 
>>> host admin tool to query or configure vdpa device, we don't have to 
>>> wait until feature negotiation is done on guest driver to extract 
>>> vdpa attributes/parameters, say if we want to replicate another vdpa 
>>> device with the same config on migration destination. I think what 
>>> may need to be fix is to move off from using 
>>> .vdpa_get_config_unlocked() which depends on feature negotiation. 
>>> And/or expose config space register values through another set of 
>>> attributes.
>> Yes, we don't have to wait for FEATURES_OK. In another patch in this 
>> series, I have added a new netlink attr to report the device 
>> features, and removed the blocker. So the LM orchestration SW can 
>> query the device features of the devices at the destination cluster, 
>> and pick a proper one, even mask out some features to meet the LM 
>> requirements.
> For that end, you'd need to move off from using 
> vdpa_get_config_unlocked() which depends on feature negotiation. Since 
> this would slightly change the original semantics of each field that 
> "vdpa dev config" shows, it probably need another netlink command and 
> new uAPI.
Why not show both device_features and driver_features in "vdpa dev 
config show"?
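
To illustrate what reporting both sets would let a tool verify, here is a small standalone C sketch. The function names are hypothetical; the only assumption is the relationship stated earlier in this thread: driver features are a subset of the vDPA device features, which in turn are a subset of the management device features.

```c
#include <stdbool.h>
#include <stdint.h>

/* True when every bit set in `subset` is also set in `superset`. */
static bool features_subset(uint64_t subset, uint64_t superset)
{
	return (subset & ~superset) == 0;
}

/*
 * Hypothetical consistency check over the three feature sets:
 * driver (negotiated) <= vDPA device <= management device.
 */
static bool feature_sets_consistent(uint64_t mgmt_features,
				    uint64_t device_features,
				    uint64_t driver_features)
{
	return features_subset(driver_features, device_features) &&
	       features_subset(device_features, mgmt_features);
}
```

Before negotiation completes, driver_features would simply be 0, which is trivially a subset of any device_features.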
>
> -Siwei
>
>
>>
>> Thanks,
>> Zhu Lingshan
>>> -Siwei
>>>
>>>
>>>
>>>
>>
>



* Re: [PATCH V3 5/6] vDPA: answer num of queue pairs = 1 to userspace when VIRTIO_NET_F_MQ == 0
  2022-07-28  1:21                                       ` Jason Wang
@ 2022-07-28  3:46                                         ` Zhu, Lingshan
  2022-07-28  5:53                                           ` Jason Wang
  0 siblings, 1 reply; 113+ messages in thread
From: Zhu, Lingshan @ 2022-07-28  3:46 UTC (permalink / raw)
  To: Jason Wang, Michael S. Tsirkin
  Cc: Parav Pandit, virtualization, netdev, xieyongji, gautam.dawar



On 7/28/2022 9:21 AM, Jason Wang wrote:
> On Wed, Jul 27, 2022 at 11:45 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>> On Wed, Jul 27, 2022 at 05:50:59PM +0800, Jason Wang wrote:
>>> On Wed, Jul 27, 2022 at 5:03 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>>>> On Wed, Jul 27, 2022 at 02:54:13PM +0800, Jason Wang wrote:
>>>>> On Wed, Jul 27, 2022 at 2:01 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>>>>>> On Wed, Jul 27, 2022 at 03:47:35AM +0000, Parav Pandit wrote:
>>>>>>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
>>>>>>>> Sent: Tuesday, July 26, 2022 10:53 PM
>>>>>>>>
>>>>>>>> On 7/27/2022 10:17 AM, Parav Pandit wrote:
>>>>>>>>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
>>>>>>>>>> Sent: Tuesday, July 26, 2022 10:15 PM
>>>>>>>>>>
>>>>>>>>>> On 7/26/2022 11:56 PM, Parav Pandit wrote:
>>>>>>>>>>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
>>>>>>>>>>>> Sent: Tuesday, July 12, 2022 11:46 PM
>>>>>>>>>>>>> When the user space which invokes netlink commands, detects that
>>>>>>>>>> _MQ
>>>>>>>>>>>> is not supported, hence it takes max_queue_pair = 1 by itself.
>>>>>>>>>>>> I think the kernel module have all necessary information and it is
>>>>>>>>>>>> the only one which have precise information of a device, so it
>>>>>>>>>>>> should answer precisely than let the user space guess. The kernel
>>>>>>>>>>>> module should be reliable than stay silent, leave the question to
>>>>>>>>>>>> the user space
>>>>>>>>>> tool.
>>>>>>>>>>> Kernel is reliable. It doesn’t expose a config space field if the
>>>>>>>>>>> field doesn’t
>>>>>>>>>> exist regardless of field should have default or no default.
>>>>>>>>>> so when you know it is one queue pair, you should answer one, not try
>>>>>>>>>> to guess.
>>>>>>>>>>> User space should not guess either. User space gets to see if _MQ
>>>>>>>>>> present/not present. If _MQ present than get reliable data from kernel.
>>>>>>>>>>> If _MQ not present, it means this device has one VQ pair.
>>>>>>>>>> it is still a guess, right? And all user space tools implemented this
>>>>>>>>>> feature need to guess
>>>>>>>>> No. it is not a guess.
>>>>>>>>> It is explicitly checking the _MQ feature and deriving the value.
>>>>>>>>> The code you proposed will be present in the user space.
>>>>>>>>> It will be uniform for _MQ and 10 other features that are present now and
>>>>>>>> in the future.
>>>>>>>> MQ and other features like RSS are different. If there is no _RSS_XX, there
>>>>>>>> are no attributes like max_rss_key_size, and there is not a default value.
>>>>>>>> But for MQ, we know it has to be 1 without _MQ.
>>>>>>> "we" = user space.
>>>>>>> To keep the consistency among all the config space fields.
>>>>>> Actually I looked at the code some more and I'm puzzled:
>>>>>>
>>>>>>
>>>>>>          struct virtio_net_config config = {};
>>>>>>          u64 features;
>>>>>>          u16 val_u16;
>>>>>>
>>>>>>          vdpa_get_config_unlocked(vdev, 0, &config, sizeof(config));
>>>>>>
>>>>>>          if (nla_put(msg, VDPA_ATTR_DEV_NET_CFG_MACADDR, sizeof(config.mac),
>>>>>>                      config.mac))
>>>>>>                  return -EMSGSIZE;
>>>>>>
>>>>>>
>>>>>> Mac returned even without VIRTIO_NET_F_MAC
>>>>>>
>>>>>>
>>>>>>          val_u16 = le16_to_cpu(config.status);
>>>>>>          if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_STATUS, val_u16))
>>>>>>                  return -EMSGSIZE;
>>>>>>
>>>>>>
>>>>>> status returned even without VIRTIO_NET_F_STATUS
>>>>>>
>>>>>>          val_u16 = le16_to_cpu(config.mtu);
>>>>>>          if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MTU, val_u16))
>>>>>>                  return -EMSGSIZE;
>>>>>>
>>>>>>
>>>>>> MTU returned even without VIRTIO_NET_F_MTU
>>>>>>
>>>>>>
>>>>>> What's going on here?
>>>>> Probably too late to fix, but this should be fine as long as all
>>>>> parents support STATUS/MTU/MAC.
>>>> Why is this too late to fix.
>>> If we make this conditional on the features. This may break the
>>> userspace that always expects VDPA_ATTR_DEV_NET_CFG_MTU?
>>>
>>> Thanks
>> Well only on devices without MTU. I'm saying said userspace
>> was reading trash on such devices anyway.
> It depends on the parent actually. For example, mlx5 query the lower
> mtu unconditionally:
>
>          err = query_mtu(mdev, &mtu);
>          if (err)
>                  goto err_alloc;
>
>          ndev->config.mtu = cpu_to_mlx5vdpa16(mvdev, mtu);
>
> Supporting MTU features seems to be a must for real hardware.
> Otherwise the driver may not work correctly.
>
>> We don't generally maintain bug-for-bug compatibility on a whim,
>> only if userspace is actually known to break if we fix a bug.
>   So I think it should be fine to make this conditional then we should
> have a consistent handling of other fields like MQ.
For fields that have a default value, like MQ = 1, we can return the 
default value.
For fields without a default value, like MAC, we return nothing.

Does this sound good? So for MTU, without _F_MTU, I think we can 
return 1500 by default.
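
A minimal standalone C sketch of this proposal follows. It is not the vdpa.c code: struct reported_u16 and the report_*() helpers are hypothetical names, the 1500 default is only the value suggested above (not something mandated by the spec), and config values are assumed to be already converted to CPU byte order. The feature bit numbers are the ones defined for virtio-net.

```c
#include <stdbool.h>
#include <stdint.h>

/* Feature bit numbers as defined for virtio-net. */
#define VIRTIO_NET_F_MTU 3
#define VIRTIO_NET_F_MAC 5
#define VIRTIO_NET_F_MQ  22

/* Default suggested in this thread, not taken from the spec. */
#define PROPOSED_DEFAULT_MTU 1500

/* Hypothetical result: whether to emit the attribute, and its value. */
struct reported_u16 {
	bool present;
	uint16_t value;
};

static bool has_feature(uint64_t features, unsigned int bit)
{
	return (features >> bit) & 1;
}

/* Without _MQ there is exactly one queue pair, so always answer 1. */
static struct reported_u16 report_max_vq_pairs(uint64_t features, uint16_t cfg)
{
	struct reported_u16 r = { .present = true, .value = 1 };

	if (has_feature(features, VIRTIO_NET_F_MQ))
		r.value = cfg;
	return r;
}

/* Without _MTU, fall back to the proposed default instead of the field. */
static struct reported_u16 report_mtu(uint64_t features, uint16_t cfg)
{
	struct reported_u16 r = { .present = true,
				  .value = PROPOSED_DEFAULT_MTU };

	if (has_feature(features, VIRTIO_NET_F_MTU))
		r.value = cfg;
	return r;
}

/* MAC has no default: report it only when _MAC is offered. */
static bool should_report_mac(uint64_t features)
{
	return has_feature(features, VIRTIO_NET_F_MAC);
}
```

The point of the split is that a field with a well-defined fallback is always present in the answer, while a field without one is omitted rather than filled with whatever happens to be in the config buffer.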

Thanks,
Zhu Lingshan
>
> Thanks
>
>>
>>>>> I wonder if we can add a check in the core and fail the device
>>>>> registration in this case.
>>>>>
>>>>> Thanks
>>>>>
>>>>>>
>>>>>> --
>>>>>> MST
>>>>>>



* Re: [PATCH V3 5/6] vDPA: answer num of queue pairs = 1 to userspace when VIRTIO_NET_F_MQ == 0
  2022-07-28  3:46                                         ` Zhu, Lingshan
@ 2022-07-28  5:53                                           ` Jason Wang
  2022-07-28  6:02                                             ` Zhu, Lingshan
  2022-07-28  6:41                                             ` Michael S. Tsirkin
  0 siblings, 2 replies; 113+ messages in thread
From: Jason Wang @ 2022-07-28  5:53 UTC (permalink / raw)
  To: Zhu, Lingshan
  Cc: Michael S. Tsirkin, Parav Pandit, virtualization, netdev,
	xieyongji, gautam.dawar

On Thu, Jul 28, 2022 at 11:47 AM Zhu, Lingshan <lingshan.zhu@intel.com> wrote:
>
>
>
> On 7/28/2022 9:21 AM, Jason Wang wrote:
> > On Wed, Jul 27, 2022 at 11:45 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> >> On Wed, Jul 27, 2022 at 05:50:59PM +0800, Jason Wang wrote:
> >>> On Wed, Jul 27, 2022 at 5:03 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> >>>> On Wed, Jul 27, 2022 at 02:54:13PM +0800, Jason Wang wrote:
> >>>>> On Wed, Jul 27, 2022 at 2:01 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> >>>>>> On Wed, Jul 27, 2022 at 03:47:35AM +0000, Parav Pandit wrote:
> >>>>>>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
> >>>>>>>> Sent: Tuesday, July 26, 2022 10:53 PM
> >>>>>>>>
> >>>>>>>> On 7/27/2022 10:17 AM, Parav Pandit wrote:
> >>>>>>>>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
> >>>>>>>>>> Sent: Tuesday, July 26, 2022 10:15 PM
> >>>>>>>>>>
> >>>>>>>>>> On 7/26/2022 11:56 PM, Parav Pandit wrote:
> >>>>>>>>>>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
> >>>>>>>>>>>> Sent: Tuesday, July 12, 2022 11:46 PM
> >>>>>>>>>>>>> When the user space which invokes netlink commands, detects that
> >>>>>>>>>> _MQ
> >>>>>>>>>>>> is not supported, hence it takes max_queue_pair = 1 by itself.
> >>>>>>>>>>>> I think the kernel module have all necessary information and it is
> >>>>>>>>>>>> the only one which have precise information of a device, so it
> >>>>>>>>>>>> should answer precisely than let the user space guess. The kernel
> >>>>>>>>>>>> module should be reliable than stay silent, leave the question to
> >>>>>>>>>>>> the user space
> >>>>>>>>>> tool.
> >>>>>>>>>>> Kernel is reliable. It doesn’t expose a config space field if the
> >>>>>>>>>>> field doesn’t
> >>>>>>>>>> exist regardless of field should have default or no default.
> >>>>>>>>>> so when you know it is one queue pair, you should answer one, not try
> >>>>>>>>>> to guess.
> >>>>>>>>>>> User space should not guess either. User space gets to see if _MQ
> >>>>>>>>>> present/not present. If _MQ present than get reliable data from kernel.
> >>>>>>>>>>> If _MQ not present, it means this device has one VQ pair.
> >>>>>>>>>> it is still a guess, right? And all user space tools implemented this
> >>>>>>>>>> feature need to guess
> >>>>>>>>> No. it is not a guess.
> >>>>>>>>> It is explicitly checking the _MQ feature and deriving the value.
> >>>>>>>>> The code you proposed will be present in the user space.
> >>>>>>>>> It will be uniform for _MQ and 10 other features that are present now and
> >>>>>>>> in the future.
> >>>>>>>> MQ and other features like RSS are different. If there is no _RSS_XX, there
> >>>>>>>> are no attributes like max_rss_key_size, and there is not a default value.
> >>>>>>>> But for MQ, we know it has to be 1 without _MQ.
> >>>>>>> "we" = user space.
> >>>>>>> To keep the consistency among all the config space fields.
> >>>>>> Actually I looked at the code some more and I'm puzzled:
> >>>>>>
> >>>>>>
> >>>>>>          struct virtio_net_config config = {};
> >>>>>>          u64 features;
> >>>>>>          u16 val_u16;
> >>>>>>
> >>>>>>          vdpa_get_config_unlocked(vdev, 0, &config, sizeof(config));
> >>>>>>
> >>>>>>          if (nla_put(msg, VDPA_ATTR_DEV_NET_CFG_MACADDR, sizeof(config.mac),
> >>>>>>                      config.mac))
> >>>>>>                  return -EMSGSIZE;
> >>>>>>
> >>>>>>
> >>>>>> Mac returned even without VIRTIO_NET_F_MAC
> >>>>>>
> >>>>>>
> >>>>>>          val_u16 = le16_to_cpu(config.status);
> >>>>>>          if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_STATUS, val_u16))
> >>>>>>                  return -EMSGSIZE;
> >>>>>>
> >>>>>>
> >>>>>> status returned even without VIRTIO_NET_F_STATUS
> >>>>>>
> >>>>>>          val_u16 = le16_to_cpu(config.mtu);
> >>>>>>          if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MTU, val_u16))
> >>>>>>                  return -EMSGSIZE;
> >>>>>>
> >>>>>>
> >>>>>> MTU returned even without VIRTIO_NET_F_MTU
> >>>>>>
> >>>>>>
> >>>>>> What's going on here?
> >>>>> Probably too late to fix, but this should be fine as long as all
> >>>>> parents support STATUS/MTU/MAC.
> >>>> Why is this too late to fix.
> >>> If we make this conditional on the features. This may break the
> >>> userspace that always expects VDPA_ATTR_DEV_NET_CFG_MTU?
> >>>
> >>> Thanks
> >> Well only on devices without MTU. I'm saying said userspace
> >> was reading trash on such devices anyway.
> > It depends on the parent actually. For example, mlx5 query the lower
> > mtu unconditionally:
> >
> >          err = query_mtu(mdev, &mtu);
> >          if (err)
> >                  goto err_alloc;
> >
> >          ndev->config.mtu = cpu_to_mlx5vdpa16(mvdev, mtu);
> >
> > Supporting MTU features seems to be a must for real hardware.
> > Otherwise the driver may not work correctly.
> >
> >> We don't generally maintain bug-for-bug compatibility on a whim,
> >> only if userspace is actually known to break if we fix a bug.
> >   So I think it should be fine to make this conditional then we should
> > have a consistent handling of other fields like MQ.
> For some fields that have a default value, like MQ =1, we can return the
> default value.
> For other fields without a default value, like MAC, we return nothing.
>
> Does this sounds good? So, for MTU, if without _F_MTU, I think we can
> return 1500 by default.

Or we can just read MTU from the device.

But it looks to me like Michael wants it conditional.

Thanks

>
> Thanks,
> Zhu Lingshan
> >
> > Thanks
> >
> >>
> >>>>> I wonder if we can add a check in the core and fail the device
> >>>>> registration in this case.
> >>>>>
> >>>>> Thanks
> >>>>>
> >>>>>>
> >>>>>> --
> >>>>>> MST
> >>>>>>
>



* Re: [PATCH V3 5/6] vDPA: answer num of queue pairs = 1 to userspace when VIRTIO_NET_F_MQ == 0
  2022-07-28  5:53                                           ` Jason Wang
@ 2022-07-28  6:02                                             ` Zhu, Lingshan
  2022-07-28  6:41                                             ` Michael S. Tsirkin
  1 sibling, 0 replies; 113+ messages in thread
From: Zhu, Lingshan @ 2022-07-28  6:02 UTC (permalink / raw)
  To: Jason Wang
  Cc: Michael S. Tsirkin, Parav Pandit, virtualization, netdev,
	xieyongji, gautam.dawar



On 7/28/2022 1:53 PM, Jason Wang wrote:
> On Thu, Jul 28, 2022 at 11:47 AM Zhu, Lingshan <lingshan.zhu@intel.com> wrote:
>>
>>
>> On 7/28/2022 9:21 AM, Jason Wang wrote:
>>> On Wed, Jul 27, 2022 at 11:45 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>>>> On Wed, Jul 27, 2022 at 05:50:59PM +0800, Jason Wang wrote:
>>>>> On Wed, Jul 27, 2022 at 5:03 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>>>>>> On Wed, Jul 27, 2022 at 02:54:13PM +0800, Jason Wang wrote:
>>>>>>> On Wed, Jul 27, 2022 at 2:01 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>>>>>>>> On Wed, Jul 27, 2022 at 03:47:35AM +0000, Parav Pandit wrote:
>>>>>>>>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
>>>>>>>>>> Sent: Tuesday, July 26, 2022 10:53 PM
>>>>>>>>>>
>>>>>>>>>> On 7/27/2022 10:17 AM, Parav Pandit wrote:
>>>>>>>>>>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
>>>>>>>>>>>> Sent: Tuesday, July 26, 2022 10:15 PM
>>>>>>>>>>>>
>>>>>>>>>>>> On 7/26/2022 11:56 PM, Parav Pandit wrote:
>>>>>>>>>>>>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
>>>>>>>>>>>>>> Sent: Tuesday, July 12, 2022 11:46 PM
>>>>>>>>>>>>>>> When the user space which invokes netlink commands, detects that
>>>>>>>>>>>> _MQ
>>>>>>>>>>>>>> is not supported, hence it takes max_queue_pair = 1 by itself.
>>>>>>>>>>>>>> I think the kernel module have all necessary information and it is
>>>>>>>>>>>>>> the only one which have precise information of a device, so it
>>>>>>>>>>>>>> should answer precisely than let the user space guess. The kernel
>>>>>>>>>>>>>> module should be reliable than stay silent, leave the question to
>>>>>>>>>>>>>> the user space
>>>>>>>>>>>> tool.
>>>>>>>>>>>>> Kernel is reliable. It doesn’t expose a config space field if the
>>>>>>>>>>>>> field doesn’t
>>>>>>>>>>>> exist regardless of field should have default or no default.
>>>>>>>>>>>> so when you know it is one queue pair, you should answer one, not try
>>>>>>>>>>>> to guess.
>>>>>>>>>>>>> User space should not guess either. User space gets to see if _MQ
>>>>>>>>>>>> present/not present. If _MQ present than get reliable data from kernel.
>>>>>>>>>>>>> If _MQ not present, it means this device has one VQ pair.
>>>>>>>>>>>> it is still a guess, right? And all user space tools implemented this
>>>>>>>>>>>> feature need to guess
>>>>>>>>>>> No. it is not a guess.
>>>>>>>>>>> It is explicitly checking the _MQ feature and deriving the value.
>>>>>>>>>>> The code you proposed will be present in the user space.
>>>>>>>>>>> It will be uniform for _MQ and 10 other features that are present now and
>>>>>>>>>> in the future.
>>>>>>>>>> MQ and other features like RSS are different. If there is no _RSS_XX, there
>>>>>>>>>> are no attributes like max_rss_key_size, and there is not a default value.
>>>>>>>>>> But for MQ, we know it has to be 1 without _MQ.
>>>>>>>>> "we" = user space.
>>>>>>>>> To keep the consistency among all the config space fields.
>>>>>>>> Actually I looked and the code some more and I'm puzzled:
>>>>>>>>
>>>>>>>>
>>>>>>>>           struct virtio_net_config config = {};
>>>>>>>>           u64 features;
>>>>>>>>           u16 val_u16;
>>>>>>>>
>>>>>>>>           vdpa_get_config_unlocked(vdev, 0, &config, sizeof(config));
>>>>>>>>
>>>>>>>>           if (nla_put(msg, VDPA_ATTR_DEV_NET_CFG_MACADDR, sizeof(config.mac),
>>>>>>>>                       config.mac))
>>>>>>>>                   return -EMSGSIZE;
>>>>>>>>
>>>>>>>>
>>>>>>>> Mac returned even without VIRTIO_NET_F_MAC
>>>>>>>>
>>>>>>>>
>>>>>>>>           val_u16 = le16_to_cpu(config.status);
>>>>>>>>           if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_STATUS, val_u16))
>>>>>>>>                   return -EMSGSIZE;
>>>>>>>>
>>>>>>>>
>>>>>>>> status returned even without VIRTIO_NET_F_STATUS
>>>>>>>>
>>>>>>>>           val_u16 = le16_to_cpu(config.mtu);
>>>>>>>>           if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MTU, val_u16))
>>>>>>>>                   return -EMSGSIZE;
>>>>>>>>
>>>>>>>>
>>>>>>>> MTU returned even without VIRTIO_NET_F_MTU
>>>>>>>>
>>>>>>>>
>>>>>>>> What's going on here?
>>>>>>> Probably too late to fix, but this should be fine as long as all
>>>>>>> parents support STATUS/MTU/MAC.
>>>>>> Why is this too late to fix.
>>>>> If we make this conditional on the features. This may break the
>>>>> userspace that always expects VDPA_ATTR_DEV_NET_CFG_MTU?
>>>>>
>>>>> Thanks
>>>> Well only on devices without MTU. I'm saying said userspace
>>>> was reading trash on such devices anyway.
>>> It depends on the parent actually. For example, mlx5 query the lower
>>> mtu unconditionally:
>>>
>>>           err = query_mtu(mdev, &mtu);
>>>           if (err)
>>>                   goto err_alloc;
>>>
>>>           ndev->config.mtu = cpu_to_mlx5vdpa16(mvdev, mtu);
>>>
>>> Supporting MTU features seems to be a must for real hardware.
>>> Otherwise the driver may not work correctly.
>>>
>>>> We don't generally maintain bug for bug compatibility on a whim,
>>>> only if userspace is actually known to break if we fix a bug.
>>>    So I think it should be fine to make this conditional then we should
>>> have a consistent handling of other fields like MQ.
>> For some fields that have a default value, like MQ =1, we can return the
>> default value.
>> For other fields without a default value, like MAC, we return nothing.
>>
>> Does this sounds good? So, for MTU, if without _F_MTU, I think we can
>> return 1500 by default.
> Or we can just read MTU from the device.
>
> But It looks to me Michael wants it conditional.
If _F_MTU is offered, we can read it from the device config space. If
_F_MTU is not offered, I think it is conditional; however, there can be
a minimum default value, 1500 for Ethernet.
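
The policy proposed above (report a default where one exists, report
nothing where none exists) can be sketched in plain C. This is only an
illustration and not code from drivers/vdpa/vdpa.c: the struct
net_cfg_report type and the build_report() and feature_offered() helpers
are hypothetical names, and the 1500 fallback is the default suggested in
this mail, not something the thread has agreed on. The feature bit
numbers are the virtio-net ones.

```c
#include <stdbool.h>
#include <stdint.h>

/* Feature bit numbers from the virtio-net specification. */
#define VIRTIO_NET_F_MTU  3
#define VIRTIO_NET_F_MAC  5
#define VIRTIO_NET_F_MQ   22

/* Assumed fallback when _F_MTU is not offered (suggested in this mail). */
#define DEFAULT_ETH_MTU   1500

/* Hypothetical holder for what would be reported to userspace. */
struct net_cfg_report {
	bool     has_mac;       /* MAC has no default: omitted without _F_MAC */
	uint16_t mtu;           /* config value, or DEFAULT_ETH_MTU */
	uint16_t max_vq_pairs;  /* config value, or 1 without _F_MQ */
};

static bool feature_offered(uint64_t features, unsigned int bit)
{
	return (features & (1ULL << bit)) != 0;
}

/*
 * cfg_mtu and cfg_max_vq_pairs stand for values already read from the
 * device config space and converted to CPU byte order.
 */
static struct net_cfg_report build_report(uint64_t device_features,
					  uint16_t cfg_mtu,
					  uint16_t cfg_max_vq_pairs)
{
	struct net_cfg_report r;

	r.has_mac = feature_offered(device_features, VIRTIO_NET_F_MAC);
	r.mtu = feature_offered(device_features, VIRTIO_NET_F_MTU) ?
		cfg_mtu : DEFAULT_ETH_MTU;
	r.max_vq_pairs = feature_offered(device_features, VIRTIO_NET_F_MQ) ?
		cfg_max_vq_pairs : 1;
	return r;
}
```

With no feature bits offered, build_report(0, 9000, 8) yields no MAC,
mtu 1500 and max_vq_pairs 1, whatever the config space contains.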

Thanks,
Zhu Lingshan
>
> Thanks
>
>> Thanks,
>> Zhu Lingshan
>>> Thanks
>>>
>>>>>>> I wonder if we can add a check in the core and fail the device
>>>>>>> registration in this case.
>>>>>>>
>>>>>>> Thanks
>>>>>>>
>>>>>>>> --
>>>>>>>> MST
>>>>>>>>


^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [PATCH V3 5/6] vDPA: answer num of queue pairs = 1 to userspace when VIRTIO_NET_F_MQ == 0
  2022-07-28  5:53                                           ` Jason Wang
  2022-07-28  6:02                                             ` Zhu, Lingshan
@ 2022-07-28  6:41                                             ` Michael S. Tsirkin
  2022-08-01  4:50                                               ` Jason Wang
  1 sibling, 1 reply; 113+ messages in thread
From: Michael S. Tsirkin @ 2022-07-28  6:41 UTC (permalink / raw)
  To: Jason Wang
  Cc: Zhu, Lingshan, Parav Pandit, virtualization, netdev, xieyongji,
	gautam.dawar

On Thu, Jul 28, 2022 at 01:53:51PM +0800, Jason Wang wrote:
> On Thu, Jul 28, 2022 at 11:47 AM Zhu, Lingshan <lingshan.zhu@intel.com> wrote:
> >
> >
> >
> > On 7/28/2022 9:21 AM, Jason Wang wrote:
> > > On Wed, Jul 27, 2022 at 11:45 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > >> On Wed, Jul 27, 2022 at 05:50:59PM +0800, Jason Wang wrote:
> > >>> On Wed, Jul 27, 2022 at 5:03 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > >>>> On Wed, Jul 27, 2022 at 02:54:13PM +0800, Jason Wang wrote:
> > >>>>> On Wed, Jul 27, 2022 at 2:01 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > >>>>>> On Wed, Jul 27, 2022 at 03:47:35AM +0000, Parav Pandit wrote:
> > >>>>>>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
> > >>>>>>>> Sent: Tuesday, July 26, 2022 10:53 PM
> > >>>>>>>>
> > >>>>>>>> On 7/27/2022 10:17 AM, Parav Pandit wrote:
> > >>>>>>>>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
> > >>>>>>>>>> Sent: Tuesday, July 26, 2022 10:15 PM
> > >>>>>>>>>>
> > >>>>>>>>>> On 7/26/2022 11:56 PM, Parav Pandit wrote:
> > >>>>>>>>>>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
> > >>>>>>>>>>>> Sent: Tuesday, July 12, 2022 11:46 PM
> > >>>>>>>>>>>>> When the user space which invokes netlink commands, detects that
> > >>>>>>>>>> _MQ
> > >>>>>>>>>>>> is not supported, hence it takes max_queue_pair = 1 by itself.
> > >>>>>>>>>>>> I think the kernel module have all necessary information and it is
> > >>>>>>>>>>>> the only one which have precise information of a device, so it
> > >>>>>>>>>>>> should answer precisely than let the user space guess. The kernel
> > >>>>>>>>>>>> module should be reliable than stay silent, leave the question to
> > >>>>>>>>>>>> the user space
> > >>>>>>>>>> tool.
> > >>>>>>>>>>> Kernel is reliable. It doesn’t expose a config space field if the
> > >>>>>>>>>>> field doesn’t
> > >>>>>>>>>> exist regardless of field should have default or no default.
> > >>>>>>>>>> so when you know it is one queue pair, you should answer one, not try
> > >>>>>>>>>> to guess.
> > >>>>>>>>>>> User space should not guess either. User space gets to see if _MQ
> > >>>>>>>>>> present/not present. If _MQ present than get reliable data from kernel.
> > >>>>>>>>>>> If _MQ not present, it means this device has one VQ pair.
> > >>>>>>>>>> it is still a guess, right? And all user space tools implemented this
> > >>>>>>>>>> feature need to guess
> > >>>>>>>>> No. it is not a guess.
> > >>>>>>>>> It is explicitly checking the _MQ feature and deriving the value.
> > >>>>>>>>> The code you proposed will be present in the user space.
> > >>>>>>>>> It will be uniform for _MQ and 10 other features that are present now and
> > >>>>>>>> in the future.
> > >>>>>>>> MQ and other features like RSS are different. If there is no _RSS_XX, there
> > >>>>>>>> are no attributes like max_rss_key_size, and there is not a default value.
> > >>>>>>>> But for MQ, we know it has to be 1 without _MQ.
> > >>>>>>> "we" = user space.
> > >>>>>>> To keep the consistency among all the config space fields.
> > >>>>>> Actually I looked and the code some more and I'm puzzled:
> > >>>>>>
> > >>>>>>
> > >>>>>>          struct virtio_net_config config = {};
> > >>>>>>          u64 features;
> > >>>>>>          u16 val_u16;
> > >>>>>>
> > >>>>>>          vdpa_get_config_unlocked(vdev, 0, &config, sizeof(config));
> > >>>>>>
> > >>>>>>          if (nla_put(msg, VDPA_ATTR_DEV_NET_CFG_MACADDR, sizeof(config.mac),
> > >>>>>>                      config.mac))
> > >>>>>>                  return -EMSGSIZE;
> > >>>>>>
> > >>>>>>
> > >>>>>> Mac returned even without VIRTIO_NET_F_MAC
> > >>>>>>
> > >>>>>>
> > >>>>>>          val_u16 = le16_to_cpu(config.status);
> > >>>>>>          if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_STATUS, val_u16))
> > >>>>>>                  return -EMSGSIZE;
> > >>>>>>
> > >>>>>>
> > >>>>>> status returned even without VIRTIO_NET_F_STATUS
> > >>>>>>
> > >>>>>>          val_u16 = le16_to_cpu(config.mtu);
> > >>>>>>          if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MTU, val_u16))
> > >>>>>>                  return -EMSGSIZE;
> > >>>>>>
> > >>>>>>
> > >>>>>> MTU returned even without VIRTIO_NET_F_MTU
> > >>>>>>
> > >>>>>>
> > >>>>>> What's going on here?
> > >>>>> Probably too late to fix, but this should be fine as long as all
> > >>>>> parents support STATUS/MTU/MAC.
> > >>>> Why is this too late to fix.
> > >>> If we make this conditional on the features. This may break the
> > >>> userspace that always expects VDPA_ATTR_DEV_NET_CFG_MTU?
> > >>>
> > >>> Thanks
> > >> Well only on devices without MTU. I'm saying said userspace
> > >> was reading trash on such devices anyway.
> > > It depends on the parent actually. For example, mlx5 query the lower
> > > mtu unconditionally:
> > >
> > >          err = query_mtu(mdev, &mtu);
> > >          if (err)
> > >                  goto err_alloc;
> > >
> > >          ndev->config.mtu = cpu_to_mlx5vdpa16(mvdev, mtu);
> > >
> > > Supporting MTU features seems to be a must for real hardware.
> > > Otherwise the driver may not work correctly.
> > >
> > >> We don't generally maintain bug for bug compatibility on a whim,
> > >> only if userspace is actually known to break if we fix a bug.
> > >   So I think it should be fine to make this conditional then we should
> > > have a consistent handling of other fields like MQ.
> > For some fields that have a default value, like MQ =1, we can return the
> > default value.
> > For other fields without a default value, like MAC, we return nothing.
> >
> > Does this sounds good? So, for MTU, if without _F_MTU, I think we can
> > return 1500 by default.
> 
> Or we can just read MTU from the device.
> 
> But It looks to me Michael wants it conditional.
> 
> Thanks

I'm fine either way but let's keep it consistent. And I think
Parav wants it conditional.

> >
> > Thanks,
> > Zhu Lingshan
> > >
> > > Thanks
> > >
> > >>
> > >>>>> I wonder if we can add a check in the core and fail the device
> > >>>>> registration in this case.
> > >>>>>
> > >>>>> Thanks
> > >>>>>
> > >>>>>>
> > >>>>>> --
> > >>>>>> MST
> > >>>>>>
> >


^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [PATCH V3 4/6] vDPA: !FEATURES_OK should not block querying device config space
  2022-07-28  2:06           ` Jason Wang
@ 2022-07-28  7:08             ` Si-Wei Liu
  2022-07-28  7:36               ` Jason Wang
  2022-07-28 11:35               ` [PATCH V3 4/6] vDPA: !FEATURES_OK should not block querying device config space Michael S. Tsirkin
  0 siblings, 2 replies; 113+ messages in thread
From: Si-Wei Liu @ 2022-07-28  7:08 UTC (permalink / raw)
  To: Jason Wang, Zhu, Lingshan, Parav Pandit, mst, Eli Cohen
  Cc: netdev, xieyongji, gautam.dawar, virtualization



On 7/27/2022 7:06 PM, Jason Wang wrote:
>
> 在 2022/7/28 08:56, Si-Wei Liu 写道:
>>
>>
>> On 7/27/2022 4:47 AM, Zhu, Lingshan wrote:
>>>
>>>
>>> On 7/27/2022 5:43 PM, Si-Wei Liu wrote:
>>>> Sorry to chime in late in the game. For some reason I couldn't get 
>>>> to most emails for this discussion (I only subscribed to the 
>>>> virtualization list), while I was taking off amongst the past few 
>>>> weeks.
>>>>
>>>> It looks to me this patch is incomplete. Noted down the way in 
>>>> vdpa_dev_net_config_fill(), we have the following:
>>>>          features = vdev->config->get_driver_features(vdev);
>>>>          if (nla_put_u64_64bit(msg, 
>>>> VDPA_ATTR_DEV_NEGOTIATED_FEATURES, features,
>>>>                                VDPA_ATTR_PAD))
>>>>                  return -EMSGSIZE;
>>>>
>>>> Making call to .get_driver_features() doesn't make sense when 
>>>> feature negotiation isn't complete. Neither should present 
>>>> negotiated_features to userspace before negotiation is done.
>>>>
>>>> Similarly, max_vqp through vdpa_dev_net_mq_config_fill() probably 
>>>> should not show before negotiation is done - it depends on driver 
>>>> features negotiated.
>>> I have another patch in this series introduces device_features and 
>>> will report device_features to the userspace even features 
>>> negotiation not done. Because the spec says we should allow driver 
>>> access the config space before FEATURES_OK.
>> The config space can be accessed by guest before features_ok doesn't 
>> necessarily mean the value is valid. 
>
>
> It's valid as long as the device offers the feature:
>
> "The device MUST allow reading of any device-specific configuration 
> field before FEATURES_OK is set by the driver. This includes fields 
> which are conditional on feature bits, as long as those feature bits 
> are offered by the device."
I guess this statement only conveys that the field in config space can
be read before FEATURES_OK is set, though it does not *explicitly*
state the validity of the field.

And looking at:

"The mac address field always exists (though is only valid if 
VIRTIO_NET_F_MAC is set), and status only exists if VIRTIO_NET_F_STATUS 
is set."

It appears to me there's a border line set between "exist" and "valid".
If I understand the spec wording correctly, a spec-conforming device
implementation may or may not offer a valid status value in the config
space when VIRTIO_NET_F_STATUS is offered but before the feature is
negotiated. On the other hand, config space should contain a valid mac
address the moment the VIRTIO_NET_F_MAC feature is offered, regardless
of whether it is negotiated. By that, there seems to be leeway for the
device implementation to decide when a config space field becomes valid,
though for most of QEMU's software virtio devices, a valid value is
present in config space from the very first moment the feature is
offered.

"If the VIRTIO_NET_F_MAC feature bit is set, the configuration space mac 
entry indicates the “physical” address of the network card, otherwise 
the driver would typically generate a random local MAC address."
"If the VIRTIO_NET_F_STATUS feature bit is negotiated, the link status 
comes from the bottom bit of status. Otherwise, the driver assumes it’s 
active."

And there are also special cases where the read of a specific
configuration space field MUST be deferred until FEATURES_OK is set:

"If the VIRTIO_BLK_F_CONFIG_WCE feature is negotiated, the cache mode 
can be read or set through the writeback field. 0 corresponds to a 
writethrough cache, 1 to a writeback cache. The cache mode after reset
can be either writeback or writethrough. The actual mode can be 
determined by reading writeback after feature negotiation."
"The driver MUST NOT read writeback before setting the FEATURES_OK 
device status bit."
"If VIRTIO_BLK_F_CONFIG_WCE is negotiated but VIRTIO_BLK_F_FLUSH is not, 
the device MUST initialize writeback to 0."

Since the spec doesn't explicitly mandate the validity of each config
space field when the feature of concern is offered, to be safer we'd
have to live with odd device implementations. I know for sure QEMU
software devices won't behave oddly in 99% of these cases, but that's
not what is currently defined in the spec.
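
To illustrate the "exist" versus "valid" distinction drawn above, here
is a minimal C sketch of a per-field validity rule. It encodes the
reading given in this mail, which is disputed later in the thread, and
is not an API from the kernel or from the spec: enum field_rule and
field_is_valid() are hypothetical names. Only the FEATURES_OK status
value and the feature bit numbers used in the comments come from virtio.

```c
#include <stdbool.h>
#include <stdint.h>

#define VIRTIO_CONFIG_S_FEATURES_OK 8  /* device status bit */

/* Hypothetical classification of when a config field may be trusted. */
enum field_rule {
	FIELD_VALID_IF_OFFERED,     /* e.g. mac with VIRTIO_NET_F_MAC (bit 5) */
	FIELD_VALID_IF_NEGOTIATED,  /* e.g. writeback with
				     * VIRTIO_BLK_F_CONFIG_WCE (bit 11) */
};

static bool field_is_valid(enum field_rule rule, unsigned int feature_bit,
			   uint64_t device_features,
			   uint64_t driver_features, uint8_t status)
{
	uint64_t mask = 1ULL << feature_bit;

	switch (rule) {
	case FIELD_VALID_IF_OFFERED:
		/* Offering the feature is enough; negotiation is ignored. */
		return (device_features & mask) != 0;
	case FIELD_VALID_IF_NEGOTIATED:
		/* Needs FEATURES_OK and the bit accepted by the driver. */
		return (status & VIRTIO_CONFIG_S_FEATURES_OK) != 0 &&
		       (device_features & driver_features & mask) != 0;
	}
	return false;
}
```

Under this sketch a mac field is trusted as soon as bit 5 is offered,
while writeback is rejected until the driver has set FEATURES_OK with
bit 11 accepted.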

>
>
>> You may want to double check with Michael for what he quoted earlier:
>>> Nope:
>>>
>>> 2.5.1  Driver Requirements: Device Configuration Space
>>>
>>> ...
>>>
>>> For optional configuration space fields, the driver MUST check that 
>>> the corresponding feature is offered
>>> before accessing that part of the configuration space.
>>
>> and how many driver bugs taking wrong assumption of the validity of 
>> config space field without features_ok. I am not sure what use case 
>> you want to expose config register values for before features_ok, if
>> it's mostly for live migration I guess it's probably heading a wrong 
>> direction.
>
>
> I guess it's not for migration. 
Then what's the other possible use case, apart from live migration, for
exposing config space values? Troubleshooting a config space discrepancy
between vDPA and the emulated virtio device in userspace? Or tracking
changes in config space across feature negotiation, but for what? It'd
be beneficial to the interface design if the specific use case can be
clearly described...


> For migration, a provision with the correct features/capability would 
> be sufficient.
Right, that's what I thought too. It doesn't need to expose config space 
values, simply exporting all attributes for vdpa device creation will do 
the work.

-Siwei

>
> Thanks
>
>
>>
>>
>>>>
>>>>
>>>> Last but not the least, this "vdpa dev config" command was not 
>>>> designed to display the real config space register values in the 
>>>> first place. Quoting the vdpa-dev(8) man page:
>>>>
>>>>> vdpa dev config show - Show configuration of specific device or 
>>>>> all devices.
>>>>> DEV - specifies the vdpa device to show its configuration. If this 
>>>>> argument is omitted all devices configuration is listed.
>>>> It doesn't say anything about configuration space or register 
>>>> values in config space. As long as it can convey the config 
>>>> attribute when instantiating vDPA device instance, and more 
>>>> importantly, the config can be easily imported from or exported to 
>>>> userspace tools when trying to reconstruct vdpa instance intact on 
>>>> destination host for live migration, IMHO in my personal 
>>>> interpretation it doesn't matter what the config space may present. 
>>>> It may be worth while adding a new debug command to expose the real 
>>>> register value, but that's another story.
>>> I am not sure getting your points. vDPA now reports device feature 
>>> bits(device_features) and negotiated feature bits(driver_features), 
>>> and yes, the drivers features can be a subset of the device 
>>> features; and the vDPA device features can be a subset of the 
>>> management device features.
>> What I said is after unblocking the conditional check, you'd have to 
>> handle the case for each of the vdpa attribute when feature 
>> negotiation is not yet done: basically the register values you got 
>> from config space via the vdpa_get_config_unlocked() call is not 
>> considered to be valid before features_ok (per-spec). Although in 
>> some case you may get sane value, such behavior is generally 
>> undefined. If you desire to show just the device_features alone 
>> without any config space field, which the device had advertised 
>> *before feature negotiation is complete*, that'll be fine. But looks 
>> to me this is not how patch has been implemented. Probably need some 
>> more work?
>>
>> Regards,
>> -Siwei
>>
>>>>
>>>> Having said, please consider to drop the Fixes tag, as appears to 
>>>> me you're proposing a new feature rather than fixing a real issue.
>>> it's a new feature to report the device feature bits than only 
>>> negotiated features, however this patch is a must, or it will block 
>>> the device feature bits reporting. but I agree, the fix tag is not a 
>>> must.
>>>>
>>>> Thanks,
>>>> -Siwei
>>>>
>>>> On 7/1/2022 3:12 PM, Parav Pandit via Virtualization wrote:
>>>>>> From: Zhu Lingshan<lingshan.zhu@intel.com>
>>>>>> Sent: Friday, July 1, 2022 9:28 AM
>>>>>>
>>>>>> Users may want to query the config space of a vDPA device, to 
>>>>>> choose a
>>>>>> appropriate one for a certain guest. This means the users need to 
>>>>>> read the
>>>>>> config space before FEATURES_OK, and the existence of config space
>>>>>> contents does not depend on FEATURES_OK.
>>>>>>
>>>>>> The spec says:
>>>>>> The device MUST allow reading of any device-specific 
>>>>>> configuration field
>>>>>> before FEATURES_OK is set by the driver. This includes fields 
>>>>>> which are
>>>>>> conditional on feature bits, as long as those feature bits are 
>>>>>> offered by the
>>>>>> device.
>>>>>>
>>>>>> Fixes: 30ef7a8ac8a07 (vdpa: Read device configuration only if 
>>>>>> FEATURES_OK)
>>>>> Fix is fine, but fixes tag needs correction described below.
>>>>>
>>>>> Above commit id is 13 letters should be 12.
>>>>> And
>>>>> It should be in format
>>>>> Fixes: 30ef7a8ac8a0 ("vdpa: Read device configuration only if 
>>>>> FEATURES_OK")
>>>>>
>>>>> Please use checkpatch.pl script before posting the patches to 
>>>>> catch these errors.
>>>>> There is a bot that looks at the fixes tag and identifies the 
>>>>> right kernel version to apply this fix.
>>>>>
>>>>>> Signed-off-by: Zhu Lingshan<lingshan.zhu@intel.com>
>>>>>> ---
>>>>>>   drivers/vdpa/vdpa.c | 8 --------
>>>>>>   1 file changed, 8 deletions(-)
>>>>>>
>>>>>> diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c index
>>>>>> 9b0e39b2f022..d76b22b2f7ae 100644
>>>>>> --- a/drivers/vdpa/vdpa.c
>>>>>> +++ b/drivers/vdpa/vdpa.c
>>>>>> @@ -851,17 +851,9 @@ vdpa_dev_config_fill(struct vdpa_device *vdev,
>>>>>> struct sk_buff *msg, u32 portid,  {
>>>>>>       u32 device_id;
>>>>>>       void *hdr;
>>>>>> -    u8 status;
>>>>>>       int err;
>>>>>>
>>>>>>       down_read(&vdev->cf_lock);
>>>>>> -    status = vdev->config->get_status(vdev);
>>>>>> -    if (!(status & VIRTIO_CONFIG_S_FEATURES_OK)) {
>>>>>> -        NL_SET_ERR_MSG_MOD(extack, "Features negotiation not
>>>>>> completed");
>>>>>> -        err = -EAGAIN;
>>>>>> -        goto out;
>>>>>> -    }
>>>>>> -
>>>>>>       hdr = genlmsg_put(msg, portid, seq, &vdpa_nl_family, flags,
>>>>>>                 VDPA_CMD_DEV_CONFIG_GET);
>>>>>>       if (!hdr) {
>>>>>> -- 
>>>>>> 2.31.1
>>>>> _______________________________________________
>>>>> Virtualization mailing list
>>>>> Virtualization@lists.linux-foundation.org
>>>>> https://lists.linuxfoundation.org/mailman/listinfo/virtualization
>>>>
>>>>
>>>
>>
>


^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [PATCH V3 4/6] vDPA: !FEATURES_OK should not block querying device config space
  2022-07-28  7:08             ` Si-Wei Liu
@ 2022-07-28  7:36               ` Jason Wang
  2022-07-28  7:44                 ` Zhu, Lingshan
       [not found]                 ` <2dfff5f3-3100-4a63-6da3-3e3d21ffb364@oracle.com>
  2022-07-28 11:35               ` [PATCH V3 4/6] vDPA: !FEATURES_OK should not block querying device config space Michael S. Tsirkin
  1 sibling, 2 replies; 113+ messages in thread
From: Jason Wang @ 2022-07-28  7:36 UTC (permalink / raw)
  To: Si-Wei Liu
  Cc: Zhu, Lingshan, Parav Pandit, mst, Eli Cohen, netdev, xieyongji,
	gautam.dawar, virtualization

On Thu, Jul 28, 2022 at 3:09 PM Si-Wei Liu <si-wei.liu@oracle.com> wrote:
>
>
>
> On 7/27/2022 7:06 PM, Jason Wang wrote:
> >
> > 在 2022/7/28 08:56, Si-Wei Liu 写道:
> >>
> >>
> >> On 7/27/2022 4:47 AM, Zhu, Lingshan wrote:
> >>>
> >>>
> >>> On 7/27/2022 5:43 PM, Si-Wei Liu wrote:
> >>>> Sorry to chime in late in the game. For some reason I couldn't get
> >>>> to most emails for this discussion (I only subscribed to the
> >>>> virtualization list), while I was taking off amongst the past few
> >>>> weeks.
> >>>>
> >>>> It looks to me this patch is incomplete. Noted down the way in
> >>>> vdpa_dev_net_config_fill(), we have the following:
> >>>>          features = vdev->config->get_driver_features(vdev);
> >>>>          if (nla_put_u64_64bit(msg,
> >>>> VDPA_ATTR_DEV_NEGOTIATED_FEATURES, features,
> >>>>                                VDPA_ATTR_PAD))
> >>>>                  return -EMSGSIZE;
> >>>>
> >>>> Making call to .get_driver_features() doesn't make sense when
> >>>> feature negotiation isn't complete. Neither should present
> >>>> negotiated_features to userspace before negotiation is done.
> >>>>
> >>>> Similarly, max_vqp through vdpa_dev_net_mq_config_fill() probably
> >>>> should not show before negotiation is done - it depends on driver
> >>>> features negotiated.
> >>> I have another patch in this series introduces device_features and
> >>> will report device_features to the userspace even features
> >>> negotiation not done. Because the spec says we should allow driver
> >>> access the config space before FEATURES_OK.
> >> The config space can be accessed by guest before features_ok doesn't
> >> necessarily mean the value is valid.
> >
> >
> > It's valid as long as the device offers the feature:
> >
> > "The device MUST allow reading of any device-specific configuration
> > field before FEATURES_OK is set by the driver. This includes fields
> > which are conditional on feature bits, as long as those feature bits
> > are offered by the device."
> I guess this statement only conveys that the field in config space can
> be read before FEATURES_OK is set, though it does not *explicitly*
> states the validity of field.

My understanding is that it should be valid as long as the device
offers the feature.

For example, if _MQ is offered by the device, max_virtqueue_pairs is
always valid and can be read by the driver no matter whether _MQ is
negotiated.
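
A minimal C sketch of that reading, for illustration only: the field is
gated on the feature being offered by the device, and the negotiated
(driver) features are deliberately not consulted. read_max_vq_pairs() is
a hypothetical helper, not a function in the vDPA core, and it assumes
the caller already holds the two raw little-endian bytes of the
max_virtqueue_pairs field.

```c
#include <stdbool.h>
#include <stdint.h>

#define VIRTIO_NET_F_MQ 22  /* virtio-net multiqueue feature bit */

/*
 * Decode max_virtqueue_pairs from its little-endian config space bytes.
 * Returns false, leaving *out untouched, when _MQ is not offered.
 */
static bool read_max_vq_pairs(uint64_t device_features,
			      const uint8_t le_bytes[2], uint16_t *out)
{
	if (!(device_features & (1ULL << VIRTIO_NET_F_MQ)))
		return false;

	*out = (uint16_t)(le_bytes[0] | (le_bytes[1] << 8));
	return true;
}
```

Because the check uses only device_features, the result is the same
before and after FEATURES_OK.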

>
> And looking at:
>
> "The mac address field always exists (though is only valid if
> VIRTIO_NET_F_MAC is set), and status only exists if VIRTIO_NET_F_STATUS
> is set."
>
> It appears to me there's a border line set between "exist" and "valid".
> If I understand the spec wording correctly, a spec-conforming device
> implementation may or may not offer valid status value in the config
> space when VIRTIO_NET_F_STATUS is offered, but before the feature is
> negotiated.

That's not what I read, maybe Michael can clarify this.

> On the other hand, config space should contain valid mac
> address the moment VIRTIO_NET_F_MAC feature is offered, regardless being
> negotiated or not.

I agree here.

> By that, there seems to be leeway for the device
> implementation to decide when config space field may become valid,
> though for most of QEMU's software virtio devices, valid value is
> present to config space the very first moment when feature is offered.
>
> "If the VIRTIO_NET_F_MAC feature bit is set, the configuration space mac
> entry indicates the “physical” address of the network card, otherwise
> the driver would typically generate a random local MAC address."
> "If the VIRTIO_NET_F_STATUS feature bit is negotiated, the link status
> comes from the bottom bit of status. Otherwise, the driver assumes it’s
> active."

This is mostly how drivers that don't support _F_STATUS work.

>
> And also there are special cases where the read of specific
> configuration space field MUST be deferred to until FEATURES_OK is set:
>
> "If the VIRTIO_BLK_F_CONFIG_WCE feature is negotiated, the cache mode
> can be read or set through the writeback field. 0 corresponds to a
> > writethrough cache, 1 to a writeback cache. The cache mode after reset
> can be either writeback or writethrough. The actual mode can be
> determined by reading writeback after feature negotiation."
> "The driver MUST NOT read writeback before setting the FEATURES_OK
> device status bit."

This seems to conflict with the normatives I quoted above, and I don't
get why we need this.

> "If VIRTIO_BLK_F_CONFIG_WCE is negotiated but VIRTIO_BLK_F_FLUSH is not,
> the device MUST initialize writeback to 0."
>
> Since the spec doesn't explicitly mandate the validity of each config
> space field when feature of concern is offered, to be safer we'd have to
> live with odd device implementation. I know for sure QEMU software
> devices won't for 99% of these cases, but that's not what is currently
> defined in the spec.
>
> >
> >
> >> You may want to double check with Michael for what he quoted earlier:
> >>> Nope:
> >>>
> >>> 2.5.1  Driver Requirements: Device Configuration Space
> >>>
> >>> ...
> >>>
> >>> For optional configuration space fields, the driver MUST check that
> >>> the corresponding feature is offered
> >>> before accessing that part of the configuration space.
> >>
> >> and how many driver bugs taking wrong assumption of the validity of
> >> config space field without features_ok. I am not sure what use case
> > >> you want to expose config register values for before features_ok, if
> >> it's mostly for live migration I guess it's probably heading a wrong
> >> direction.
> >
> >
> > I guess it's not for migration.
> Then what's the other possible use case than live migration, were to
> expose config space values? Troubleshooting config space discrepancy
> between vDPA and the emulated virtio device in userspace? Or tracking
> changes in config space across feature negotiation, but for what? It'd
> be beneficial to the interface design if the specific use case can be
> clearly described...

Monitoring or debugging I guess.

Thanks

>
>
> > For migration, a provision with the correct features/capability would
> > be sufficient.
> Right, that's what I thought too. It doesn't need to expose config space
> values, simply exporting all attributes for vdpa device creation will do
> the work.
>
> -Siwei
>
> >
> > Thanks
> >
> >
> >>
> >>
> >>>>
> >>>>
> >>>> Last but not the least, this "vdpa dev config" command was not
> >>>> designed to display the real config space register values in the
> >>>> first place. Quoting the vdpa-dev(8) man page:
> >>>>
> >>>>> vdpa dev config show - Show configuration of specific device or
> >>>>> all devices.
> >>>>> DEV - specifies the vdpa device to show its configuration. If this
> >>>>> argument is omitted all devices configuration is listed.
> >>>> It doesn't say anything about configuration space or register
> >>>> values in config space. As long as it can convey the config
> >>>> attribute when instantiating vDPA device instance, and more
> >>>> importantly, the config can be easily imported from or exported to
> >>>> userspace tools when trying to reconstruct vdpa instance intact on
> >>>> destination host for live migration, IMHO in my personal
> >>>> interpretation it doesn't matter what the config space may present.
> >>>> It may be worth while adding a new debug command to expose the real
> >>>> register value, but that's another story.
> >>> I am not sure getting your points. vDPA now reports device feature
> >>> bits(device_features) and negotiated feature bits(driver_features),
> >>> and yes, the drivers features can be a subset of the device
> >>> features; and the vDPA device features can be a subset of the
> >>> management device features.
> >> What I said is after unblocking the conditional check, you'd have to
> >> handle the case for each of the vdpa attribute when feature
> >> negotiation is not yet done: basically the register values you got
> >> from config space via the vdpa_get_config_unlocked() call is not
> >> considered to be valid before features_ok (per-spec). Although in
> >> some case you may get sane value, such behavior is generally
> >> undefined. If you desire to show just the device_features alone
> >> without any config space field, which the device had advertised
> >> *before feature negotiation is complete*, that'll be fine. But looks
> >> to me this is not how patch has been implemented. Probably need some
> >> more work?
> >>
> >> Regards,
> >> -Siwei
> >>
> >>>>
> >>>> Having said, please consider to drop the Fixes tag, as appears to
> >>>> me you're proposing a new feature rather than fixing a real issue.
> >>> it's a new feature to report the device feature bits than only
> >>> negotiated features, however this patch is a must, or it will block
> >>> the device feature bits reporting. but I agree, the fix tag is not a
> >>> must.
> >>>>
> >>>> Thanks,
> >>>> -Siwei
> >>>>
> >>>> On 7/1/2022 3:12 PM, Parav Pandit via Virtualization wrote:
> >>>>>> From: Zhu Lingshan<lingshan.zhu@intel.com>
> >>>>>> Sent: Friday, July 1, 2022 9:28 AM
> >>>>>>
> >>>>>> Users may want to query the config space of a vDPA device, to
> >>>>>> choose a
> >>>>>> appropriate one for a certain guest. This means the users need to
> >>>>>> read the
> >>>>>> config space before FEATURES_OK, and the existence of config space
> >>>>>> contents does not depend on FEATURES_OK.
> >>>>>>
> >>>>>> The spec says:
> >>>>>> The device MUST allow reading of any device-specific
> >>>>>> configuration field
> >>>>>> before FEATURES_OK is set by the driver. This includes fields
> >>>>>> which are
> >>>>>> conditional on feature bits, as long as those feature bits are
> >>>>>> offered by the
> >>>>>> device.
> >>>>>>
> >>>>>> Fixes: 30ef7a8ac8a07 (vdpa: Read device configuration only if
> >>>>>> FEATURES_OK)
> >>>>> Fix is fine, but fixes tag needs correction described below.
> >>>>>
> >>>>> Above commit id is 13 letters should be 12.
> >>>>> And
> >>>>> It should be in format
> >>>>> Fixes: 30ef7a8ac8a0 ("vdpa: Read device configuration only if
> >>>>> FEATURES_OK")
> >>>>>
> >>>>> Please use checkpatch.pl script before posting the patches to
> >>>>> catch these errors.
> >>>>> There is a bot that looks at the fixes tag and identifies the
> >>>>> right kernel version to apply this fix.
> >>>>>
> >>>>>> Signed-off-by: Zhu Lingshan<lingshan.zhu@intel.com>
> >>>>>> ---
> >>>>>>   drivers/vdpa/vdpa.c | 8 --------
> >>>>>>   1 file changed, 8 deletions(-)
> >>>>>>
> >>>>>> diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c
> >>>>>> index 9b0e39b2f022..d76b22b2f7ae 100644
> >>>>>> --- a/drivers/vdpa/vdpa.c
> >>>>>> +++ b/drivers/vdpa/vdpa.c
> >>>>>> @@ -851,17 +851,9 @@ vdpa_dev_config_fill(struct vdpa_device *vdev, struct sk_buff *msg, u32 portid,
> >>>>>>  {
> >>>>>>       u32 device_id;
> >>>>>>       void *hdr;
> >>>>>> -    u8 status;
> >>>>>>       int err;
> >>>>>>
> >>>>>>       down_read(&vdev->cf_lock);
> >>>>>> -    status = vdev->config->get_status(vdev);
> >>>>>> -    if (!(status & VIRTIO_CONFIG_S_FEATURES_OK)) {
> >>>>>> -        NL_SET_ERR_MSG_MOD(extack, "Features negotiation not completed");
> >>>>>> -        err = -EAGAIN;
> >>>>>> -        goto out;
> >>>>>> -    }
> >>>>>> -
> >>>>>>       hdr = genlmsg_put(msg, portid, seq, &vdpa_nl_family, flags,
> >>>>>>                 VDPA_CMD_DEV_CONFIG_GET);
> >>>>>>       if (!hdr) {
> >>>>>> --
> >>>>>> 2.31.1
> >>>>> _______________________________________________
> >>>>> Virtualization mailing list
> >>>>> Virtualization@lists.linux-foundation.org
> >>>>> https://lists.linuxfoundation.org/mailman/listinfo/virtualization
> >>>>
> >>>>
> >>>
> >>
> >
>
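
For illustration only (not part of the original message): a minimal C sketch of the Fixes tag shape Parav asks for above, i.e. exactly 12 hex digits followed by the quoted subject in parentheses. fixes_tag_ok() is a hypothetical helper written for this note; it does not reproduce checkpatch.pl's actual logic.

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper: returns true if `line` looks like
 *   Fixes: <12 hex digits> ("<subject>")
 * This only mirrors the format described in the thread.
 */
static bool fixes_tag_ok(const char *line)
{
	static const char prefix[] = "Fixes: ";
	const size_t plen = sizeof(prefix) - 1;
	const char *rest;
	size_t rlen;
	size_t i;

	if (strncmp(line, prefix, plen) != 0)
		return false;

	/* Exactly 12 hex digits; a NUL byte fails isxdigit(), so no overread. */
	for (i = 0; i < 12; i++) {
		if (!isxdigit((unsigned char)line[plen + i]))
			return false;
	}

	/* The subject must be wrapped as ("...") and must not be empty. */
	rest = line + plen + 12;
	if (strncmp(rest, " (\"", 3) != 0)
		return false;

	rlen = strlen(rest);
	if (rlen < 6)
		return false;

	return strcmp(rest + rlen - 2, "\")") == 0;
}
```

Under these assumptions the tag as posted (13 digits, unquoted subject) is rejected, while the corrected form quoted by Parav is accepted.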


^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [PATCH V3 4/6] vDPA: !FEATURES_OK should not block querying device config space
  2022-07-28  7:36               ` Jason Wang
@ 2022-07-28  7:44                 ` Zhu, Lingshan
       [not found]                 ` <2dfff5f3-3100-4a63-6da3-3e3d21ffb364@oracle.com>
  1 sibling, 0 replies; 113+ messages in thread
From: Zhu, Lingshan @ 2022-07-28  7:44 UTC (permalink / raw)
  To: Jason Wang, Si-Wei Liu
  Cc: Parav Pandit, mst, Eli Cohen, netdev, xieyongji, gautam.dawar,
	virtualization



On 7/28/2022 3:36 PM, Jason Wang wrote:
> On Thu, Jul 28, 2022 at 3:09 PM Si-Wei Liu <si-wei.liu@oracle.com> wrote:
>>
>>
>> On 7/27/2022 7:06 PM, Jason Wang wrote:
>>> On 2022/7/28 08:56, Si-Wei Liu wrote:
>>>>
>>>> On 7/27/2022 4:47 AM, Zhu, Lingshan wrote:
>>>>>
>>>>> On 7/27/2022 5:43 PM, Si-Wei Liu wrote:
>>>>>> Sorry to chime in late in the game. For some reason I couldn't get
>>>>>> to most emails for this discussion (I only subscribed to the
>>>>>> virtualization list), while I was taking off amongst the past few
>>>>>> weeks.
>>>>>>
>>>>>> It looks to me this patch is incomplete. Noted down the way in
>>>>>> vdpa_dev_net_config_fill(), we have the following:
>>>>>>           features = vdev->config->get_driver_features(vdev);
>>>>>>           if (nla_put_u64_64bit(msg,
>>>>>> VDPA_ATTR_DEV_NEGOTIATED_FEATURES, features,
>>>>>>                                 VDPA_ATTR_PAD))
>>>>>>                   return -EMSGSIZE;
>>>>>>
>>>>>> Making call to .get_driver_features() doesn't make sense when
>>>>>> feature negotiation isn't complete. Neither should present
>>>>>> negotiated_features to userspace before negotiation is done.
>>>>>>
>>>>>> Similarly, max_vqp through vdpa_dev_net_mq_config_fill() probably
>>>>>> should not show before negotiation is done - it depends on driver
>>>>>> features negotiated.
>>>>> I have another patch in this series introduces device_features and
>>>>> will report device_features to the userspace even features
>>>>> negotiation not done. Because the spec says we should allow driver
>>>>> access the config space before FEATURES_OK.
>>>> The config space can be accessed by guest before features_ok doesn't
>>>> necessarily mean the value is valid.
>>>
>>> It's valid as long as the device offers the feature:
>>>
>>> "The device MUST allow reading of any device-specific configuration
>>> field before FEATURES_OK is set by the driver. This includes fields
>>> which are conditional on feature bits, as long as those feature bits
>>> are offered by the device."
>> I guess this statement only conveys that the field in config space can
>> be read before FEATURES_OK is set, though it does not *explicitly*
>> states the validity of field.
> My understanding is that it should be valid as long as the device
> offers the feature.
>
> For example, if _MQ is offered by device, the max_virt_queue_pairs is
> always valid and can be read from the driver no matter whether _MQ is
> negotiated.
Agreed, that's also my understanding.
>
>> And looking at:
>>
>> "The mac address field always exists (though is only valid if
>> VIRTIO_NET_F_MAC is set), and status only exists if VIRTIO_NET_F_STATUS
>> is set."
>>
>> It appears to me there's a border line set between "exist" and "valid".
>> If I understand the spec wording correctly, a spec-conforming device
>> implementation may or may not offer valid status value in the config
>> space when VIRTIO_NET_F_STATUS is offered, but before the feature is
>> negotiated.
> That's not what I read, maybe Michael can clarify this.
>
>> On the other hand, config space should contain valid mac
>> address the moment VIRTIO_NET_F_MAC feature is offered, regardless being
>> negotiated or not.
> I agree here.
>
>> By that, there seems to be leeway for the device
>> implementation to decide when config space field may become valid,
>> though for most of QEMU's software virtio devices, valid value is
>> present to config space the very first moment when feature is offered.
>>
>> "If the VIRTIO_NET_F_MAC feature bit is set, the configuration space mac
>> entry indicates the “physical” address of the network card, otherwise
>> the driver would typically generate a random local MAC address."
>> "If the VIRTIO_NET_F_STATUS feature bit is negotiated, the link status
>> comes from the bottom bit of status. Otherwise, the driver assumes it’s
>> active."
> This is mostly the way how drivers that don't support _F_STATUS work.
>
>> And also there are special cases where the read of specific
>> configuration space field MUST be deferred to until FEATURES_OK is set:
>>
>> "If the VIRTIO_BLK_F_CONFIG_WCE feature is negotiated, the cache mode
>> can be read or set through the writeback field. 0 corresponds to a
>> writethrough cache, 1 to a writeback cache. The cache mode after reset
>> can be either writeback or writethrough. The actual mode can be
>> determined by reading writeback after feature negotiation."
>> "The driver MUST NOT read writeback before setting the FEATURES_OK
>> device status bit."
> This seems to conflict with the normatives I quoted above, and I don't
> get why we need this.
>
>> "If VIRTIO_BLK_F_CONFIG_WCE is negotiated but VIRTIO_BLK_F_FLUSH is not,
>> the device MUST initialize writeback to 0."
>>
>> Since the spec doesn't explicitly mandate the validity of each config
>> space field when feature of concern is offered, to be safer we'd have to
>> live with odd device implementation. I know for sure QEMU software
>> devices won't for 99% of these cases, but that's not what is currently
>> defined in the spec.
>>
>>>
>>>> You may want to double check with Michael for what he quoted earlier:
>>>>> Nope:
>>>>>
>>>>> 2.5.1  Driver Requirements: Device Configuration Space
>>>>>
>>>>> ...
>>>>>
>>>>> For optional configuration space fields, the driver MUST check that
>>>>> the corresponding feature is offered
>>>>> before accessing that part of the configuration space.
>>>> and how many driver bugs taking wrong assumption of the validity of
>>>> config space field without features_ok. I am not sure what use case
>>>> you want to expose config register values before features_ok, if
>>>> it's mostly for live migration I guess it's probably heading a wrong
>>>> direction.
>>>
>>> I guess it's not for migration.
>> Then what's the other possible use case than live migration, were to
>> expose config space values? Troubleshooting config space discrepancy
>> between vDPA and the emulated virtio device in userspace? Or tracking
>> changes in config space across feature negotiation, but for what? It'd
>> be beneficial to the interface design if the specific use case can be
>> clearly described...
> Monitoring or debugging I guess.
>
> Thanks
>
>>
>>> For migration, a provision with the correct features/capability would
>>> be sufficient.
>> Right, that's what I thought too. It doesn't need to expose config space
>> values, simply exporting all attributes for vdpa device creation will do
>> the work.
>>
>> -Siwei
>>
>>> Thanks
>>>
>>>
>>>>
>>>>>>
>>>>>> Last but not the least, this "vdpa dev config" command was not
>>>>>> designed to display the real config space register values in the
>>>>>> first place. Quoting the vdpa-dev(8) man page:
>>>>>>
>>>>>>> vdpa dev config show - Show configuration of specific device or
>>>>>>> all devices.
>>>>>>> DEV - specifies the vdpa device to show its configuration. If this
>>>>>>> argument is omitted all devices configuration is listed.
>>>>>> It doesn't say anything about configuration space or register
>>>>>> values in config space. As long as it can convey the config
>>>>>> attribute when instantiating vDPA device instance, and more
>>>>>> importantly, the config can be easily imported from or exported to
>>>>>> userspace tools when trying to reconstruct vdpa instance intact on
>>>>>> destination host for live migration, IMHO in my personal
>>>>>> interpretation it doesn't matter what the config space may present.
>>>>>> It may be worth while adding a new debug command to expose the real
>>>>>> register value, but that's another story.
>>>>> I am not sure getting your points. vDPA now reports device feature
>>>>> bits(device_features) and negotiated feature bits(driver_features),
>>>>> and yes, the drivers features can be a subset of the device
>>>>> features; and the vDPA device features can be a subset of the
>>>>> management device features.
>>>> What I said is after unblocking the conditional check, you'd have to
>>>> handle the case for each of the vdpa attribute when feature
>>>> negotiation is not yet done: basically the register values you got
>>>> from config space via the vdpa_get_config_unlocked() call is not
>>>> considered to be valid before features_ok (per-spec). Although in
>>>> some case you may get sane value, such behavior is generally
>>>> undefined. If you desire to show just the device_features alone
>>>> without any config space field, which the device had advertised
>>>> *before feature negotiation is complete*, that'll be fine. But looks
>>>> to me this is not how patch has been implemented. Probably need some
>>>> more work?
>>>>
>>>> Regards,
>>>> -Siwei
>>>>
>>>>>> Having said, please consider to drop the Fixes tag, as appears to
>>>>>> me you're proposing a new feature rather than fixing a real issue.
>>>>> it's a new feature to report the device feature bits than only
>>>>> negotiated features, however this patch is a must, or it will block
>>>>> the device feature bits reporting. but I agree, the fix tag is not a
>>>>> must.
>>>>>> Thanks,
>>>>>> -Siwei
>>>>>>
>>>>>> On 7/1/2022 3:12 PM, Parav Pandit via Virtualization wrote:
>>>>>>>> From: Zhu Lingshan<lingshan.zhu@intel.com>
>>>>>>>> Sent: Friday, July 1, 2022 9:28 AM
>>>>>>>>
>>>>>>>> Users may want to query the config space of a vDPA device, to
>>>>>>>> choose a
>>>>>>>> appropriate one for a certain guest. This means the users need to
>>>>>>>> read the
>>>>>>>> config space before FEATURES_OK, and the existence of config space
>>>>>>>> contents does not depend on FEATURES_OK.
>>>>>>>>
>>>>>>>> The spec says:
>>>>>>>> The device MUST allow reading of any device-specific
>>>>>>>> configuration field
>>>>>>>> before FEATURES_OK is set by the driver. This includes fields
>>>>>>>> which are
>>>>>>>> conditional on feature bits, as long as those feature bits are
>>>>>>>> offered by the
>>>>>>>> device.
>>>>>>>>
>>>>>>>> Fixes: 30ef7a8ac8a07 (vdpa: Read device configuration only if
>>>>>>>> FEATURES_OK)
>>>>>>> Fix is fine, but fixes tag needs correction described below.
>>>>>>>
>>>>>>> Above commit id is 13 letters should be 12.
>>>>>>> And
>>>>>>> It should be in format
>>>>>>> Fixes: 30ef7a8ac8a0 ("vdpa: Read device configuration only if
>>>>>>> FEATURES_OK")
>>>>>>>
>>>>>>> Please use checkpatch.pl script before posting the patches to
>>>>>>> catch these errors.
>>>>>>> There is a bot that looks at the fixes tag and identifies the
>>>>>>> right kernel version to apply this fix.
>>>>>>>
>>>>>>>> Signed-off-by: Zhu Lingshan<lingshan.zhu@intel.com>
>>>>>>>> ---
>>>>>>>>    drivers/vdpa/vdpa.c | 8 --------
>>>>>>>>    1 file changed, 8 deletions(-)
>>>>>>>>
>>>>>>>> diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c
>>>>>>>> index 9b0e39b2f022..d76b22b2f7ae 100644
>>>>>>>> --- a/drivers/vdpa/vdpa.c
>>>>>>>> +++ b/drivers/vdpa/vdpa.c
>>>>>>>> @@ -851,17 +851,9 @@ vdpa_dev_config_fill(struct vdpa_device *vdev, struct sk_buff *msg, u32 portid,
>>>>>>>>  {
>>>>>>>>        u32 device_id;
>>>>>>>>        void *hdr;
>>>>>>>> -    u8 status;
>>>>>>>>        int err;
>>>>>>>>
>>>>>>>>        down_read(&vdev->cf_lock);
>>>>>>>> -    status = vdev->config->get_status(vdev);
>>>>>>>> -    if (!(status & VIRTIO_CONFIG_S_FEATURES_OK)) {
>>>>>>>> -        NL_SET_ERR_MSG_MOD(extack, "Features negotiation not completed");
>>>>>>>> -        err = -EAGAIN;
>>>>>>>> -        goto out;
>>>>>>>> -    }
>>>>>>>> -
>>>>>>>>        hdr = genlmsg_put(msg, portid, seq, &vdpa_nl_family, flags,
>>>>>>>>                  VDPA_CMD_DEV_CONFIG_GET);
>>>>>>>>        if (!hdr) {
>>>>>>>> --
>>>>>>>> 2.31.1
>>>>>>> _______________________________________________
>>>>>>> Virtualization mailing list
>>>>>>> Virtualization@lists.linux-foundation.org
>>>>>>> https://lists.linuxfoundation.org/mailman/listinfo/virtualization
>>>>>>
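
For illustration only (not part of the original message): a minimal sketch of the per-attribute gating Si-Wei suggests above, assuming device_features may always be reported while negotiated features are reported only once FEATURES_OK is set. struct cfg_report and plan_report() are hypothetical names; only the VIRTIO_CONFIG_S_FEATURES_OK value (8) comes from the virtio spec.

```c
#include <stdbool.h>
#include <stdint.h>

/* Device status bit as defined by the virtio spec. */
#define VIRTIO_CONFIG_S_FEATURES_OK 8

/* Hypothetical: which attributes a config query would report. */
struct cfg_report {
	bool show_device_features;     /* offered by the device */
	bool show_negotiated_features; /* acked by the driver */
};

/* Decide per attribute instead of failing the whole query with -EAGAIN. */
static struct cfg_report plan_report(uint8_t status)
{
	struct cfg_report r;

	/* Offered features exist before negotiation, so always report them. */
	r.show_device_features = true;

	/* Negotiated features only mean something after FEATURES_OK. */
	r.show_negotiated_features =
		(status & VIRTIO_CONFIG_S_FEATURES_OK) != 0;

	return r;
}
```

This is only one possible reading of the discussion; whether conditional config fields such as max_virtqueue_pairs belong in the first or the second group is exactly what the thread is debating.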


^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: spec clarification (was Re: [PATCH V3 4/6] vDPA: !FEATURES_OK should not block querying device config space)
       [not found]                 ` <2dfff5f3-3100-4a63-6da3-3e3d21ffb364@oracle.com>
@ 2022-07-28 11:28                   ` Michael S. Tsirkin
  0 siblings, 0 replies; 113+ messages in thread
From: Michael S. Tsirkin @ 2022-07-28 11:28 UTC (permalink / raw)
  To: Si-Wei Liu
  Cc: Jason Wang, Zhu, Lingshan, Parav Pandit, Eli Cohen, netdev,
	xieyongji, gautam.dawar, virtualization, virtio-comment, virtio,
	Paolo Bonzini

On Thu, Jul 28, 2022 at 01:20:26AM -0700, Si-Wei Liu wrote:
> Hi Michael,
> 
> Could you please comment on the difference between "exist" and "valid" in
> the spec? Having seen quite a few related discussions about MTU validation
> and VERSION_1 negotiation on s390, I am under the impression that this is
> not the first time people have been confused by ambiguous wording that lacks
> a detailed example to clarify the context, or by the lack of a clear
> definition set out up front. I like your idea of keeping things consistent
> (conditional on feature presence); however, without a proper
> interpretation of how the spec is supposed to work, we are on a slippery
> slope towards inconsistency.
> 
> On 7/28/2022 12:36 AM, Jason Wang wrote:
> 
>         And looking at:
> 
>         "The mac address field always exists (though is only valid if
>         VIRTIO_NET_F_MAC is set), and status only exists if VIRTIO_NET_F_STATUS
>         is set."
> 
>         It appears to me there's a border line set between "exist" and "valid".
>         If I understand the spec wording correctly, a spec-conforming device
>         implementation may or may not offer valid status value in the config
>         space when VIRTIO_NET_F_STATUS is offered, but before the feature is
>         negotiated.
> 
>     That's not what I read, maybe Michael can clarify this.
> 
> 
> 
> Jason and I also find that the normative statements below conflict with each other.
> 
>         "The device MUST allow reading of any device-specific configuration
>         field before FEATURES_OK is set by the driver. This includes fields
>         which are conditional on feature bits, as long as those feature bits are
>         offered by the device."


So I proposed this back in April:

https://lists.oasis-open.org/archives/virtio-comment/202201/msg00068.html

I intended this for 1.2 but it quickly became clear it won't make it
in time. Working on reviving the proposal and addressing the comments.




> 
>     ...
> 
>         And also there are special cases where the read of specific
>         configuration space field MUST be deferred to until FEATURES_OK is set:
> 
>         "If the VIRTIO_BLK_F_CONFIG_WCE feature is negotiated, the cache mode
>         can be read or set through the writeback field. 0 corresponds to a
>         writethrough cache, 1 to a writeback cache. The cache mode after reset
>         can be either writeback or writethrough. The actual mode can be
>         determined by reading writeback after feature negotiation."
>         "The driver MUST NOT read writeback before setting the FEATURES_OK
>         device status bit."
> 
>     This seems to conflict with the normatives I quoted above, and I don't
>     get why we need this.
> 
> 
> 
> Thanks,
> -Siwei


I take the last one to mean that writeback is special.
I am not sure why it should be. Paolo, you proposed this text; could
you comment, please?

Thanks!

-- 
MST
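
For illustration only (not part of the original message): a minimal sketch contrasting the two readings of the spec discussed above. valid_if_offered() models the view that a conditional config field is valid as soon as the device offers the feature bit; valid_if_negotiated() models the more conservative view that it is only trustworthy after the driver has acked the bit and FEATURES_OK is set. Both function names are hypothetical; the bit numbers and the status value are the ones defined by the virtio spec.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values as defined by the virtio spec. */
#define VIRTIO_CONFIG_S_FEATURES_OK 8
#define VIRTIO_NET_F_MAC            5
#define VIRTIO_NET_F_STATUS         16
#define VIRTIO_NET_F_MQ             22

/* Reading 1: the field is valid once the device offers the feature. */
static bool valid_if_offered(uint64_t device_features, unsigned int bit)
{
	return ((device_features >> bit) & 1) != 0;
}

/* Reading 2: the field is valid only after the feature is negotiated. */
static bool valid_if_negotiated(uint64_t driver_features, uint8_t status,
				unsigned int bit)
{
	if (!(status & VIRTIO_CONFIG_S_FEATURES_OK))
		return false;

	return ((driver_features >> bit) & 1) != 0;
}
```

For a device offering _MQ with status 0, the first reading treats max_virtqueue_pairs as valid and the second does not, which is the disagreement the spec clarification is meant to settle.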


^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [PATCH V3 4/6] vDPA: !FEATURES_OK should not block querying device config space
  2022-07-28  7:08             ` Si-Wei Liu
  2022-07-28  7:36               ` Jason Wang
@ 2022-07-28 11:35               ` Michael S. Tsirkin
  2022-07-28 22:12                 ` Si-Wei Liu
  1 sibling, 1 reply; 113+ messages in thread
From: Michael S. Tsirkin @ 2022-07-28 11:35 UTC (permalink / raw)
  To: Si-Wei Liu
  Cc: Jason Wang, Zhu, Lingshan, Parav Pandit, Eli Cohen, netdev,
	xieyongji, gautam.dawar, virtualization

On Thu, Jul 28, 2022 at 12:08:53AM -0700, Si-Wei Liu wrote:
> 
> 
> On 7/27/2022 7:06 PM, Jason Wang wrote:
> > 
> > On 2022/7/28 08:56, Si-Wei Liu wrote:
> > > 
> > > 
> > > On 7/27/2022 4:47 AM, Zhu, Lingshan wrote:
> > > > 
> > > > 
> > > > On 7/27/2022 5:43 PM, Si-Wei Liu wrote:
> > > > > Sorry to chime in late in the game. For some reason I
> > > > > couldn't get to most emails for this discussion (I only
> > > > > subscribed to the virtualization list), while I was taking
> > > > > off amongst the past few weeks.
> > > > > 
> > > > > It looks to me this patch is incomplete. Noted down the way
> > > > > in vdpa_dev_net_config_fill(), we have the following:
> > > > >          features = vdev->config->get_driver_features(vdev);
> > > > >          if (nla_put_u64_64bit(msg,
> > > > > VDPA_ATTR_DEV_NEGOTIATED_FEATURES, features,
> > > > >                                VDPA_ATTR_PAD))
> > > > >                  return -EMSGSIZE;
> > > > > 
> > > > > Making call to .get_driver_features() doesn't make sense
> > > > > when feature negotiation isn't complete. Neither should
> > > > > present negotiated_features to userspace before negotiation
> > > > > is done.
> > > > > 
> > > > > Similarly, max_vqp through vdpa_dev_net_mq_config_fill()
> > > > > probably should not show before negotiation is done - it
> > > > > depends on driver features negotiated.
> > > > I have another patch in this series introduces device_features
> > > > and will report device_features to the userspace even features
> > > > negotiation not done. Because the spec says we should allow
> > > > driver access the config space before FEATURES_OK.
> > > The config space can be accessed by guest before features_ok doesn't
> > > necessarily mean the value is valid.
> > 
> > 
> > It's valid as long as the device offers the feature:
> > 
> > "The device MUST allow reading of any device-specific configuration
> > field before FEATURES_OK is set by the driver. This includes fields
> > which are conditional on feature bits, as long as those feature bits are
> > offered by the device."
> I guess this statement only conveys that the field in config space can be
> read before FEATURES_OK is set, though it does not *explicitly* states the
> validity of field.
> 
> And looking at:
> 
> "The mac address field always exists (though is only valid if
> VIRTIO_NET_F_MAC is set), and status only exists if VIRTIO_NET_F_STATUS is
> set."
> 
> It appears to me there's a border line set between "exist" and "valid". If I
> understand the spec wording correctly, a spec-conforming device
> implementation may or may not offer valid status value in the config space
> when VIRTIO_NET_F_STATUS is offered, but before the feature is negotiated.
> On the other hand, config space should contain valid mac address the moment
> VIRTIO_NET_F_MAC feature is offered, regardless being negotiated or not. By
> that, there seems to be leeway for the device implementation to decide when
> config space field may become valid, though for most of QEMU's software
> virtio devices, valid value is present to config space the very first moment
> when feature is offered.
> 
> "If the VIRTIO_NET_F_MAC feature bit is set, the configuration space mac
> entry indicates the “physical” address of the network card, otherwise the
> driver would typically generate a random local MAC address."
> "If the VIRTIO_NET_F_STATUS feature bit is negotiated, the link status comes
> from the bottom bit of status. Otherwise, the driver assumes it’s active."
> 
> And also there are special cases where the read of specific configuration
> space field MUST be deferred to until FEATURES_OK is set:
> 
> "If the VIRTIO_BLK_F_CONFIG_WCE feature is negotiated, the cache mode can be
> read or set through the writeback field. 0 corresponds to a writethrough
> cache, 1 to a writeback cache. The cache mode after reset can be either
> writeback or writethrough. The actual mode can be determined by reading
> writeback after feature negotiation."
> "The driver MUST NOT read writeback before setting the FEATURES_OK device
> status bit."
> "If VIRTIO_BLK_F_CONFIG_WCE is negotiated but VIRTIO_BLK_F_FLUSH is not, the
> device MUST initialize writeback to 0."
> 
> Since the spec doesn't explicitly mandate the validity of each config space
> field when feature of concern is offered, to be safer we'd have to live with
> odd device implementation. I know for sure QEMU software devices won't for
> 99% of these cases, but that's not what is currently defined in the spec.


Thanks for raising this subject. I started working on this in April:

https://lists.oasis-open.org/archives/virtio-comment/202201/msg00068.html

I am working now to address the comments.


> > 
> > 
> > > You may want to double check with Michael for what he quoted earlier:
> > > > Nope:
> > > > 
> > > > 2.5.1  Driver Requirements: Device Configuration Space
> > > > 
> > > > ...
> > > > 
> > > > For optional configuration space fields, the driver MUST check
> > > > that the corresponding feature is offered
> > > > before accessing that part of the configuration space.
> > > 
> > > and how many driver bugs taking wrong assumption of the validity of
> > > config space field without features_ok. I am not sure what use case
> > > you want to expose config register values before features_ok, if
> > > it's mostly for live migration I guess it's probably heading a wrong
> > > direction.
> > 
> > 
> > I guess it's not for migration.
> Then what's the other possible use case than live migration, were to expose
> config space values? Troubleshooting config space discrepancy between vDPA
> and the emulated virtio device in userspace? Or tracking changes in config
> space across feature negotiation, but for what? It'd be beneficial to the
> interface design if the specific use case can be clearly described...
> 
> 
> > For migration, a provision with the correct features/capability would be
> > sufficient.
> Right, that's what I thought too. It doesn't need to expose config space
> values, simply exporting all attributes for vdpa device creation will do the
> work.
> 
> -Siwei
> 
> > 
> > Thanks
> > 
> > 
> > > 
> > > 
> > > > > 
> > > > > 
> > > > > Last but not the least, this "vdpa dev config" command was
> > > > > not designed to display the real config space register
> > > > > values in the first place. Quoting the vdpa-dev(8) man page:
> > > > > 
> > > > > > vdpa dev config show - Show configuration of specific
> > > > > > device or all devices.
> > > > > > DEV - specifies the vdpa device to show its
> > > > > > configuration. If this argument is omitted all devices
> > > > > > configuration is listed.
> > > > > It doesn't say anything about configuration space or
> > > > > register values in config space. As long as it can convey
> > > > > the config attribute when instantiating vDPA device
> > > > > instance, and more importantly, the config can be easily
> > > > > imported from or exported to userspace tools when trying to
> > > > > reconstruct vdpa instance intact on destination host for
> > > > > live migration, IMHO in my personal interpretation it
> > > > > doesn't matter what the config space may present. It may be
> > > > > worth while adding a new debug command to expose the real
> > > > > register value, but that's another story.
> > > > I am not sure getting your points. vDPA now reports device
> > > > feature bits(device_features) and negotiated feature
> > > > bits(driver_features), and yes, the drivers features can be a
> > > > subset of the device features; and the vDPA device features can
> > > > be a subset of the management device features.
> > > What I said is after unblocking the conditional check, you'd have to
> > > handle the case for each of the vdpa attribute when feature
> > > negotiation is not yet done: basically the register values you got
> > > from config space via the vdpa_get_config_unlocked() call is not
> > > considered to be valid before features_ok (per-spec). Although in
> > > some case you may get sane value, such behavior is generally
> > > undefined. If you desire to show just the device_features alone
> > > without any config space field, which the device had advertised
> > > *before feature negotiation is complete*, that'll be fine. But looks
> > > to me this is not how patch has been implemented. Probably need some
> > > more work?
> > > 
> > > Regards,
> > > -Siwei
> > > 
> > > > > 
> > > > > Having said, please consider to drop the Fixes tag, as
> > > > > appears to me you're proposing a new feature rather than
> > > > > fixing a real issue.
> > > > it's a new feature to report the device feature bits than only
> > > > negotiated features, however this patch is a must, or it will
> > > > block the device feature bits reporting. but I agree, the fix
> > > > tag is not a must.
> > > > > 
> > > > > Thanks,
> > > > > -Siwei
> > > > > 
> > > > > On 7/1/2022 3:12 PM, Parav Pandit via Virtualization wrote:
> > > > > > > From: Zhu Lingshan<lingshan.zhu@intel.com>
> > > > > > > Sent: Friday, July 1, 2022 9:28 AM
> > > > > > > 
> > > > > > > Users may want to query the config space of a vDPA
> > > > > > > device, to choose a
> > > > > > > appropriate one for a certain guest. This means the
> > > > > > > users need to read the
> > > > > > > config space before FEATURES_OK, and the existence of config space
> > > > > > > contents does not depend on FEATURES_OK.
> > > > > > > 
> > > > > > > The spec says:
> > > > > > > The device MUST allow reading of any device-specific
> > > > > > > configuration field
> > > > > > > before FEATURES_OK is set by the driver. This
> > > > > > > includes fields which are
> > > > > > > conditional on feature bits, as long as those
> > > > > > > feature bits are offered by the
> > > > > > > device.
> > > > > > > 
> > > > > > > Fixes: 30ef7a8ac8a07 (vdpa: Read device
> > > > > > > configuration only if FEATURES_OK)
> > > > > > Fix is fine, but fixes tag needs correction described below.
> > > > > > 
> > > > > > Above commit id is 13 letters should be 12.
> > > > > > And
> > > > > > It should be in format
> > > > > > Fixes: 30ef7a8ac8a0 ("vdpa: Read device configuration
> > > > > > only if FEATURES_OK")
> > > > > > 
> > > > > > Please use checkpatch.pl script before posting the
> > > > > > patches to catch these errors.
> > > > > > There is a bot that looks at the fixes tag and
> > > > > > identifies the right kernel version to apply this fix.
> > > > > > 
> > > > > > > Signed-off-by: Zhu Lingshan<lingshan.zhu@intel.com>
> > > > > > > ---
> > > > > > >   drivers/vdpa/vdpa.c | 8 --------
> > > > > > >   1 file changed, 8 deletions(-)
> > > > > > > 
> > > > > > > > diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c
> > > > > > > > index 9b0e39b2f022..d76b22b2f7ae 100644
> > > > > > > > --- a/drivers/vdpa/vdpa.c
> > > > > > > > +++ b/drivers/vdpa/vdpa.c
> > > > > > > > @@ -851,17 +851,9 @@ vdpa_dev_config_fill(struct vdpa_device *vdev, struct sk_buff *msg, u32 portid,
> > > > > > > >  {
> > > > > > >       u32 device_id;
> > > > > > >       void *hdr;
> > > > > > > -    u8 status;
> > > > > > >       int err;
> > > > > > > 
> > > > > > >       down_read(&vdev->cf_lock);
> > > > > > > -    status = vdev->config->get_status(vdev);
> > > > > > > -    if (!(status & VIRTIO_CONFIG_S_FEATURES_OK)) {
> > > > > > > > -        NL_SET_ERR_MSG_MOD(extack, "Features negotiation not completed");
> > > > > > > -        err = -EAGAIN;
> > > > > > > -        goto out;
> > > > > > > -    }
> > > > > > > -
> > > > > > >       hdr = genlmsg_put(msg, portid, seq, &vdpa_nl_family, flags,
> > > > > > >                 VDPA_CMD_DEV_CONFIG_GET);
> > > > > > >       if (!hdr) {
> > > > > > > -- 
> > > > > > > 2.31.1
> > > > > > _______________________________________________
> > > > > > Virtualization mailing list
> > > > > > Virtualization@lists.linux-foundation.org
> > > > > > > https://lists.linuxfoundation.org/mailman/listinfo/virtualization
> > > > > 
> > > > > 
> > > > 
> > > 
> > 


^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [PATCH V3 5/6] vDPA: answer num of queue pairs = 1 to userspace when VIRTIO_NET_F_MQ == 0
  2022-07-28  2:44                                         ` Zhu, Lingshan
@ 2022-07-28 21:54                                           ` Si-Wei Liu
  2022-07-29  2:07                                             ` Zhu, Lingshan
  0 siblings, 1 reply; 113+ messages in thread
From: Si-Wei Liu @ 2022-07-28 21:54 UTC (permalink / raw)
  To: Zhu, Lingshan, Michael S. Tsirkin
  Cc: Parav Pandit, netdev, virtualization, xieyongji, gautam.dawar



On 7/27/2022 7:44 PM, Zhu, Lingshan wrote:
>
>
> On 7/28/2022 9:41 AM, Si-Wei Liu wrote:
>>
>>
>> On 7/27/2022 4:54 AM, Zhu, Lingshan wrote:
>>>
>>>
>>> On 7/27/2022 6:09 PM, Si-Wei Liu wrote:
>>>>
>>>>
>>>> On 7/27/2022 2:01 AM, Michael S. Tsirkin wrote:
>>>>> On Wed, Jul 27, 2022 at 12:50:33AM -0700, Si-Wei Liu wrote:
>>>>>>
>>>>>> On 7/26/2022 11:01 PM, Michael S. Tsirkin wrote:
>>>>>>> On Wed, Jul 27, 2022 at 03:47:35AM +0000, Parav Pandit wrote:
>>>>>>>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
>>>>>>>>> Sent: Tuesday, July 26, 2022 10:53 PM
>>>>>>>>>
>>>>>>>>> On 7/27/2022 10:17 AM, Parav Pandit wrote:
>>>>>>>>>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
>>>>>>>>>>> Sent: Tuesday, July 26, 2022 10:15 PM
>>>>>>>>>>>
>>>>>>>>>>> On 7/26/2022 11:56 PM, Parav Pandit wrote:
>>>>>>>>>>>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
>>>>>>>>>>>>> Sent: Tuesday, July 12, 2022 11:46 PM
>>>>>>>>>>>>>> When the user space which invokes netlink commands, 
>>>>>>>>>>>>>> detects that
>>>>>>>>>>> _MQ
>>>>>>>>>>>>> is not supported, hence it takes max_queue_pair = 1 by 
>>>>>>>>>>>>> itself.
>>>>>>>>>>>>> I think the kernel module have all necessary information 
>>>>>>>>>>>>> and it is
>>>>>>>>>>>>> the only one which have precise information of a device, 
>>>>>>>>>>>>> so it
>>>>>>>>>>>>> should answer precisely than let the user space guess. The 
>>>>>>>>>>>>> kernel
>>>>>>>>>>>>> module should be reliable than stay silent, leave the 
>>>>>>>>>>>>> question to
>>>>>>>>>>>>> the user space
>>>>>>>>>>> tool.
>>>>>>>>>>>> Kernel is reliable. It doesn’t expose a config space field 
>>>>>>>>>>>> if the
>>>>>>>>>>>> field doesn’t
>>>>>>>>>>> exist regardless of field should have default or no default.
>>>>>>>>>>> so when you know it is one queue pair, you should answer 
>>>>>>>>>>> one, not try
>>>>>>>>>>> to guess.
>>>>>>>>>>>> User space should not guess either. User space gets to see 
>>>>>>>>>>>> if _MQ
>>>>>>>>>>> present/not present. If _MQ present than get reliable data 
>>>>>>>>>>> from kernel.
>>>>>>>>>>>> If _MQ not present, it means this device has one VQ pair.
>>>>>>>>>>> it is still a guess, right? And all user space tools 
>>>>>>>>>>> implemented this
>>>>>>>>>>> feature need to guess
>>>>>>>>>> No. it is not a guess.
>>>>>>>>>> It is explicitly checking the _MQ feature and deriving the 
>>>>>>>>>> value.
>>>>>>>>>> The code you proposed will be present in the user space.
>>>>>>>>>> It will be uniform for _MQ and 10 other features that are 
>>>>>>>>>> present now and
>>>>>>>>> in the future.
>>>>>>>>> MQ and other features like RSS are different. If there is no 
>>>>>>>>> _RSS_XX, there
>>>>>>>>> are no attributes like max_rss_key_size, and there is not a 
>>>>>>>>> default value.
>>>>>>>>> But for MQ, we know it has to be 1 without _MQ.
>>>>>>>> "we" = user space.
>>>>>>>> To keep the consistency among all the config space fields.
>>>>>>> Actually I looked at the code some more and I'm puzzled:
>>>>>>>
>>>>>>>
>>>>>>>     struct virtio_net_config config = {};
>>>>>>>     u64 features;
>>>>>>>     u16 val_u16;
>>>>>>>
>>>>>>>     vdpa_get_config_unlocked(vdev, 0, &config, sizeof(config));
>>>>>>>
>>>>>>>     if (nla_put(msg, VDPA_ATTR_DEV_NET_CFG_MACADDR, 
>>>>>>> sizeof(config.mac),
>>>>>>>             config.mac))
>>>>>>>         return -EMSGSIZE;
>>>>>>>
>>>>>>>
>>>>>>> Mac returned even without VIRTIO_NET_F_MAC
>>>>>>>
>>>>>>>
>>>>>>>     val_u16 = le16_to_cpu(config.status);
>>>>>>>     if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_STATUS, val_u16))
>>>>>>>         return -EMSGSIZE;
>>>>>>>
>>>>>>>
>>>>>>> status returned even without VIRTIO_NET_F_STATUS
>>>>>>>
>>>>>>>     val_u16 = le16_to_cpu(config.mtu);
>>>>>>>     if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MTU, val_u16))
>>>>>>>         return -EMSGSIZE;
>>>>>>>
>>>>>>>
>>>>>>> MTU returned even without VIRTIO_NET_F_MTU
>>>>>>>
>>>>>>>
>>>>>>> What's going on here?
>>>>>>>
>>>>>>>
>>>>>> I guess this is spec thing (historical debt), I vaguely recall 
>>>>>> these fields
>>>>>> are always present in config space regardless the existence of 
>>>>>> corresponding
>>>>>> feature bit.
>>>>>>
>>>>>> -Siwei
>>>>> Nope:
>>>>>
>>>>> 2.5.1  Driver Requirements: Device Configuration Space
>>>>>
>>>>> ...
>>>>>
>>>>> For optional configuration space fields, the driver MUST check 
>>>>> that the corresponding feature is offered
>>>>> before accessing that part of the configuration space.
>>>> Well, this is driver side of requirement. As this interface is for 
>>>> host admin tool to query or configure vdpa device, we don't have to 
>>>> wait until feature negotiation is done on guest driver to extract 
>>>> vdpa attributes/parameters, say if we want to replicate another 
>>>> vdpa device with the same config on migration destination. I think 
>>>> what may need to be fixed is to move off from using 
>>>> .vdpa_get_config_unlocked() which depends on feature negotiation. 
>>>> And/or expose config space register values through another set of 
>>>> attributes.
>>> Yes, we don't have to wait for FEATURES_OK. In another patch in this 
>>> series, I have added a new netlink attr to report the device 
>>> features, and removed the blocker. So the LM orchestration SW can 
>>> query the device features of the devices at the destination cluster, 
>>> and pick a proper one, even mask out some features to meet the LM 
>>> requirements.
>> For that end, you'd need to move off from using 
>> vdpa_get_config_unlocked() which depends on feature negotiation. 
>> Since this would slightly change the original semantics of each field 
>> that "vdpa dev config" shows, it probably needs another netlink 
>> command and new uAPI.
> why not show both device_features and driver_features in "vdpa dev 
> config show"?
>
As I requested in the other email, I'd like to see example output of
the proposed 'vdpa dev config ...' command at each phase of feature
negotiation, along with the specific use case (motivation) for that
output. I am having difficulty matching what you say you want to do
with the patch as posted.

-Siwei

>>
>> -Siwei
>>
>>
>>>
>>> Thanks,
>>> Zhu Lingshan
>>>> -Siwei
>>>>
>>>>
>>>>
>>>>
>>>
>>
>
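
The exchange above turns on two rules: optional virtio-net config
fields should be reported only when the matching feature bit is
present, and the queue-pair count has a well-defined value of 1 when
VIRTIO_NET_F_MQ is absent. A minimal userspace sketch of that gating
follows. The feature bit numbers come from the virtio spec; struct
net_cfg_view, struct net_cfg_report and build_report() are hypothetical
names made up for illustration and are not part of the kernel or
iproute2. Whether "features" should be the device-offered set or the
negotiated set is the question the thread leaves open, so the sketch
simply takes it as a parameter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Feature bit numbers from the virtio-net section of the virtio spec. */
#define VIRTIO_NET_F_MTU     3
#define VIRTIO_NET_F_MAC     5
#define VIRTIO_NET_F_STATUS  16
#define VIRTIO_NET_F_MQ      22

/* Hypothetical, already byte-swapped view of the config fields discussed. */
struct net_cfg_view {
	uint8_t  mac[6];
	uint16_t status;
	uint16_t max_virtqueue_pairs;
	uint16_t mtu;
};

/* Hypothetical report: optional fields carry a flag saying whether they
 * are reported at all; max_vqp is always reported. */
struct net_cfg_report {
	bool     has_mac, has_status, has_mtu;
	uint8_t  mac[6];
	uint16_t status;
	uint16_t mtu;
	uint16_t max_vqp;
};

static bool has_feature(uint64_t features, unsigned int bit)
{
	assert(bit < 64);
	return (features >> bit) & 1;
}

static struct net_cfg_report build_report(uint64_t features,
					  const struct net_cfg_view *cfg)
{
	struct net_cfg_report r = {0};

	/* Report an optional field only if its feature bit is present. */
	if (has_feature(features, VIRTIO_NET_F_MAC)) {
		r.has_mac = true;
		memcpy(r.mac, cfg->mac, sizeof(r.mac));
	}
	if (has_feature(features, VIRTIO_NET_F_STATUS)) {
		r.has_status = true;
		r.status = cfg->status;
	}
	if (has_feature(features, VIRTIO_NET_F_MTU)) {
		r.has_mtu = true;
		r.mtu = cfg->mtu;
	}

	/* Without _MQ the device has exactly one queue pair, so answer 1
	 * instead of trusting the config field. */
	r.max_vqp = has_feature(features, VIRTIO_NET_F_MQ) ?
		    cfg->max_virtqueue_pairs : 1;
	return r;
}
```

Called with features == 0, build_report() returns no MAC, status or MTU
and a max_vqp of 1; with only the _MQ bit set it returns whatever
max_virtqueue_pairs holds. Putting the default in one place is the
argument made for answering from the kernel; the counter-argument in
the thread is that userspace can apply the same rule itself.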



* Re: [PATCH V3 4/6] vDPA: !FEATURES_OK should not block querying device config space
  2022-07-28 11:35               ` [PATCH V3 4/6] vDPA: !FEATURES_OK should not block querying device config space Michael S. Tsirkin
@ 2022-07-28 22:12                 ` Si-Wei Liu
  0 siblings, 0 replies; 113+ messages in thread
From: Si-Wei Liu @ 2022-07-28 22:12 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Jason Wang, Zhu, Lingshan, Parav Pandit, Eli Cohen, netdev,
	xieyongji, gautam.dawar, virtualization



On 7/28/2022 4:35 AM, Michael S. Tsirkin wrote:
> On Thu, Jul 28, 2022 at 12:08:53AM -0700, Si-Wei Liu wrote:
>>
>> On 7/27/2022 7:06 PM, Jason Wang wrote:
>>> 在 2022/7/28 08:56, Si-Wei Liu 写道:
>>>>
>>>> On 7/27/2022 4:47 AM, Zhu, Lingshan wrote:
>>>>>
>>>>> On 7/27/2022 5:43 PM, Si-Wei Liu wrote:
>>>>>> Sorry to chime in late in the game. For some reason I
>>>>>> couldn't get to most emails for this discussion (I only
>>>>>> subscribed to the virtualization list), while I was taking
>>>>>> off amongst the past few weeks.
>>>>>>
>>>>>> It looks to me this patch is incomplete. Noted down the way
>>>>>> in vdpa_dev_net_config_fill(), we have the following:
>>>>>>           features = vdev->config->get_driver_features(vdev);
>>>>>>           if (nla_put_u64_64bit(msg,
>>>>>> VDPA_ATTR_DEV_NEGOTIATED_FEATURES, features,
>>>>>>                                 VDPA_ATTR_PAD))
>>>>>>                   return -EMSGSIZE;
>>>>>>
>>>>>> Making call to .get_driver_features() doesn't make sense
>>>>>> when feature negotiation isn't complete. Neither should
>>>>>> present negotiated_features to userspace before negotiation
>>>>>> is done.
>>>>>>
>>>>>> Similarly, max_vqp through vdpa_dev_net_mq_config_fill()
>>>>>> probably should not show before negotiation is done - it
>>>>>> depends on driver features negotiated.
>>>>> I have another patch in this series introduces device_features
>>>>> and will report device_features to the userspace even features
>>>>> negotiation not done. Because the spec says we should allow
>>>>> driver access the config space before FEATURES_OK.
>>>> The config space can be accessed by guest before features_ok doesn't
>>>> necessarily mean the value is valid.
>>>
>>> It's valid as long as the device offers the feature:
>>>
>>> "The device MUST allow reading of any device-specific configuration
>>> field before FEATURES_OK is set by the driver. This includes fields
>>> which are conditional on feature bits, as long as those feature bits are
>>> offered by the device."
>> I guess this statement only conveys that the field in config space can be
>> read before FEATURES_OK is set, though it does not *explicitly* states the
>> validity of field.
>>
>> And looking at:
>>
>> "The mac address field always exists (though is only valid if
>> VIRTIO_NET_F_MAC is set), and status only exists if VIRTIO_NET_F_STATUS is
>> set."
>>
>> It appears to me there's a border line set between "exist" and "valid". If I
>> understand the spec wording correctly, a spec-conforming device
>> implementation may or may not offer valid status value in the config space
>> when VIRTIO_NET_F_STATUS is offered, but before the feature is negotiated.
>> On the other hand, config space should contain valid mac address the moment
>> VIRTIO_NET_F_MAC feature is offered, regardless being negotiated or not. By
>> that, there seems to be leeway for the device implementation to decide when
>> config space field may become valid, though for most of QEMU's software
>> virtio devices, valid value is present to config space the very first moment
>> when feature is offered.
>>
>> "If the VIRTIO_NET_F_MAC feature bit is set, the configuration space mac
>> entry indicates the “physical” address of the network card, otherwise the
>> driver would typically generate a random local MAC address."
>> "If the VIRTIO_NET_F_STATUS feature bit is negotiated, the link status comes
>> from the bottom bit of status. Otherwise, the driver assumes it’s active."
>>
>> And also there are special cases where the read of specific configuration
>> space field MUST be deferred to until FEATURES_OK is set:
>>
>> "If the VIRTIO_BLK_F_CONFIG_WCE feature is negotiated, the cache mode can be
>> read or set through the writeback field. 0 corresponds to a writethrough
>> cache, 1 to a writeback cache. The cache mode after reset can be either
>> writeback or writethrough. The actual mode can be determined by reading
>> writeback after feature negotiation."
>> "The driver MUST NOT read writeback before setting the FEATURES_OK device
>> status bit."
>> "If VIRTIO_BLK_F_CONFIG_WCE is negotiated but VIRTIO_BLK_F_FLUSH is not, the
>> device MUST initialize writeback to 0."
>>
>> Since the spec doesn't explicitly mandate the validity of each config space
>> field when feature of concern is offered, to be safer we'd have to live with
>> odd device implementation. I know for sure QEMU software devices won't for
>> 99% of these cases, but that's not what is currently defined in the spec.
>
> Thanks for raising this subject. I started working on this in April:
>
> https://urldefense.com/v3/__https://lists.oasis-open.org/archives/virtio-comment/202201/msg00068.html__;!!ACWV5N9M2RV99hQ!Os6QE_RJokx7Us9y7-5-ByVVLuy3oLuPodAdZWxwJw_aNkJY0o0H7691FI9aYSTRLVieASUD_eOu$
>
> working now to address the comments.
Great, thank you very much!

Is there a link to the latest draft that reflects the changes to date?
The one above, with iterative feature negotiation, seemed to be getting
some resistance because of backward incompatibility with the older
spec. Please keep me in the loop when you have the draft ready; I am
not on the virtio-comment list.

Thanks,
-Siwei
>
>
>>>
>>>> You may want to double check with Michael for what he quoted earlier:
>>>>> Nope:
>>>>>
>>>>> 2.5.1  Driver Requirements: Device Configuration Space
>>>>>
>>>>> ...
>>>>>
>>>>> For optional configuration space fields, the driver MUST check
>>>>> that the corresponding feature is offered
>>>>> before accessing that part of the configuration space.
>>>> and how many driver bugs taking wrong assumption of the validity of
>>>> config space field without features_ok. I am not sure what use case
>>>> you want to expose config register values for before features_ok, if
>>>> it's mostly for live migration I guess it's probably heading a wrong
>>>> direction.
>>>
>>> I guess it's not for migration.
>> Then what's the other possible use case than live migration, were to expose
>> config space values? Troubleshooting config space discrepancy between vDPA
>> and the emulated virtio device in userspace? Or tracking changes in config
>> space across feature negotiation, but for what? It'd be beneficial to the
>> interface design if the specific use case can be clearly described...
>>
>>
>>> For migration, a provision with the correct features/capability would be
>>> sufficient.
>> Right, that's what I thought too. It doesn't need to expose config space
>> values, simply exporting all attributes for vdpa device creation will do the
>> work.
>>
>> -Siwei
>>
>>> Thanks
>>>
>>>
>>>>
>>>>>>
>>>>>> Last but not the least, this "vdpa dev config" command was
>>>>>> not designed to display the real config space register
>>>>>> values in the first place. Quoting the vdpa-dev(8) man page:
>>>>>>
>>>>>>> vdpa dev config show - Show configuration of specific
>>>>>>> device or all devices.
>>>>>>> DEV - specifies the vdpa device to show its
>>>>>>> configuration. If this argument is omitted all devices
>>>>>>> configuration is listed.
>>>>>> It doesn't say anything about configuration space or
>>>>>> register values in config space. As long as it can convey
>>>>>> the config attribute when instantiating vDPA device
>>>>>> instance, and more importantly, the config can be easily
>>>>>> imported from or exported to userspace tools when trying to
>>>>>> reconstruct vdpa instance intact on destination host for
>>>>>> live migration, IMHO in my personal interpretation it
>>>>>> doesn't matter what the config space may present. It may be
>>>>>> worth while adding a new debug command to expose the real
>>>>>> register value, but that's another story.
>>>>> I am not sure getting your points. vDPA now reports device
>>>>> feature bits(device_features) and negotiated feature
>>>>> bits(driver_features), and yes, the drivers features can be a
>>>>> subset of the device features; and the vDPA device features can
>>>>> be a subset of the management device features.
>>>> What I said is after unblocking the conditional check, you'd have to
>>>> handle the case for each of the vdpa attribute when feature
>>>> negotiation is not yet done: basically the register values you got
>>>> from config space via the vdpa_get_config_unlocked() call is not
>>>> considered to be valid before features_ok (per-spec). Although in
>>>> some case you may get sane value, such behavior is generally
>>>> undefined. If you desire to show just the device_features alone
>>>> without any config space field, which the device had advertised
>>>> *before feature negotiation is complete*, that'll be fine. But looks
>>>> to me this is not how patch has been implemented. Probably need some
>>>> more work?
>>>>
>>>> Regards,
>>>> -Siwei
>>>>
>>>>>> Having said, please consider to drop the Fixes tag, as
>>>>>> appears to me you're proposing a new feature rather than
>>>>>> fixing a real issue.
>>>>> it's a new feature to report the device feature bits than only
>>>>> negotiated features, however this patch is a must, or it will
>>>>> block the device feature bits reporting. but I agree, the fix
>>>>> tag is not a must.
>>>>>> Thanks,
>>>>>> -Siwei
>>>>>>
>>>>>> On 7/1/2022 3:12 PM, Parav Pandit via Virtualization wrote:
>>>>>>>> From: Zhu Lingshan<lingshan.zhu@intel.com>
>>>>>>>> Sent: Friday, July 1, 2022 9:28 AM
>>>>>>>>
>>>>>>>> Users may want to query the config space of a vDPA
>>>>>>>> device, to choose a
>>>>>>>> appropriate one for a certain guest. This means the
>>>>>>>> users need to read the
>>>>>>>> config space before FEATURES_OK, and the existence of config space
>>>>>>>> contents does not depend on FEATURES_OK.
>>>>>>>>
>>>>>>>> The spec says:
>>>>>>>> The device MUST allow reading of any device-specific
>>>>>>>> configuration field
>>>>>>>> before FEATURES_OK is set by the driver. This
>>>>>>>> includes fields which are
>>>>>>>> conditional on feature bits, as long as those
>>>>>>>> feature bits are offered by the
>>>>>>>> device.
>>>>>>>>
>>>>>>>> Fixes: 30ef7a8ac8a07 (vdpa: Read device
>>>>>>>> configuration only if FEATURES_OK)
>>>>>>> Fix is fine, but fixes tag needs correction described below.
>>>>>>>
>>>>>>> Above commit id is 13 letters should be 12.
>>>>>>> And
>>>>>>> It should be in format
>>>>>>> Fixes: 30ef7a8ac8a0 ("vdpa: Read device configuration
>>>>>>> only if FEATURES_OK")
>>>>>>>
>>>>>>> Please use checkpatch.pl script before posting the
>>>>>>> patches to catch these errors.
>>>>>>> There is a bot that looks at the fixes tag and
>>>>>>> identifies the right kernel version to apply this fix.
>>>>>>>
>>>>>>>> Signed-off-by: Zhu Lingshan<lingshan.zhu@intel.com>
>>>>>>>> ---
>>>>>>>>    drivers/vdpa/vdpa.c | 8 --------
>>>>>>>>    1 file changed, 8 deletions(-)
>>>>>>>>
>>>>>>>> diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c index
>>>>>>>> 9b0e39b2f022..d76b22b2f7ae 100644
>>>>>>>> --- a/drivers/vdpa/vdpa.c
>>>>>>>> +++ b/drivers/vdpa/vdpa.c
>>>>>>>> @@ -851,17 +851,9 @@ vdpa_dev_config_fill(struct vdpa_device *vdev,
>>>>>>>> struct sk_buff *msg, u32 portid,  {
>>>>>>>>        u32 device_id;
>>>>>>>>        void *hdr;
>>>>>>>> -    u8 status;
>>>>>>>>        int err;
>>>>>>>>
>>>>>>>>        down_read(&vdev->cf_lock);
>>>>>>>> -    status = vdev->config->get_status(vdev);
>>>>>>>> -    if (!(status & VIRTIO_CONFIG_S_FEATURES_OK)) {
>>>>>>>> -        NL_SET_ERR_MSG_MOD(extack, "Features negotiation not
>>>>>>>> completed");
>>>>>>>> -        err = -EAGAIN;
>>>>>>>> -        goto out;
>>>>>>>> -    }
>>>>>>>> -
>>>>>>>>        hdr = genlmsg_put(msg, portid, seq, &vdpa_nl_family, flags,
>>>>>>>>                  VDPA_CMD_DEV_CONFIG_GET);
>>>>>>>>        if (!hdr) {
>>>>>>>> -- 
>>>>>>>> 2.31.1
>>>>>>



* Re: [PATCH V3 5/6] vDPA: answer num of queue pairs = 1 to userspace when VIRTIO_NET_F_MQ == 0
  2022-07-28 21:54                                           ` Si-Wei Liu
@ 2022-07-29  2:07                                             ` Zhu, Lingshan
  0 siblings, 0 replies; 113+ messages in thread
From: Zhu, Lingshan @ 2022-07-29  2:07 UTC (permalink / raw)
  To: Si-Wei Liu, Michael S. Tsirkin
  Cc: Parav Pandit, netdev, virtualization, xieyongji, gautam.dawar



On 7/29/2022 5:54 AM, Si-Wei Liu wrote:
>
>
> On 7/27/2022 7:44 PM, Zhu, Lingshan wrote:
>>
>>
>> On 7/28/2022 9:41 AM, Si-Wei Liu wrote:
>>>
>>>
>>> On 7/27/2022 4:54 AM, Zhu, Lingshan wrote:
>>>>
>>>>
>>>> On 7/27/2022 6:09 PM, Si-Wei Liu wrote:
>>>>>
>>>>>
>>>>> On 7/27/2022 2:01 AM, Michael S. Tsirkin wrote:
>>>>>> On Wed, Jul 27, 2022 at 12:50:33AM -0700, Si-Wei Liu wrote:
>>>>>>>
>>>>>>> On 7/26/2022 11:01 PM, Michael S. Tsirkin wrote:
>>>>>>>> On Wed, Jul 27, 2022 at 03:47:35AM +0000, Parav Pandit wrote:
>>>>>>>>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
>>>>>>>>>> Sent: Tuesday, July 26, 2022 10:53 PM
>>>>>>>>>>
>>>>>>>>>> On 7/27/2022 10:17 AM, Parav Pandit wrote:
>>>>>>>>>>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
>>>>>>>>>>>> Sent: Tuesday, July 26, 2022 10:15 PM
>>>>>>>>>>>>
>>>>>>>>>>>> On 7/26/2022 11:56 PM, Parav Pandit wrote:
>>>>>>>>>>>>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
>>>>>>>>>>>>>> Sent: Tuesday, July 12, 2022 11:46 PM
>>>>>>>>>>>>>>> When the user space which invokes netlink commands, 
>>>>>>>>>>>>>>> detects that
>>>>>>>>>>>> _MQ
>>>>>>>>>>>>>> is not supported, hence it takes max_queue_pair = 1 by 
>>>>>>>>>>>>>> itself.
>>>>>>>>>>>>>> I think the kernel module have all necessary information 
>>>>>>>>>>>>>> and it is
>>>>>>>>>>>>>> the only one which have precise information of a device, 
>>>>>>>>>>>>>> so it
>>>>>>>>>>>>>> should answer precisely than let the user space guess. 
>>>>>>>>>>>>>> The kernel
>>>>>>>>>>>>>> module should be reliable than stay silent, leave the 
>>>>>>>>>>>>>> question to
>>>>>>>>>>>>>> the user space
>>>>>>>>>>>> tool.
>>>>>>>>>>>>> Kernel is reliable. It doesn’t expose a config space field 
>>>>>>>>>>>>> if the
>>>>>>>>>>>>> field doesn’t
>>>>>>>>>>>> exist regardless of field should have default or no default.
>>>>>>>>>>>> so when you know it is one queue pair, you should answer 
>>>>>>>>>>>> one, not try
>>>>>>>>>>>> to guess.
>>>>>>>>>>>>> User space should not guess either. User space gets to see 
>>>>>>>>>>>>> if _MQ
>>>>>>>>>>>> present/not present. If _MQ present than get reliable data 
>>>>>>>>>>>> from kernel.
>>>>>>>>>>>>> If _MQ not present, it means this device has one VQ pair.
>>>>>>>>>>>> it is still a guess, right? And all user space tools 
>>>>>>>>>>>> implemented this
>>>>>>>>>>>> feature need to guess
>>>>>>>>>>> No. it is not a guess.
>>>>>>>>>>> It is explicitly checking the _MQ feature and deriving the 
>>>>>>>>>>> value.
>>>>>>>>>>> The code you proposed will be present in the user space.
>>>>>>>>>>> It will be uniform for _MQ and 10 other features that are 
>>>>>>>>>>> present now and
>>>>>>>>>> in the future.
>>>>>>>>>> MQ and other features like RSS are different. If there is no 
>>>>>>>>>> _RSS_XX, there
>>>>>>>>>> are no attributes like max_rss_key_size, and there is not a 
>>>>>>>>>> default value.
>>>>>>>>>> But for MQ, we know it has to be 1 without _MQ.
>>>>>>>>> "we" = user space.
>>>>>>>>> To keep the consistency among all the config space fields.
>>>>>>>> Actually I looked at the code some more and I'm puzzled:
>>>>>>>>
>>>>>>>>
>>>>>>>>     struct virtio_net_config config = {};
>>>>>>>>     u64 features;
>>>>>>>>     u16 val_u16;
>>>>>>>>
>>>>>>>>     vdpa_get_config_unlocked(vdev, 0, &config, sizeof(config));
>>>>>>>>
>>>>>>>>     if (nla_put(msg, VDPA_ATTR_DEV_NET_CFG_MACADDR, 
>>>>>>>> sizeof(config.mac),
>>>>>>>>             config.mac))
>>>>>>>>         return -EMSGSIZE;
>>>>>>>>
>>>>>>>>
>>>>>>>> Mac returned even without VIRTIO_NET_F_MAC
>>>>>>>>
>>>>>>>>
>>>>>>>>     val_u16 = le16_to_cpu(config.status);
>>>>>>>>     if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_STATUS, val_u16))
>>>>>>>>         return -EMSGSIZE;
>>>>>>>>
>>>>>>>>
>>>>>>>> status returned even without VIRTIO_NET_F_STATUS
>>>>>>>>
>>>>>>>>     val_u16 = le16_to_cpu(config.mtu);
>>>>>>>>     if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MTU, val_u16))
>>>>>>>>         return -EMSGSIZE;
>>>>>>>>
>>>>>>>>
>>>>>>>> MTU returned even without VIRTIO_NET_F_MTU
>>>>>>>>
>>>>>>>>
>>>>>>>> What's going on here?
>>>>>>>>
>>>>>>>>
>>>>>>> I guess this is spec thing (historical debt), I vaguely recall 
>>>>>>> these fields
>>>>>>> are always present in config space regardless the existence of 
>>>>>>> corresponding
>>>>>>> feature bit.
>>>>>>>
>>>>>>> -Siwei
>>>>>> Nope:
>>>>>>
>>>>>> 2.5.1  Driver Requirements: Device Configuration Space
>>>>>>
>>>>>> ...
>>>>>>
>>>>>> For optional configuration space fields, the driver MUST check 
>>>>>> that the corresponding feature is offered
>>>>>> before accessing that part of the configuration space.
>>>>> Well, this is driver side of requirement. As this interface is for 
>>>>> host admin tool to query or configure vdpa device, we don't have 
>>>>> to wait until feature negotiation is done on guest driver to 
>>>>> extract vdpa attributes/parameters, say if we want to replicate 
>>>>> another vdpa device with the same config on migration destination. 
>>>>> I think what may need to be fixed is to move off from using 
>>>>> .vdpa_get_config_unlocked() which depends on feature negotiation. 
>>>>> And/or expose config space register values through another set of 
>>>>> attributes.
>>>> Yes, we don't have to wait for FEATURES_OK. In another patch in 
>>>> this series, I have added a new netlink attr to report the device 
>>>> features, and removed the blocker. So the LM orchestration SW can 
>>>> query the device features of the devices at the destination 
>>>> cluster, and pick a proper one, even mask out some features to meet 
>>>> the LM requirements.
>>> For that end, you'd need to move off from using 
>>> vdpa_get_config_unlocked() which depends on feature negotiation. 
>>> Since this would slightly change the original semantics of each 
>>> field that "vdpa dev config" shows, it probably needs another netlink 
>>> command and new uAPI.
>> why not show both device_features and driver_features in "vdpa dev 
>> config show"?
>>
> As I requested in the other email, I'd like to see the proposed 'vdpa 
> dev config ...' example output for various phases in feature 
> negotiation, and the specific use case (motivation) for this proposed 
> output. I am having difficulty to match what you want to do with the 
> patch posted.
The feature bits of a device don't depend on the negotiation phase, and
driver_features only has meaningful values once FEATURES_OK is set.

Thanks
>
> -Siwei
>
>>>
>>> -Siwei
>>>
>>>
>>>>
>>>> Thanks,
>>>> Zhu Lingshan
>>>>> -Siwei
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>
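
The reply above says that device features are phase-independent while
driver_features is meaningful only once FEATURES_OK is set. A minimal
userspace sketch of that rule follows. The status bit value is from the
virtio spec; struct feature_report and make_feature_report() are
hypothetical names for illustration, not existing kernel or iproute2
code, and the sketch does not claim to match what the posted patch
does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Device status bit from the virtio spec. */
#define VIRTIO_CONFIG_S_FEATURES_OK 8

/* Hypothetical result of a feature query at any negotiation phase. */
struct feature_report {
	uint64_t device_features;      /* always reported */
	bool     has_driver_features;  /* true only once FEATURES_OK is set */
	uint64_t driver_features;      /* valid only if has_driver_features */
};

static struct feature_report make_feature_report(uint8_t status,
						 uint64_t device_features,
						 uint64_t driver_features)
{
	struct feature_report r = { .device_features = device_features };

	if (status & VIRTIO_CONFIG_S_FEATURES_OK) {
		/* The negotiated set must be a subset of what was offered. */
		assert((driver_features & ~device_features) == 0);
		r.has_driver_features = true;
		r.driver_features = driver_features;
	}
	return r;
}
```

Before FEATURES_OK the report carries only device_features; afterwards
it also carries the negotiated subset. An explicit has_driver_features
flag, rather than a zero value, lets a tool distinguish "nothing
negotiated yet" from "negotiated an empty feature set".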



* Re: [PATCH V3 6/6] vDPA: fix 'cast to restricted le16' warnings in vdpa.c
  2022-07-01 13:28 ` [PATCH V3 6/6] vDPA: fix 'cast to restricted le16' warnings in vdpa.c Zhu Lingshan
  2022-07-01 22:18   ` Parav Pandit
@ 2022-07-29  8:53   ` Michael S. Tsirkin
  2022-07-29  9:07     ` Zhu, Lingshan
  1 sibling, 1 reply; 113+ messages in thread
From: Michael S. Tsirkin @ 2022-07-29  8:53 UTC (permalink / raw)
  To: Zhu Lingshan
  Cc: jasowang, virtualization, netdev, parav, xieyongji, gautam.dawar

On Fri, Jul 01, 2022 at 09:28:26PM +0800, Zhu Lingshan wrote:
> This commit fixes sparse warnings: cast to restricted __le16
> in function vdpa_dev_net_config_fill() and
> vdpa_fill_stats_rec()
> 
> Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
> ---
>  drivers/vdpa/vdpa.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c
> index 846dd37f3549..ed49fe46a79e 100644
> --- a/drivers/vdpa/vdpa.c
> +++ b/drivers/vdpa/vdpa.c
> @@ -825,11 +825,11 @@ static int vdpa_dev_net_config_fill(struct vdpa_device *vdev, struct sk_buff *ms
>  		    config.mac))
>  		return -EMSGSIZE;
>  
> -	val_u16 = le16_to_cpu(config.status);
> +	val_u16 = __virtio16_to_cpu(true, config.status);
>  	if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_STATUS, val_u16))
>  		return -EMSGSIZE;
>  
> -	val_u16 = le16_to_cpu(config.mtu);
> +	val_u16 = __virtio16_to_cpu(true, config.mtu);
>  	if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MTU, val_u16))
>  		return -EMSGSIZE;
>  

Wrong on BE platforms with legacy interface, isn't it?
We generally don't handle legacy properly in VDPA so it's
not a huge deal, but maybe add a comment at least?


> @@ -911,7 +911,7 @@ static int vdpa_fill_stats_rec(struct vdpa_device *vdev, struct sk_buff *msg,
>  	}
>  	vdpa_get_config_unlocked(vdev, 0, &config, sizeof(config));
>  
> -	max_vqp = le16_to_cpu(config.max_virtqueue_pairs);
> +	max_vqp = __virtio16_to_cpu(true, config.max_virtqueue_pairs);
>  	if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MAX_VQP, max_vqp))
>  		return -EMSGSIZE;
>  
> -- 
> 2.31.1
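
To illustrate the concern about BE platforms with a legacy interface:
passing true as the first argument always decodes the field as
little-endian, which is right for modern devices but not for a legacy
device on a big-endian guest, whose config fields are in guest-native
byte order. The sketch below is a userspace stand-in working on raw
bytes; virtio16_decode() is a hypothetical helper written for this
example and only mimics the little_endian switch of the kernel's
__virtio16_to_cpu().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Decode a 16-bit config field from its raw bytes as they sit in
 * config space. little_endian == true means "treat as LE". */
static uint16_t virtio16_decode(bool little_endian, const uint8_t raw[2])
{
	assert(raw != NULL);
	if (little_endian)
		return (uint16_t)(raw[0] | (raw[1] << 8));
	return (uint16_t)((raw[0] << 8) | raw[1]);
}
```

An MTU of 1500 (0x05dc) is stored as the bytes dc 05 by a modern device
and as 05 dc by a legacy device on a big-endian guest. Decoding the
latter with little_endian == true yields 0xdc05 (56325) instead of
1500, which is the case a comment in the patch would need to call out.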



* Re: [PATCH V3 6/6] vDPA: fix 'cast to restricted le16' warnings in vdpa.c
  2022-07-29  8:53   ` Michael S. Tsirkin
@ 2022-07-29  9:07     ` Zhu, Lingshan
  2022-07-29  9:17       ` Michael S. Tsirkin
  0 siblings, 1 reply; 113+ messages in thread
From: Zhu, Lingshan @ 2022-07-29  9:07 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: jasowang, virtualization, netdev, parav, xieyongji, gautam.dawar



On 7/29/2022 4:53 PM, Michael S. Tsirkin wrote:
> On Fri, Jul 01, 2022 at 09:28:26PM +0800, Zhu Lingshan wrote:
>> This commit fixes sparse warnings: cast to restricted __le16
>> in function vdpa_dev_net_config_fill() and
>> vdpa_fill_stats_rec()
>>
>> Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
>> ---
>>   drivers/vdpa/vdpa.c | 6 +++---
>>   1 file changed, 3 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c
>> index 846dd37f3549..ed49fe46a79e 100644
>> --- a/drivers/vdpa/vdpa.c
>> +++ b/drivers/vdpa/vdpa.c
>> @@ -825,11 +825,11 @@ static int vdpa_dev_net_config_fill(struct vdpa_device *vdev, struct sk_buff *ms
>>   		    config.mac))
>>   		return -EMSGSIZE;
>>   
>> -	val_u16 = le16_to_cpu(config.status);
>> +	val_u16 = __virtio16_to_cpu(true, config.status);
>>   	if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_STATUS, val_u16))
>>   		return -EMSGSIZE;
>>   
>> -	val_u16 = le16_to_cpu(config.mtu);
>> +	val_u16 = __virtio16_to_cpu(true, config.mtu);
>>   	if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MTU, val_u16))
>>   		return -EMSGSIZE;
>>   
> Wrong on BE platforms with legacy interface, isn't it?
> We generally don't handle legacy properly in VDPA so it's
> not a huge deal, but maybe add a comment at least?
Sure, I can add a comment here: this is for modern devices only.

Thanks,
Zhu Lingshan
>
>
>> @@ -911,7 +911,7 @@ static int vdpa_fill_stats_rec(struct vdpa_device *vdev, struct sk_buff *msg,
>>   	}
>>   	vdpa_get_config_unlocked(vdev, 0, &config, sizeof(config));
>>   
>> -	max_vqp = le16_to_cpu(config.max_virtqueue_pairs);
>> +	max_vqp = __virtio16_to_cpu(true, config.max_virtqueue_pairs);
>>   	if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MAX_VQP, max_vqp))
>>   		return -EMSGSIZE;
>>   
>> -- 
>> 2.31.1



* Re: [PATCH V3 6/6] vDPA: fix 'cast to restricted le16' warnings in vdpa.c
  2022-07-29  9:07     ` Zhu, Lingshan
@ 2022-07-29  9:17       ` Michael S. Tsirkin
  2022-07-29  9:20         ` Zhu, Lingshan
  0 siblings, 1 reply; 113+ messages in thread
From: Michael S. Tsirkin @ 2022-07-29  9:17 UTC (permalink / raw)
  To: Zhu, Lingshan
  Cc: jasowang, virtualization, netdev, parav, xieyongji, gautam.dawar

On Fri, Jul 29, 2022 at 05:07:11PM +0800, Zhu, Lingshan wrote:
> 
> 
> On 7/29/2022 4:53 PM, Michael S. Tsirkin wrote:
> > On Fri, Jul 01, 2022 at 09:28:26PM +0800, Zhu Lingshan wrote:
> > > > This commit fixes sparse warnings: cast to restricted __le16
> > > in function vdpa_dev_net_config_fill() and
> > > vdpa_fill_stats_rec()
> > > 
> > > Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
> > > ---
> > >   drivers/vdpa/vdpa.c | 6 +++---
> > >   1 file changed, 3 insertions(+), 3 deletions(-)
> > > 
> > > diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c
> > > index 846dd37f3549..ed49fe46a79e 100644
> > > --- a/drivers/vdpa/vdpa.c
> > > +++ b/drivers/vdpa/vdpa.c
> > > @@ -825,11 +825,11 @@ static int vdpa_dev_net_config_fill(struct vdpa_device *vdev, struct sk_buff *ms
> > >   		    config.mac))
> > >   		return -EMSGSIZE;
> > > -	val_u16 = le16_to_cpu(config.status);
> > > +	val_u16 = __virtio16_to_cpu(true, config.status);
> > >   	if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_STATUS, val_u16))
> > >   		return -EMSGSIZE;
> > > -	val_u16 = le16_to_cpu(config.mtu);
> > > +	val_u16 = __virtio16_to_cpu(true, config.mtu);
> > >   	if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MTU, val_u16))
> > >   		return -EMSGSIZE;
> > Wrong on BE platforms with legacy interface, isn't it?
> > We generally don't handle legacy properly in VDPA so it's
> > not a huge deal, but maybe add a comment at least?
> Sure, I can add a comment here: this is for modern devices only.
> 
> Thanks,
> Zhu Lingshan

Hmm. What exactly is the "this" that is for modern devices only?

> > 
> > 
> > > @@ -911,7 +911,7 @@ static int vdpa_fill_stats_rec(struct vdpa_device *vdev, struct sk_buff *msg,
> > >   	}
> > >   	vdpa_get_config_unlocked(vdev, 0, &config, sizeof(config));
> > > -	max_vqp = le16_to_cpu(config.max_virtqueue_pairs);
> > > +	max_vqp = __virtio16_to_cpu(true, config.max_virtqueue_pairs);
> > >   	if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MAX_VQP, max_vqp))
> > >   		return -EMSGSIZE;
> > > -- 
> > > 2.31.1



* Re: [PATCH V3 6/6] vDPA: fix 'cast to restricted le16' warnings in vdpa.c
  2022-07-29  9:17       ` Michael S. Tsirkin
@ 2022-07-29  9:20         ` Zhu, Lingshan
  2022-07-29  9:23           ` Michael S. Tsirkin
  0 siblings, 1 reply; 113+ messages in thread
From: Zhu, Lingshan @ 2022-07-29  9:20 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: jasowang, virtualization, netdev, parav, xieyongji, gautam.dawar



On 7/29/2022 5:17 PM, Michael S. Tsirkin wrote:
> On Fri, Jul 29, 2022 at 05:07:11PM +0800, Zhu, Lingshan wrote:
>>
>> On 7/29/2022 4:53 PM, Michael S. Tsirkin wrote:
>>> On Fri, Jul 01, 2022 at 09:28:26PM +0800, Zhu Lingshan wrote:
>>>> This commit fixes sparse warnings: cast to restricted __le16
>>>> in function vdpa_dev_net_config_fill() and
>>>> vdpa_fill_stats_rec()
>>>>
>>>> Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
>>>> ---
>>>>    drivers/vdpa/vdpa.c | 6 +++---
>>>>    1 file changed, 3 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c
>>>> index 846dd37f3549..ed49fe46a79e 100644
>>>> --- a/drivers/vdpa/vdpa.c
>>>> +++ b/drivers/vdpa/vdpa.c
>>>> @@ -825,11 +825,11 @@ static int vdpa_dev_net_config_fill(struct vdpa_device *vdev, struct sk_buff *ms
>>>>    		    config.mac))
>>>>    		return -EMSGSIZE;
>>>> -	val_u16 = le16_to_cpu(config.status);
>>>> +	val_u16 = __virtio16_to_cpu(true, config.status);
>>>>    	if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_STATUS, val_u16))
>>>>    		return -EMSGSIZE;
>>>> -	val_u16 = le16_to_cpu(config.mtu);
>>>> +	val_u16 = __virtio16_to_cpu(true, config.mtu);
>>>>    	if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MTU, val_u16))
>>>>    		return -EMSGSIZE;
>>> Wrong on BE platforms with legacy interface, isn't it?
>>> We generally don't handle legacy properly in VDPA so it's
>>> not a huge deal, but maybe add a comment at least?
>> Sure, I can add a comment here: this is for modern devices only.
>>
>> Thanks,
>> Zhu Lingshan
> Hmm. what "this" is for modern devices only here?
This cast: it is for modern (LE) devices only.
>
>>>
>>>> @@ -911,7 +911,7 @@ static int vdpa_fill_stats_rec(struct vdpa_device *vdev, struct sk_buff *msg,
>>>>    	}
>>>>    	vdpa_get_config_unlocked(vdev, 0, &config, sizeof(config));
>>>> -	max_vqp = le16_to_cpu(config.max_virtqueue_pairs);
>>>> +	max_vqp = __virtio16_to_cpu(true, config.max_virtqueue_pairs);
>>>>    	if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MAX_VQP, max_vqp))
>>>>    		return -EMSGSIZE;
>>>> -- 
>>>> 2.31.1



* Re: [PATCH V3 6/6] vDPA: fix 'cast to restricted le16' warnings in vdpa.c
  2022-07-29  9:20         ` Zhu, Lingshan
@ 2022-07-29  9:23           ` Michael S. Tsirkin
  2022-07-29  9:35             ` Zhu, Lingshan
  0 siblings, 1 reply; 113+ messages in thread
From: Michael S. Tsirkin @ 2022-07-29  9:23 UTC (permalink / raw)
  To: Zhu, Lingshan
  Cc: jasowang, virtualization, netdev, parav, xieyongji, gautam.dawar

On Fri, Jul 29, 2022 at 05:20:17PM +0800, Zhu, Lingshan wrote:
> 
> 
> On 7/29/2022 5:17 PM, Michael S. Tsirkin wrote:
> > On Fri, Jul 29, 2022 at 05:07:11PM +0800, Zhu, Lingshan wrote:
> > > 
> > > On 7/29/2022 4:53 PM, Michael S. Tsirkin wrote:
> > > > On Fri, Jul 01, 2022 at 09:28:26PM +0800, Zhu Lingshan wrote:
> > > > > > This commit fixes sparse warnings: cast to restricted __le16
> > > > > in function vdpa_dev_net_config_fill() and
> > > > > vdpa_fill_stats_rec()
> > > > > 
> > > > > Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
> > > > > ---
> > > > >    drivers/vdpa/vdpa.c | 6 +++---
> > > > >    1 file changed, 3 insertions(+), 3 deletions(-)
> > > > > 
> > > > > diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c
> > > > > index 846dd37f3549..ed49fe46a79e 100644
> > > > > --- a/drivers/vdpa/vdpa.c
> > > > > +++ b/drivers/vdpa/vdpa.c
> > > > > @@ -825,11 +825,11 @@ static int vdpa_dev_net_config_fill(struct vdpa_device *vdev, struct sk_buff *ms
> > > > >    		    config.mac))
> > > > >    		return -EMSGSIZE;
> > > > > -	val_u16 = le16_to_cpu(config.status);
> > > > > +	val_u16 = __virtio16_to_cpu(true, config.status);
> > > > >    	if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_STATUS, val_u16))
> > > > >    		return -EMSGSIZE;
> > > > > -	val_u16 = le16_to_cpu(config.mtu);
> > > > > +	val_u16 = __virtio16_to_cpu(true, config.mtu);
> > > > >    	if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MTU, val_u16))
> > > > >    		return -EMSGSIZE;
> > > > Wrong on BE platforms with legacy interface, isn't it?
> > > > We generally don't handle legacy properly in VDPA so it's
> > > > not a huge deal, but maybe add a comment at least?
> > > Sure, I can add a comment here: this is for modern devices only.
> > > 
> > > Thanks,
> > > Zhu Lingshan
> > Hmm. what "this" is for modern devices only here?
> this cast, for LE modern devices.

I think status existed in legacy for sure, and it's possible that
some legacy devices backported mtu and max_virtqueue_pairs; otherwise
we would have these fields as __le, not as __virtio, right?

> > 
> > > > 
> > > > > @@ -911,7 +911,7 @@ static int vdpa_fill_stats_rec(struct vdpa_device *vdev, struct sk_buff *msg,
> > > > >    	}
> > > > >    	vdpa_get_config_unlocked(vdev, 0, &config, sizeof(config));
> > > > > -	max_vqp = le16_to_cpu(config.max_virtqueue_pairs);
> > > > > +	max_vqp = __virtio16_to_cpu(true, config.max_virtqueue_pairs);
> > > > >    	if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MAX_VQP, max_vqp))
> > > > >    		return -EMSGSIZE;
> > > > > -- 
> > > > > 2.31.1



* Re: [PATCH V3 6/6] vDPA: fix 'cast to restricted le16' warnings in vdpa.c
  2022-07-29  9:23           ` Michael S. Tsirkin
@ 2022-07-29  9:35             ` Zhu, Lingshan
  2022-07-29  9:39               ` Michael S. Tsirkin
  0 siblings, 1 reply; 113+ messages in thread
From: Zhu, Lingshan @ 2022-07-29  9:35 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: jasowang, virtualization, netdev, parav, xieyongji, gautam.dawar



On 7/29/2022 5:23 PM, Michael S. Tsirkin wrote:
> On Fri, Jul 29, 2022 at 05:20:17PM +0800, Zhu, Lingshan wrote:
>>
>> On 7/29/2022 5:17 PM, Michael S. Tsirkin wrote:
>>> On Fri, Jul 29, 2022 at 05:07:11PM +0800, Zhu, Lingshan wrote:
>>>> On 7/29/2022 4:53 PM, Michael S. Tsirkin wrote:
>>>>> On Fri, Jul 01, 2022 at 09:28:26PM +0800, Zhu Lingshan wrote:
>>>>>> This commit fixes sparse warnings: cast to restricted __le16
>>>>>> in function vdpa_dev_net_config_fill() and
>>>>>> vdpa_fill_stats_rec()
>>>>>>
>>>>>> Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
>>>>>> ---
>>>>>>     drivers/vdpa/vdpa.c | 6 +++---
>>>>>>     1 file changed, 3 insertions(+), 3 deletions(-)
>>>>>>
>>>>>> diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c
>>>>>> index 846dd37f3549..ed49fe46a79e 100644
>>>>>> --- a/drivers/vdpa/vdpa.c
>>>>>> +++ b/drivers/vdpa/vdpa.c
>>>>>> @@ -825,11 +825,11 @@ static int vdpa_dev_net_config_fill(struct vdpa_device *vdev, struct sk_buff *ms
>>>>>>     		    config.mac))
>>>>>>     		return -EMSGSIZE;
>>>>>> -	val_u16 = le16_to_cpu(config.status);
>>>>>> +	val_u16 = __virtio16_to_cpu(true, config.status);
>>>>>>     	if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_STATUS, val_u16))
>>>>>>     		return -EMSGSIZE;
>>>>>> -	val_u16 = le16_to_cpu(config.mtu);
>>>>>> +	val_u16 = __virtio16_to_cpu(true, config.mtu);
>>>>>>     	if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MTU, val_u16))
>>>>>>     		return -EMSGSIZE;
>>>>> Wrong on BE platforms with legacy interface, isn't it?
>>>>> We generally don't handle legacy properly in VDPA so it's
>>>>> not a huge deal, but maybe add a comment at least?
>>>> Sure, I can add a comment here: this is for modern devices only.
>>>>
>>>> Thanks,
>>>> Zhu Lingshan
>>> Hmm. what "this" is for modern devices only here?
>> this cast, for LE modern devices.
> I think status existed in legacy for sure, and it's possible that
> some legacy devices backported mtu and max_virtqueue_pairs otherwise
> we would have these fields as __le not as __virtio, right?
Yes, that's the reason why it is __virtio16 rather than just __le16.

I may have a better solution to detect whether it is LE or BE without a
virtio_dev structure:
check whether vdpa_device->get_device_features() has VIRTIO_F_VERSION_1.
If the device offers _F_VERSION_1, then it is an LE device;
otherwise it is a BE device, and we use __virtio16_to_cpu(false, config.status).

Does this look good?

>
>>>>>> @@ -911,7 +911,7 @@ static int vdpa_fill_stats_rec(struct vdpa_device *vdev, struct sk_buff *msg,
>>>>>>     	}
>>>>>>     	vdpa_get_config_unlocked(vdev, 0, &config, sizeof(config));
>>>>>> -	max_vqp = le16_to_cpu(config.max_virtqueue_pairs);
>>>>>> +	max_vqp = __virtio16_to_cpu(true, config.max_virtqueue_pairs);
>>>>>>     	if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MAX_VQP, max_vqp))
>>>>>>     		return -EMSGSIZE;
>>>>>> -- 
>>>>>> 2.31.1



* Re: [PATCH V3 6/6] vDPA: fix 'cast to restricted le16' warnings in vdpa.c
  2022-07-29  9:35             ` Zhu, Lingshan
@ 2022-07-29  9:39               ` Michael S. Tsirkin
  2022-07-29 10:01                 ` Zhu, Lingshan
  2022-08-01  4:33                 ` Jason Wang
  0 siblings, 2 replies; 113+ messages in thread
From: Michael S. Tsirkin @ 2022-07-29  9:39 UTC (permalink / raw)
  To: Zhu, Lingshan
  Cc: jasowang, virtualization, netdev, parav, xieyongji, gautam.dawar

On Fri, Jul 29, 2022 at 05:35:09PM +0800, Zhu, Lingshan wrote:
> 
> 
> On 7/29/2022 5:23 PM, Michael S. Tsirkin wrote:
> > On Fri, Jul 29, 2022 at 05:20:17PM +0800, Zhu, Lingshan wrote:
> > > 
> > > On 7/29/2022 5:17 PM, Michael S. Tsirkin wrote:
> > > > On Fri, Jul 29, 2022 at 05:07:11PM +0800, Zhu, Lingshan wrote:
> > > > > On 7/29/2022 4:53 PM, Michael S. Tsirkin wrote:
> > > > > > On Fri, Jul 01, 2022 at 09:28:26PM +0800, Zhu Lingshan wrote:
> > > > > > > > This commit fixes sparse warnings: cast to restricted __le16
> > > > > > > in function vdpa_dev_net_config_fill() and
> > > > > > > vdpa_fill_stats_rec()
> > > > > > > 
> > > > > > > Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
> > > > > > > ---
> > > > > > >     drivers/vdpa/vdpa.c | 6 +++---
> > > > > > >     1 file changed, 3 insertions(+), 3 deletions(-)
> > > > > > > 
> > > > > > > diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c
> > > > > > > index 846dd37f3549..ed49fe46a79e 100644
> > > > > > > --- a/drivers/vdpa/vdpa.c
> > > > > > > +++ b/drivers/vdpa/vdpa.c
> > > > > > > @@ -825,11 +825,11 @@ static int vdpa_dev_net_config_fill(struct vdpa_device *vdev, struct sk_buff *ms
> > > > > > >     		    config.mac))
> > > > > > >     		return -EMSGSIZE;
> > > > > > > -	val_u16 = le16_to_cpu(config.status);
> > > > > > > +	val_u16 = __virtio16_to_cpu(true, config.status);
> > > > > > >     	if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_STATUS, val_u16))
> > > > > > >     		return -EMSGSIZE;
> > > > > > > -	val_u16 = le16_to_cpu(config.mtu);
> > > > > > > +	val_u16 = __virtio16_to_cpu(true, config.mtu);
> > > > > > >     	if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MTU, val_u16))
> > > > > > >     		return -EMSGSIZE;
> > > > > > Wrong on BE platforms with legacy interface, isn't it?
> > > > > > We generally don't handle legacy properly in VDPA so it's
> > > > > > not a huge deal, but maybe add a comment at least?
> > > > > Sure, I can add a comment here: this is for modern devices only.
> > > > > 
> > > > > Thanks,
> > > > > Zhu Lingshan
> > > > Hmm. what "this" is for modern devices only here?
> > > this cast, for LE modern devices.
> > I think status existed in legacy for sure, and it's possible that
> > some legacy devices backported mtu and max_virtqueue_pairs otherwise
> > we would have these fields as __le not as __virtio, right?
> yes, that's the reason why it is virtio_16 than just le16.
> 
> I may find a better solution to detect whether it is LE, or BE without a
> virtio_dev structure.
> Check whether vdpa_device->get_device_features() has VIRTIO_F_VERSION_1. If
> the device offers _F_VERSION_1, then it is a LE device,
> or it is a BE device, then we use __virtio16_to_cpu(false, config.status).
> 
> Does this look good?

No, since the problematic case is a legacy driver with a transitional
device.  I don't have a good idea yet. vhost has VHOST_SET_VRING_ENDIAN
and maybe we need something like this for config as well?

> > 
> > > > > > > @@ -911,7 +911,7 @@ static int vdpa_fill_stats_rec(struct vdpa_device *vdev, struct sk_buff *msg,
> > > > > > >     	}
> > > > > > >     	vdpa_get_config_unlocked(vdev, 0, &config, sizeof(config));
> > > > > > > -	max_vqp = le16_to_cpu(config.max_virtqueue_pairs);
> > > > > > > +	max_vqp = __virtio16_to_cpu(true, config.max_virtqueue_pairs);
> > > > > > >     	if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MAX_VQP, max_vqp))
> > > > > > >     		return -EMSGSIZE;
> > > > > > > -- 
> > > > > > > 2.31.1



* Re: [PATCH V3 6/6] vDPA: fix 'cast to restricted le16' warnings in vdpa.c
  2022-07-29  9:39               ` Michael S. Tsirkin
@ 2022-07-29 10:01                 ` Zhu, Lingshan
  2022-07-29 10:16                   ` Michael S. Tsirkin
  2022-08-01  4:33                 ` Jason Wang
  1 sibling, 1 reply; 113+ messages in thread
From: Zhu, Lingshan @ 2022-07-29 10:01 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: jasowang, virtualization, netdev, parav, xieyongji, gautam.dawar



On 7/29/2022 5:39 PM, Michael S. Tsirkin wrote:
> On Fri, Jul 29, 2022 at 05:35:09PM +0800, Zhu, Lingshan wrote:
>>
>> On 7/29/2022 5:23 PM, Michael S. Tsirkin wrote:
>>> On Fri, Jul 29, 2022 at 05:20:17PM +0800, Zhu, Lingshan wrote:
>>>> On 7/29/2022 5:17 PM, Michael S. Tsirkin wrote:
>>>>> On Fri, Jul 29, 2022 at 05:07:11PM +0800, Zhu, Lingshan wrote:
>>>>>> On 7/29/2022 4:53 PM, Michael S. Tsirkin wrote:
>>>>>>> On Fri, Jul 01, 2022 at 09:28:26PM +0800, Zhu Lingshan wrote:
>>>>>>>> This commit fixes sparse warnings: cast to restricted __le16
>>>>>>>> in function vdpa_dev_net_config_fill() and
>>>>>>>> vdpa_fill_stats_rec()
>>>>>>>>
>>>>>>>> Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
>>>>>>>> ---
>>>>>>>>      drivers/vdpa/vdpa.c | 6 +++---
>>>>>>>>      1 file changed, 3 insertions(+), 3 deletions(-)
>>>>>>>>
>>>>>>>> diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c
>>>>>>>> index 846dd37f3549..ed49fe46a79e 100644
>>>>>>>> --- a/drivers/vdpa/vdpa.c
>>>>>>>> +++ b/drivers/vdpa/vdpa.c
>>>>>>>> @@ -825,11 +825,11 @@ static int vdpa_dev_net_config_fill(struct vdpa_device *vdev, struct sk_buff *ms
>>>>>>>>      		    config.mac))
>>>>>>>>      		return -EMSGSIZE;
>>>>>>>> -	val_u16 = le16_to_cpu(config.status);
>>>>>>>> +	val_u16 = __virtio16_to_cpu(true, config.status);
>>>>>>>>      	if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_STATUS, val_u16))
>>>>>>>>      		return -EMSGSIZE;
>>>>>>>> -	val_u16 = le16_to_cpu(config.mtu);
>>>>>>>> +	val_u16 = __virtio16_to_cpu(true, config.mtu);
>>>>>>>>      	if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MTU, val_u16))
>>>>>>>>      		return -EMSGSIZE;
>>>>>>> Wrong on BE platforms with legacy interface, isn't it?
>>>>>>> We generally don't handle legacy properly in VDPA so it's
>>>>>>> not a huge deal, but maybe add a comment at least?
>>>>>> Sure, I can add a comment here: this is for modern devices only.
>>>>>>
>>>>>> Thanks,
>>>>>> Zhu Lingshan
>>>>> Hmm. what "this" is for modern devices only here?
>>>> this cast, for LE modern devices.
>>> I think status existed in legacy for sure, and it's possible that
>>> some legacy devices backported mtu and max_virtqueue_pairs otherwise
>>> we would have these fields as __le not as __virtio, right?
>> yes, that's the reason why it is virtio_16 than just le16.
>>
>> I may find a better solution to detect whether it is LE, or BE without a
>> virtio_dev structure.
>> Check whether vdpa_device->get_device_features() has VIRTIO_F_VERSION_1. If
>> the device offers _F_VERSION_1, then it is a LE device,
>> or it is a BE device, then we use __virtio16_to_cpu(false, config.status).
>>
>> Does this look good?
> No since the question is can be a legacy driver with a transitional
> device.  I don't have a good idea yet. vhost has VHOST_SET_VRING_ENDIAN
> and maybe we need something like this for config as well?
Would implementing vdpa_ops.get_endian() be a little overkill?
>
>>>>>>>> @@ -911,7 +911,7 @@ static int vdpa_fill_stats_rec(struct vdpa_device *vdev, struct sk_buff *msg,
>>>>>>>>      	}
>>>>>>>>      	vdpa_get_config_unlocked(vdev, 0, &config, sizeof(config));
>>>>>>>> -	max_vqp = le16_to_cpu(config.max_virtqueue_pairs);
>>>>>>>> +	max_vqp = __virtio16_to_cpu(true, config.max_virtqueue_pairs);
>>>>>>>>      	if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MAX_VQP, max_vqp))
>>>>>>>>      		return -EMSGSIZE;
>>>>>>>> -- 
>>>>>>>> 2.31.1



* Re: [PATCH V3 6/6] vDPA: fix 'cast to restricted le16' warnings in vdpa.c
  2022-07-29 10:01                 ` Zhu, Lingshan
@ 2022-07-29 10:16                   ` Michael S. Tsirkin
  2022-07-29 10:18                     ` Zhu, Lingshan
  0 siblings, 1 reply; 113+ messages in thread
From: Michael S. Tsirkin @ 2022-07-29 10:16 UTC (permalink / raw)
  To: Zhu, Lingshan
  Cc: jasowang, virtualization, netdev, parav, xieyongji, gautam.dawar

On Fri, Jul 29, 2022 at 06:01:38PM +0800, Zhu, Lingshan wrote:
> 
> 
> On 7/29/2022 5:39 PM, Michael S. Tsirkin wrote:
> > On Fri, Jul 29, 2022 at 05:35:09PM +0800, Zhu, Lingshan wrote:
> > > 
> > > On 7/29/2022 5:23 PM, Michael S. Tsirkin wrote:
> > > > On Fri, Jul 29, 2022 at 05:20:17PM +0800, Zhu, Lingshan wrote:
> > > > > On 7/29/2022 5:17 PM, Michael S. Tsirkin wrote:
> > > > > > On Fri, Jul 29, 2022 at 05:07:11PM +0800, Zhu, Lingshan wrote:
> > > > > > > On 7/29/2022 4:53 PM, Michael S. Tsirkin wrote:
> > > > > > > > On Fri, Jul 01, 2022 at 09:28:26PM +0800, Zhu Lingshan wrote:
> > > > > > > > > > This commit fixes sparse warnings: cast to restricted __le16
> > > > > > > > > in function vdpa_dev_net_config_fill() and
> > > > > > > > > vdpa_fill_stats_rec()
> > > > > > > > > 
> > > > > > > > > Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
> > > > > > > > > ---
> > > > > > > > >      drivers/vdpa/vdpa.c | 6 +++---
> > > > > > > > >      1 file changed, 3 insertions(+), 3 deletions(-)
> > > > > > > > > 
> > > > > > > > > diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c
> > > > > > > > > index 846dd37f3549..ed49fe46a79e 100644
> > > > > > > > > --- a/drivers/vdpa/vdpa.c
> > > > > > > > > +++ b/drivers/vdpa/vdpa.c
> > > > > > > > > @@ -825,11 +825,11 @@ static int vdpa_dev_net_config_fill(struct vdpa_device *vdev, struct sk_buff *ms
> > > > > > > > >      		    config.mac))
> > > > > > > > >      		return -EMSGSIZE;
> > > > > > > > > -	val_u16 = le16_to_cpu(config.status);
> > > > > > > > > +	val_u16 = __virtio16_to_cpu(true, config.status);
> > > > > > > > >      	if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_STATUS, val_u16))
> > > > > > > > >      		return -EMSGSIZE;
> > > > > > > > > -	val_u16 = le16_to_cpu(config.mtu);
> > > > > > > > > +	val_u16 = __virtio16_to_cpu(true, config.mtu);
> > > > > > > > >      	if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MTU, val_u16))
> > > > > > > > >      		return -EMSGSIZE;
> > > > > > > > Wrong on BE platforms with legacy interface, isn't it?
> > > > > > > > We generally don't handle legacy properly in VDPA so it's
> > > > > > > > not a huge deal, but maybe add a comment at least?
> > > > > > > Sure, I can add a comment here: this is for modern devices only.
> > > > > > > 
> > > > > > > Thanks,
> > > > > > > Zhu Lingshan
> > > > > > Hmm. what "this" is for modern devices only here?
> > > > > this cast, for LE modern devices.
> > > > I think status existed in legacy for sure, and it's possible that
> > > > some legacy devices backported mtu and max_virtqueue_pairs otherwise
> > > > we would have these fields as __le not as __virtio, right?
> > > yes, that's the reason why it is virtio_16 than just le16.
> > > 
> > > I may find a better solution to detect whether it is LE, or BE without a
> > > virtio_dev structure.
> > > Check whether vdpa_device->get_device_features() has VIRTIO_F_VERSION_1. If
> > > the device offers _F_VERSION_1, then it is a LE device,
> > > or it is a BE device, then we use __virtio16_to_cpu(false, config.status).
> > > 
> > > Does this look good?
> > No since the question is can be a legacy driver with a transitional
> > device.  I don't have a good idea yet. vhost has VHOST_SET_VRING_ENDIAN
> > and maybe we need something like this for config as well?
> Is it a little overkill to implementing vdpa_ops.get_endian()?

I think the real question is the driver's endianness.

But another approach is simply to say that userspace should
fix up the config endianness itself.  Let's just say that in the comment?
/*
 * Assume little endian for now, userspace can tweak this for
 * legacy guest support.
 */
?

> > 
> > > > > > > > > @@ -911,7 +911,7 @@ static int vdpa_fill_stats_rec(struct vdpa_device *vdev, struct sk_buff *msg,
> > > > > > > > >      	}
> > > > > > > > >      	vdpa_get_config_unlocked(vdev, 0, &config, sizeof(config));
> > > > > > > > > -	max_vqp = le16_to_cpu(config.max_virtqueue_pairs);
> > > > > > > > > +	max_vqp = __virtio16_to_cpu(true, config.max_virtqueue_pairs);
> > > > > > > > >      	if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MAX_VQP, max_vqp))
> > > > > > > > >      		return -EMSGSIZE;
> > > > > > > > > -- 
> > > > > > > > > 2.31.1



* Re: [PATCH V3 6/6] vDPA: fix 'cast to restricted le16' warnings in vdpa.c
  2022-07-29 10:16                   ` Michael S. Tsirkin
@ 2022-07-29 10:18                     ` Zhu, Lingshan
  0 siblings, 0 replies; 113+ messages in thread
From: Zhu, Lingshan @ 2022-07-29 10:18 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: jasowang, virtualization, netdev, parav, xieyongji, gautam.dawar



On 7/29/2022 6:16 PM, Michael S. Tsirkin wrote:
> On Fri, Jul 29, 2022 at 06:01:38PM +0800, Zhu, Lingshan wrote:
>>
>> On 7/29/2022 5:39 PM, Michael S. Tsirkin wrote:
>>> On Fri, Jul 29, 2022 at 05:35:09PM +0800, Zhu, Lingshan wrote:
>>>> On 7/29/2022 5:23 PM, Michael S. Tsirkin wrote:
>>>>> On Fri, Jul 29, 2022 at 05:20:17PM +0800, Zhu, Lingshan wrote:
>>>>>> On 7/29/2022 5:17 PM, Michael S. Tsirkin wrote:
>>>>>>> On Fri, Jul 29, 2022 at 05:07:11PM +0800, Zhu, Lingshan wrote:
>>>>>>>> On 7/29/2022 4:53 PM, Michael S. Tsirkin wrote:
>>>>>>>>> On Fri, Jul 01, 2022 at 09:28:26PM +0800, Zhu Lingshan wrote:
>>>>>>>>>> This commit fixes sparse warnings: cast to restricted __le16
>>>>>>>>>> in function vdpa_dev_net_config_fill() and
>>>>>>>>>> vdpa_fill_stats_rec()
>>>>>>>>>>
>>>>>>>>>> Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
>>>>>>>>>> ---
>>>>>>>>>>       drivers/vdpa/vdpa.c | 6 +++---
>>>>>>>>>>       1 file changed, 3 insertions(+), 3 deletions(-)
>>>>>>>>>>
>>>>>>>>>> diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c
>>>>>>>>>> index 846dd37f3549..ed49fe46a79e 100644
>>>>>>>>>> --- a/drivers/vdpa/vdpa.c
>>>>>>>>>> +++ b/drivers/vdpa/vdpa.c
>>>>>>>>>> @@ -825,11 +825,11 @@ static int vdpa_dev_net_config_fill(struct vdpa_device *vdev, struct sk_buff *ms
>>>>>>>>>>       		    config.mac))
>>>>>>>>>>       		return -EMSGSIZE;
>>>>>>>>>> -	val_u16 = le16_to_cpu(config.status);
>>>>>>>>>> +	val_u16 = __virtio16_to_cpu(true, config.status);
>>>>>>>>>>       	if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_STATUS, val_u16))
>>>>>>>>>>       		return -EMSGSIZE;
>>>>>>>>>> -	val_u16 = le16_to_cpu(config.mtu);
>>>>>>>>>> +	val_u16 = __virtio16_to_cpu(true, config.mtu);
>>>>>>>>>>       	if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MTU, val_u16))
>>>>>>>>>>       		return -EMSGSIZE;
>>>>>>>>> Wrong on BE platforms with legacy interface, isn't it?
>>>>>>>>> We generally don't handle legacy properly in VDPA so it's
>>>>>>>>> not a huge deal, but maybe add a comment at least?
>>>>>>>> Sure, I can add a comment here: this is for modern devices only.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Zhu Lingshan
>>>>>>> Hmm. what "this" is for modern devices only here?
>>>>>> this cast, for LE modern devices.
>>>>> I think status existed in legacy for sure, and it's possible that
>>>>> some legacy devices backported mtu and max_virtqueue_pairs otherwise
>>>>> we would have these fields as __le not as __virtio, right?
>>>> yes, that's the reason why it is virtio_16 than just le16.
>>>>
>>>> I may find a better solution to detect whether it is LE, or BE without a
>>>> virtio_dev structure.
>>>> Check whether vdpa_device->get_device_features() has VIRTIO_F_VERSION_1. If
>>>> the device offers _F_VERSION_1, then it is a LE device,
>>>> or it is a BE device, then we use __virtio16_to_cpu(false, config.status).
>>>>
>>>> Does this look good?
>>> No since the question is can be a legacy driver with a transitional
>>> device.  I don't have a good idea yet. vhost has VHOST_SET_VRING_ENDIAN
>>> and maybe we need something like this for config as well?
>> Is it a little overkill to implementing vdpa_ops.get_endian()?
> I think the question is driver endian-ness.
>
> But another approach is really just to say userspace should
> tweak config endian itself.  Let's just say that in the comment?
> /*
>   * Assume little endian for now, userspace can tweak this for
>   * legacy guest support.
>   */
> ?
Oh, yes, userspace can tweak the value!

I will add this comment, thanks!
>>>>>>>>>> @@ -911,7 +911,7 @@ static int vdpa_fill_stats_rec(struct vdpa_device *vdev, struct sk_buff *msg,
>>>>>>>>>>       	}
>>>>>>>>>>       	vdpa_get_config_unlocked(vdev, 0, &config, sizeof(config));
>>>>>>>>>> -	max_vqp = le16_to_cpu(config.max_virtqueue_pairs);
>>>>>>>>>> +	max_vqp = __virtio16_to_cpu(true, config.max_virtqueue_pairs);
>>>>>>>>>>       	if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MAX_VQP, max_vqp))
>>>>>>>>>>       		return -EMSGSIZE;
>>>>>>>>>> -- 
>>>>>>>>>> 2.31.1



* Re: [PATCH V3 4/6] vDPA: !FEATURES_OK should not block querying device config space
       [not found]               ` <c143e2da-208e-b046-9b8f-1780f75ed3e6@intel.com>
@ 2022-07-29 20:55                 ` Si-Wei Liu
  2022-08-01  4:44                   ` Jason Wang
  0 siblings, 1 reply; 113+ messages in thread
From: Si-Wei Liu @ 2022-07-29 20:55 UTC (permalink / raw)
  To: Zhu, Lingshan, Parav Pandit, jasowang, mst, Eli Cohen
  Cc: netdev, xieyongji, gautam.dawar, virtualization



On 7/28/2022 7:04 PM, Zhu, Lingshan wrote:
>
>
> On 7/29/2022 5:48 AM, Si-Wei Liu wrote:
>>
>>
>> On 7/27/2022 7:43 PM, Zhu, Lingshan wrote:
>>>
>>>
>>> On 7/28/2022 8:56 AM, Si-Wei Liu wrote:
>>>>
>>>>
>>>> On 7/27/2022 4:47 AM, Zhu, Lingshan wrote:
>>>>>
>>>>>
>>>>> On 7/27/2022 5:43 PM, Si-Wei Liu wrote:
>>>>>> Sorry to chime in late in the game. For some reason I couldn't 
>>>>>> get to most emails for this discussion (I only subscribed to the 
>>>>>> virtualization list), while I was taking off amongst the past few 
>>>>>> weeks.
>>>>>>
>>>>>> It looks to me this patch is incomplete. Noted down the way in 
>>>>>> vdpa_dev_net_config_fill(), we have the following:
>>>>>>          features = vdev->config->get_driver_features(vdev);
>>>>>>          if (nla_put_u64_64bit(msg, VDPA_ATTR_DEV_NEGOTIATED_FEATURES, features,
>>>>>>                                VDPA_ATTR_PAD))
>>>>>>                  return -EMSGSIZE;
>>>>>>
>>>>>> Making call to .get_driver_features() doesn't make sense when 
>>>>>> feature negotiation isn't complete. Neither should present 
>>>>>> negotiated_features to userspace before negotiation is done.
>>>>>>
>>>>>> Similarly, max_vqp through vdpa_dev_net_mq_config_fill() probably 
>>>>>> should not show before negotiation is done - it depends on driver 
>>>>>> features negotiated.
>>>>> I have another patch in this series introduces device_features and 
>>>>> will report device_features to the userspace even features 
>>>>> negotiation not done. Because the spec says we should allow driver 
>>>>> access the config space before FEATURES_OK.
>>>> The config space can be accessed by guest before features_ok 
>>>> doesn't necessarily mean the value is valid. You may want to double 
>>>> check with Michael for what he quoted earlier:
>>> that's why I proposed to fix these issues, e.g., if no _F_MAC, vDPA 
>>> kernel should not return a mac to the userspace, there is not a 
>>> default value for mac.
>> Then please show us the code, as I can only comment based on your 
>> latest (v4) patch and it was not there.. To be honest, I don't 
>> understand the motivation and the use cases you have, is it for 
>> debugging/monitoring or there's really a use case for live migration? 
>> For the former, you can do a direct dump on all config space fields 
>> regardless of endianness and feature negotiation without having to
>> worry about validity (meaningful to present to admin user). To me
>> these are conflicting asks that are impossible to mix in exactly one command.
> This bug was revealed just two days ago, and you will see the patch soon.
>
> There are some things to clarify:
> 1) we need to read the device features, or how can you pick a proper 
> LM destination
> 2) vdpa dev config show can show both device features and driver 
> features, there just need a patch for iproute2
> 3) To process information like MQ, we don't just dump the config 
> space, MST has explained before
So, it's for live migration... Then why not export those config 
parameters specified for vdpa creation (as well as device feature bits) 
to the output of "vdpa dev show" command? That's where device side 
config lives and is static across vdpa's life cycle. "vdpa dev config 
show" is mostly for dynamic driver side config, and the validity is 
subject to feature negotiation. I suppose this should suit your need of 
LM, e.g.

$ vdpa dev add name vdpa1 mgmtdev pci/0000:41:04.2 max_vqp 7 mtu 2000
$ vdpa dev show vdpa1
vdpa1: type network mgmtdev pci/0000:41:04.2 vendor_id 5555 max_vqs 15 max_vq_size 256
   max_vqp 7 mtu 2000
   dev_features CSUM GUEST_CSUM MTU HOST_TSO4 HOST_TSO6 STATUS CTRL_VQ MQ CTRL_MAC_ADDR VERSION_1 RING_PACKED

For it to work, you'd want to pass "struct vdpa_dev_set_config" to 
_vdpa_register_device() during registration, and get it saved there in 
"struct vdpa_device". Then in vdpa_dev_fill() show each field 
conditionally subject to "struct vdpa_dev_set_config.mask".
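
The mask-driven output could look roughly like the userspace sketch below.
It is an illustration under assumptions, not the kernel implementation:
struct set_config is a simplified stand-in for the saved creation-time
config, the CFG_HAS_* bits and both function names are hypothetical, and the
real code would emit netlink attributes rather than text.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical mask bits, one per optional creation-time field. */
#define CFG_HAS_MAX_VQP (1ULL << 0)
#define CFG_HAS_MTU     (1ULL << 1)

/* Simplified stand-in for the creation-time config saved at registration. */
struct set_config {
	uint16_t max_vq_pairs;
	uint16_t mtu;
	uint64_t mask;	/* which fields were given at "vdpa dev add" time */
};

/* Append "name value" to buf, space-separated; returns the new length. */
size_t put_field(char *buf, size_t len, size_t off, const char *name, unsigned int val)
{
	int n = snprintf(buf + off, len - off, "%s%s %u", off ? " " : "", name, val);

	assert(n >= 0 && (size_t)n < len - off);	/* sketch: no truncation handling */
	return off + (size_t)n;
}

/* Emit only the fields whose mask bit is set; returns the text length. */
size_t show_set_config(const struct set_config *cfg, char *buf, size_t len)
{
	size_t off = 0;

	assert(cfg != NULL && buf != NULL && len > 0);
	memset(buf, 0, len);
	if (cfg->mask & CFG_HAS_MAX_VQP)
		off = put_field(buf, len, off, "max_vqp", cfg->max_vq_pairs);
	if (cfg->mask & CFG_HAS_MTU)
		off = put_field(buf, len, off, "mtu", cfg->mtu);
	return off;
}
```

Fields that were never set at creation time are simply omitted, so the output
stays the same across the device's life cycle regardless of feature negotiation.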

Thanks,
-Siwei
>
> Thanks
> Zhu Lingshan
>>
>>>>> Nope:
>>>>>
>>>>> 2.5.1  Driver Requirements: Device Configuration Space
>>>>>
>>>>> ...
>>>>>
>>>>> For optional configuration space fields, the driver MUST check that the corresponding feature is offered
>>>>> before accessing that part of the configuration space.
>>>>
>>>> and how many driver bugs taking wrong assumption of the validity of 
>>>> config space field without features_ok. I am not sure what use case 
>>>> you want to expose config register values for before features_ok,
>>>> if it's mostly for live migration I guess it's probably heading a 
>>>> wrong direction.
>>>>
>>>>
>>>>>>
>>>>>>
>>>>>> Last but not the least, this "vdpa dev config" command was not 
>>>>>> designed to display the real config space register values in the 
>>>>>> first place. Quoting the vdpa-dev(8) man page:
>>>>>>
>>>>>>> vdpa dev config show - Show configuration of specific device or 
>>>>>>> all devices.
>>>>>>> DEV - specifies the vdpa device to show its configuration. If 
>>>>>>> this argument is omitted all devices configuration is listed.
>>>>>> It doesn't say anything about configuration space or register 
>>>>>> values in config space. As long as it can convey the config 
>>>>>> attribute when instantiating vDPA device instance, and more 
>>>>>> importantly, the config can be easily imported from or exported 
>>>>>> to userspace tools when trying to reconstruct vdpa instance 
>>>>>> intact on destination host for live migration, IMHO in my 
>>>>>> personal interpretation it doesn't matter what the config space 
>>>>>> may present. It may be worth while adding a new debug command to 
>>>>>> expose the real register value, but that's another story.
>>>>> I am not sure getting your points. vDPA now reports device feature 
>>>>> bits(device_features) and negotiated feature 
>>>>> bits(driver_features), and yes, the drivers features can be a 
>>>>> subset of the device features; and the vDPA device features can be 
>>>>> a subset of the management device features.
>>>> What I said is after unblocking the conditional check, you'd have 
>>>> to handle the case for each of the vdpa attribute when feature 
>>>> negotiation is not yet done: basically the register values you got 
>>>> from config space via the vdpa_get_config_unlocked() call is not 
>>>> considered to be valid before features_ok (per-spec). Although in 
>>>> some case you may get sane value, such behavior is generally 
>>>> undefined. If you desire to show just the device_features alone 
>>>> without any config space field, which the device had advertised 
>>>> *before feature negotiation is complete*, that'll be fine. But 
>>>> looks to me this is not how patch has been implemented. Probably 
>>>> need some more work?
>>> They are driver_features(negotiated) and the device_features(which 
>>> comes with the device), and the config space fields that depend on 
>>> them. In this series, we report both to the userspace.
>> I fail to understand what you want to present from your description. 
>> May be worth showing some example outputs that at least include the 
>> following cases: 1) when device offers features but not yet 
>> acknowledge by guest 2) when guest acknowledged features and device 
>> is yet to accept 3) after guest feature negotiation is completed 
>> (agreed upon between guest and device).
> Only two feature sets: 1) what the device has. (2) what is negotiated
>>
>> Thanks,
>> -Siwei
>>>>
>>>> Regards,
>>>> -Siwei
>>>>
>>>>>>
>>>>>> Having said, please consider to drop the Fixes tag, as appears to 
>>>>>> me you're proposing a new feature rather than fixing a real issue.
>>>>> it's a new feature to report the device feature bits than only 
>>>>> negotiated features, however this patch is a must, or it will 
>>>>> block the device feature bits reporting. but I agree, the fix tag 
>>>>> is not a must.
>>>>>>
>>>>>> Thanks,
>>>>>> -Siwei
>>>>>>
>>>>>> On 7/1/2022 3:12 PM, Parav Pandit via Virtualization wrote:
>>>>>>>> From: Zhu Lingshan<lingshan.zhu@intel.com>
>>>>>>>> Sent: Friday, July 1, 2022 9:28 AM
>>>>>>>>
>>>>>>>> Users may want to query the config space of a vDPA device, to choose a
>>>>>>>> appropriate one for a certain guest. This means the users need to read the
>>>>>>>> config space before FEATURES_OK, and the existence of config space
>>>>>>>> contents does not depend on FEATURES_OK.
>>>>>>>>
>>>>>>>> The spec says:
>>>>>>>> The device MUST allow reading of any device-specific configuration field
>>>>>>>> before FEATURES_OK is set by the driver. This includes fields which are
>>>>>>>> conditional on feature bits, as long as those feature bits are offered by the
>>>>>>>> device.
>>>>>>>>
>>>>>>>> Fixes: 30ef7a8ac8a07 (vdpa: Read device configuration only if FEATURES_OK)
>>>>>>> Fix is fine, but fixes tag needs correction described below.
>>>>>>>
>>>>>>> Above commit id is 13 letters should be 12.
>>>>>>> And
>>>>>>> It should be in format
>>>>>>> Fixes: 30ef7a8ac8a0 ("vdpa: Read device configuration only if FEATURES_OK")
>>>>>>>
>>>>>>> Please use checkpatch.pl script before posting the patches to catch these errors.
>>>>>>> There is a bot that looks at the fixes tag and identifies the right kernel version to apply this fix.
>>>>>>>
>>>>>>>> Signed-off-by: Zhu Lingshan<lingshan.zhu@intel.com>
>>>>>>>> ---
>>>>>>>>   drivers/vdpa/vdpa.c | 8 --------
>>>>>>>>   1 file changed, 8 deletions(-)
>>>>>>>>
>>>>>>>> diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c index
>>>>>>>> 9b0e39b2f022..d76b22b2f7ae 100644
>>>>>>>> --- a/drivers/vdpa/vdpa.c
>>>>>>>> +++ b/drivers/vdpa/vdpa.c
>>>>>>>> @@ -851,17 +851,9 @@ vdpa_dev_config_fill(struct vdpa_device *vdev,
>>>>>>>> struct sk_buff *msg, u32 portid,  {
>>>>>>>>   	u32 device_id;
>>>>>>>>   	void *hdr;
>>>>>>>> -	u8 status;
>>>>>>>>   	int err;
>>>>>>>>
>>>>>>>>   	down_read(&vdev->cf_lock);
>>>>>>>> -	status = vdev->config->get_status(vdev);
>>>>>>>> -	if (!(status & VIRTIO_CONFIG_S_FEATURES_OK)) {
>>>>>>>> -		NL_SET_ERR_MSG_MOD(extack, "Features negotiation not
>>>>>>>> completed");
>>>>>>>> -		err = -EAGAIN;
>>>>>>>> -		goto out;
>>>>>>>> -	}
>>>>>>>> -
>>>>>>>>   	hdr = genlmsg_put(msg, portid, seq, &vdpa_nl_family, flags,
>>>>>>>>   			  VDPA_CMD_DEV_CONFIG_GET);
>>>>>>>>   	if (!hdr) {
>>>>>>>> --
>>>>>>>> 2.31.1
>>>>>>> _______________________________________________
>>>>>>> Virtualization mailing list
>>>>>>> Virtualization@lists.linux-foundation.org
>>>>>>> https://lists.linuxfoundation.org/mailman/listinfo/virtualization
>>>>>>
>>>>>
>>>>
>>>
>>
>



* Re: [PATCH V3 6/6] vDPA: fix 'cast to restricted le16' warnings in vdpa.c
  2022-07-29  9:39               ` Michael S. Tsirkin
  2022-07-29 10:01                 ` Zhu, Lingshan
@ 2022-08-01  4:33                 ` Jason Wang
  2022-08-01  6:25                   ` Michael S. Tsirkin
  1 sibling, 1 reply; 113+ messages in thread
From: Jason Wang @ 2022-08-01  4:33 UTC (permalink / raw)
  To: Michael S. Tsirkin, Zhu, Lingshan
  Cc: virtualization, netdev, parav, xieyongji, gautam.dawar


On 2022/7/29 17:39, Michael S. Tsirkin wrote:
> On Fri, Jul 29, 2022 at 05:35:09PM +0800, Zhu, Lingshan wrote:
>>
>> On 7/29/2022 5:23 PM, Michael S. Tsirkin wrote:
>>> On Fri, Jul 29, 2022 at 05:20:17PM +0800, Zhu, Lingshan wrote:
>>>> On 7/29/2022 5:17 PM, Michael S. Tsirkin wrote:
>>>>> On Fri, Jul 29, 2022 at 05:07:11PM +0800, Zhu, Lingshan wrote:
>>>>>> On 7/29/2022 4:53 PM, Michael S. Tsirkin wrote:
>>>>>>> On Fri, Jul 01, 2022 at 09:28:26PM +0800, Zhu Lingshan wrote:
>>>>>>>> This commit fixes sparse warnings: cast to restricted __le16
>>>>>>>> in function vdpa_dev_net_config_fill() and
>>>>>>>> vdpa_fill_stats_rec()
>>>>>>>>
>>>>>>>> Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
>>>>>>>> ---
>>>>>>>>      drivers/vdpa/vdpa.c | 6 +++---
>>>>>>>>      1 file changed, 3 insertions(+), 3 deletions(-)
>>>>>>>>
>>>>>>>> diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c
>>>>>>>> index 846dd37f3549..ed49fe46a79e 100644
>>>>>>>> --- a/drivers/vdpa/vdpa.c
>>>>>>>> +++ b/drivers/vdpa/vdpa.c
>>>>>>>> @@ -825,11 +825,11 @@ static int vdpa_dev_net_config_fill(struct vdpa_device *vdev, struct sk_buff *ms
>>>>>>>>      		    config.mac))
>>>>>>>>      		return -EMSGSIZE;
>>>>>>>> -	val_u16 = le16_to_cpu(config.status);
>>>>>>>> +	val_u16 = __virtio16_to_cpu(true, config.status);
>>>>>>>>      	if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_STATUS, val_u16))
>>>>>>>>      		return -EMSGSIZE;
>>>>>>>> -	val_u16 = le16_to_cpu(config.mtu);
>>>>>>>> +	val_u16 = __virtio16_to_cpu(true, config.mtu);
>>>>>>>>      	if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MTU, val_u16))
>>>>>>>>      		return -EMSGSIZE;
>>>>>>> Wrong on BE platforms with legacy interface, isn't it?
>>>>>>> We generally don't handle legacy properly in VDPA so it's
>>>>>>> not a huge deal, but maybe add a comment at least?
>>>>>> Sure, I can add a comment here: this is for modern devices only.
>>>>>>
>>>>>> Thanks,
>>>>>> Zhu Lingshan
>>>>> Hmm. what "this" is for modern devices only here?
>>>> this cast, for LE modern devices.
>>> I think status existed in legacy for sure, and it's possible that
>>> some legacy devices backported mtu and max_virtqueue_pairs otherwise
>>> we would have these fields as __le not as __virtio, right?
>> yes, that's the reason why it is virtio_16 than just le16.
>>
>> I may find a better solution to detect whether it is LE, or BE without a
>> virtio_dev structure.
>> Check whether vdpa_device->get_device_features() has VIRTIO_F_VERSION_1. If
>> the device offers _F_VERSION_1, then it is a LE device,
>> or it is a BE device, then we use __virtio16_to_cpu(false, config.status).
>>
>> Does this look good?
> No since the question is can be a legacy driver with a transitional
> device.  I don't have a good idea yet. vhost has VHOST_SET_VRING_ENDIAN
> and maybe we need something like this for config as well?


Not sure, and even if we had this, the query could happen before 
VHOST_SET_VRING_ENDIAN.

Actually, the patch itself should be fine, since the issue existed even 
before the patch (the old code also assumed LE).

Thanks
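
To illustrate why the endianness assumption matters, here is a small userspace model of a two-way 16-bit decode. It is a sketch, not the kernel's __virtio16_to_cpu(); the function name and the raw-byte interface are made up for the example. Passing true decodes the field as little-endian (what the patch hard-codes), passing false decodes it as big-endian (what a legacy device on a BE platform would need).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical model: decode a 16-bit config field from its raw bytes.
 * little_endian == true  -> modern (VERSION_1) layout
 * little_endian == false -> big-endian layout, as a legacy device on a
 *                           BE platform would present it
 */
uint16_t demo_virtio16_to_cpu(bool little_endian, const uint8_t raw[2])
{
    assert(raw);
    if (little_endian)
        return (uint16_t)(raw[0] | (raw[1] << 8));
    return (uint16_t)((raw[0] << 8) | raw[1]);
}
```

For the bytes 0xdc 0x05, the little-endian decode gives 1500 while the big-endian decode gives 56325, so a wrong assumption would turn a sane MTU into a nonsensical one.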


>
>>>>>>>> @@ -911,7 +911,7 @@ static int vdpa_fill_stats_rec(struct vdpa_device *vdev, struct sk_buff *msg,
>>>>>>>>      	}
>>>>>>>>      	vdpa_get_config_unlocked(vdev, 0, &config, sizeof(config));
>>>>>>>> -	max_vqp = le16_to_cpu(config.max_virtqueue_pairs);
>>>>>>>> +	max_vqp = __virtio16_to_cpu(true, config.max_virtqueue_pairs);
>>>>>>>>      	if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MAX_VQP, max_vqp))
>>>>>>>>      		return -EMSGSIZE;
>>>>>>>> -- 
>>>>>>>> 2.31.1



* Re: [PATCH V3 4/6] vDPA: !FEATURES_OK should not block querying device config space
  2022-07-29 20:55                 ` Si-Wei Liu
@ 2022-08-01  4:44                   ` Jason Wang
  2022-08-01 22:53                     ` Si-Wei Liu
  0 siblings, 1 reply; 113+ messages in thread
From: Jason Wang @ 2022-08-01  4:44 UTC (permalink / raw)
  To: Si-Wei Liu, Zhu, Lingshan, Parav Pandit, mst, Eli Cohen
  Cc: netdev, xieyongji, gautam.dawar, virtualization


On 2022/7/30 04:55, Si-Wei Liu wrote:
>
>
> On 7/28/2022 7:04 PM, Zhu, Lingshan wrote:
>>
>>
>> On 7/29/2022 5:48 AM, Si-Wei Liu wrote:
>>>
>>>
>>> On 7/27/2022 7:43 PM, Zhu, Lingshan wrote:
>>>>
>>>>
>>>> On 7/28/2022 8:56 AM, Si-Wei Liu wrote:
>>>>>
>>>>>
>>>>> On 7/27/2022 4:47 AM, Zhu, Lingshan wrote:
>>>>>>
>>>>>>
>>>>>> On 7/27/2022 5:43 PM, Si-Wei Liu wrote:
>>>>>>> Sorry to chime in late in the game. For some reason I couldn't 
>>>>>>> get to most emails for this discussion (I only subscribed to the 
>>>>>>> virtualization list), while I was taking off amongst the past 
>>>>>>> few weeks.
>>>>>>>
>>>>>>> It looks to me this patch is incomplete. Noted down the way in 
>>>>>>> vdpa_dev_net_config_fill(), we have the following:
>>>>>>>          features = vdev->config->get_driver_features(vdev);
>>>>>>>          if (nla_put_u64_64bit(msg, 
>>>>>>> VDPA_ATTR_DEV_NEGOTIATED_FEATURES, features,
>>>>>>>                                VDPA_ATTR_PAD))
>>>>>>>                  return -EMSGSIZE;
>>>>>>>
>>>>>>> Making call to .get_driver_features() doesn't make sense when 
>>>>>>> feature negotiation isn't complete. Neither should present 
>>>>>>> negotiated_features to userspace before negotiation is done.
>>>>>>>
>>>>>>> Similarly, max_vqp through vdpa_dev_net_mq_config_fill() 
>>>>>>> probably should not show before negotiation is done - it depends 
>>>>>>> on driver features negotiated.
>>>>>> I have another patch in this series introduces device_features 
>>>>>> and will report device_features to the userspace even features 
>>>>>> negotiation not done. Because the spec says we should allow 
>>>>>> driver access the config space before FEATURES_OK.
>>>>> The config space can be accessed by guest before features_ok 
>>>>> doesn't necessarily mean the value is valid. You may want to 
>>>>> double check with Michael for what he quoted earlier:
>>>> that's why I proposed to fix these issues, e.g., if no _F_MAC, vDPA 
>>>> kernel should not return a mac to the userspace, there is not a 
>>>> default value for mac.
>>> Then please show us the code, as I can only comment based on your 
>>> latest (v4) patch and it was not there.. To be honest, I don't 
>>> understand the motivation and the use cases you have, is it for 
>>> debugging/monitoring or there's really a use case for live 
>>> migration? For the former, you can do a direct dump on all config 
>>> space fields regardless of endianess and feature negotiation without 
>>> having to worry about validity (meaningful to present to admin 
>>> user). To me these are conflict asks that is impossible to mix in 
>>> exact one command.
>> This bug just has been revealed two days, and you will see the patch 
>> soon.
>>
>> There are something to clarify:
>> 1) we need to read the device features, or how can you pick a proper 
>> LM destination


So it's probably not very efficient to use this; the management layer 
should know about compatibility before doing the migration, rather than 
relying on try-and-fail.

And it's the task of the management layer to gather the nodes whose 
devices can be live migrated to each other into something like a 
"cluster", which is what we already do for cpuflags.

1) during node bootstrap, the capabilities of each node and its devices 
are reported to the management layer
2) the management layer decides the cluster and makes sure migration can 
only be done among the nodes inside the cluster
3) before migration, the vDPA device needs to be provisioned on the 
destination
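
A rough sketch of the kind of check a management layer might run when forming such a cluster. The compatibility rule used here (the destination must offer every feature bit the source has negotiated) is an assumption made for illustration, and all names (demo_*) are hypothetical; real management stacks likely consider much more than feature bits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Build a feature mask from a virtio feature bit number. */
#define DEMO_F(bit) (1ULL << (bit))

/* Assumed rule: every negotiated source feature is offered by the destination. */
bool demo_features_compatible(uint64_t src_negotiated, uint64_t dst_offered)
{
    return (src_negotiated & ~dst_offered) == 0;
}

/*
 * Mark which candidate nodes could host the device.
 * offered[i] holds the device features reported by node i at bootstrap.
 * Returns how many nodes qualify.
 */
size_t demo_mark_destinations(uint64_t src_negotiated,
                              const uint64_t *offered, size_t n, bool *ok)
{
    size_t count = 0;

    assert(offered && ok);
    for (size_t i = 0; i < n; i++) {
        ok[i] = demo_features_compatible(src_negotiated, offered[i]);
        count += ok[i];
    }
    return count;
}
```

Because the per-node feature sets are collected once at bootstrap, the check runs entirely in the management layer and no device has to be queried at migration time.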


>> 2) vdpa dev config show can show both device features and driver 
>> features, there just need a patch for iproute2
>> 3) To process information like MQ, we don't just dump the config 
>> space, MST has explained before
> So, it's for live migration... Then why not export those config 
> parameters specified for vdpa creation (as well as device feature 
> bits) to the output of "vdpa dev show" command? That's where device 
> side config lives and is static across vdpa's life cycle. "vdpa dev 
> config show" is mostly for dynamic driver side config, and the 
> validity is subject to feature negotiation. I suppose this should suit 
> your need of LM, e.g.


I think so.


>
> $ vdpa dev add name vdpa1 mgmtdev pci/0000:41:04.2 max_vqp 7 mtu 2000
> $ vdpa dev show vdpa1
> vdpa1: type network mgmtdev pci/0000:41:04.2 vendor_id 5555 max_vqs 15 
> max_vq_size 256
>   max_vqp 7 mtu 2000
>   dev_features CSUM GUEST_CSUM MTU HOST_TSO4 HOST_TSO6 STATUS CTRL_VQ 
> MQ CTRL_MAC_ADDR VERSION_1 RING_PACKED


Note that the mgmt layer should know that the destination has those 
capabilities/features before provisioning.


>
> For it to work, you'd want to pass "struct vdpa_dev_set_config" to 
> _vdpa_register_device() during registration, and get it saved there in 
> "struct vdpa_device". Then in vdpa_dev_fill() show each field 
> conditionally subject to "struct vdpa_dev_set_config.mask".
>
> Thanks,
> -Siwei


Thanks


>>
>> Thanks
>> Zhu Lingshan
>>>
>>>>>> Nope:
>>>>>>
>>>>>> 2.5.1  Driver Requirements: Device Configuration Space
>>>>>>
>>>>>> ...
>>>>>>
>>>>>> For optional configuration space fields, the driver MUST check 
>>>>>> that the corresponding feature is offered
>>>>>> before accessing that part of the configuration space.
>>>>>
>>>>> and how many driver bugs taking wrong assumption of the validity 
>>>>> of config space field without features_ok. I am not sure what use 
>>>>> case you want to expose config register values for before 
>>>>> features_ok, if it's mostly for live migration I guess it's 
>>>>> probably heading a wrong direction.
>>>>>
>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Last but not the least, this "vdpa dev config" command was not 
>>>>>>> designed to display the real config space register values in the 
>>>>>>> first place. Quoting the vdpa-dev(8) man page:
>>>>>>>
>>>>>>>> vdpa dev config show - Show configuration of specific device or 
>>>>>>>> all devices.
>>>>>>>> DEV - specifies the vdpa device to show its configuration. If 
>>>>>>>> this argument is omitted all devices configuration is listed.
>>>>>>> It doesn't say anything about configuration space or register 
>>>>>>> values in config space. As long as it can convey the config 
>>>>>>> attribute when instantiating vDPA device instance, and more 
>>>>>>> importantly, the config can be easily imported from or exported 
>>>>>>> to userspace tools when trying to reconstruct vdpa instance 
>>>>>>> intact on destination host for live migration, IMHO in my 
>>>>>>> personal interpretation it doesn't matter what the config space 
>>>>>>> may present. It may be worth while adding a new debug command to 
>>>>>>> expose the real register value, but that's another story.
>>>>>> I am not sure getting your points. vDPA now reports device 
>>>>>> feature bits(device_features) and negotiated feature 
>>>>>> bits(driver_features), and yes, the drivers features can be a 
>>>>>> subset of the device features; and the vDPA device features can 
>>>>>> be a subset of the management device features.
>>>>> What I said is after unblocking the conditional check, you'd have 
>>>>> to handle the case for each of the vdpa attribute when feature 
>>>>> negotiation is not yet done: basically the register values you got 
>>>>> from config space via the vdpa_get_config_unlocked() call is not 
>>>>> considered to be valid before features_ok (per-spec). Although in 
>>>>> some case you may get sane value, such behavior is generally 
>>>>> undefined. If you desire to show just the device_features alone 
>>>>> without any config space field, which the device had advertised 
>>>>> *before feature negotiation is complete*, that'll be fine. But 
>>>>> looks to me this is not how patch has been implemented. Probably 
>>>>> need some more work?
>>>> They are driver_features(negotiated) and the device_features(which 
>>>> comes with the device), and the config space fields that depend on 
>>>> them. In this series, we report both to the userspace.
>>> I fail to understand what you want to present from your description. 
>>> May be worth showing some example outputs that at least include the 
>>> following cases: 1) when device offers features but not yet 
>>> acknowledge by guest 2) when guest acknowledged features and device 
>>> is yet to accept 3) after guest feature negotiation is completed 
>>> (agreed upon between guest and device).
>> Only two feature sets: 1) what the device has. (2) what is negotiated
>>>
>>> Thanks,
>>> -Siwei
>>>>>
>>>>> Regards,
>>>>> -Siwei
>>>>>
>>>>>>>
>>>>>>> Having said, please consider to drop the Fixes tag, as appears 
>>>>>>> to me you're proposing a new feature rather than fixing a real 
>>>>>>> issue.
>>>>>> it's a new feature to report the device feature bits than only 
>>>>>> negotiated features, however this patch is a must, or it will 
>>>>>> block the device feature bits reporting. but I agree, the fix tag 
>>>>>> is not a must.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> -Siwei
>>>>>>>
>>>>>>> On 7/1/2022 3:12 PM, Parav Pandit via Virtualization wrote:
>>>>>>>>> From: Zhu Lingshan<lingshan.zhu@intel.com>
>>>>>>>>> Sent: Friday, July 1, 2022 9:28 AM
>>>>>>>>>
>>>>>>>>> Users may want to query the config space of a vDPA device, to 
>>>>>>>>> choose a
>>>>>>>>> appropriate one for a certain guest. This means the users need 
>>>>>>>>> to read the
>>>>>>>>> config space before FEATURES_OK, and the existence of config 
>>>>>>>>> space
>>>>>>>>> contents does not depend on FEATURES_OK.
>>>>>>>>>
>>>>>>>>> The spec says:
>>>>>>>>> The device MUST allow reading of any device-specific 
>>>>>>>>> configuration field
>>>>>>>>> before FEATURES_OK is set by the driver. This includes fields 
>>>>>>>>> which are
>>>>>>>>> conditional on feature bits, as long as those feature bits are 
>>>>>>>>> offered by the
>>>>>>>>> device.
>>>>>>>>>
>>>>>>>>> Fixes: 30ef7a8ac8a07 (vdpa: Read device configuration only if 
>>>>>>>>> FEATURES_OK)
>>>>>>>> Fix is fine, but fixes tag needs correction described below.
>>>>>>>>
>>>>>>>> Above commit id is 13 letters should be 12.
>>>>>>>> And
>>>>>>>> It should be in format
>>>>>>>> Fixes: 30ef7a8ac8a0 ("vdpa: Read device configuration only if 
>>>>>>>> FEATURES_OK")
>>>>>>>>
>>>>>>>> Please use checkpatch.pl script before posting the patches to 
>>>>>>>> catch these errors.
>>>>>>>> There is a bot that looks at the fixes tag and identifies the 
>>>>>>>> right kernel version to apply this fix.
>>>>>>>>
>>>>>>>>> Signed-off-by: Zhu Lingshan<lingshan.zhu@intel.com>
>>>>>>>>> ---
>>>>>>>>>   drivers/vdpa/vdpa.c | 8 --------
>>>>>>>>>   1 file changed, 8 deletions(-)
>>>>>>>>>
>>>>>>>>> diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c index
>>>>>>>>> 9b0e39b2f022..d76b22b2f7ae 100644
>>>>>>>>> --- a/drivers/vdpa/vdpa.c
>>>>>>>>> +++ b/drivers/vdpa/vdpa.c
>>>>>>>>> @@ -851,17 +851,9 @@ vdpa_dev_config_fill(struct vdpa_device 
>>>>>>>>> *vdev,
>>>>>>>>> struct sk_buff *msg, u32 portid,  {
>>>>>>>>>       u32 device_id;
>>>>>>>>>       void *hdr;
>>>>>>>>> -    u8 status;
>>>>>>>>>       int err;
>>>>>>>>>
>>>>>>>>>       down_read(&vdev->cf_lock);
>>>>>>>>> -    status = vdev->config->get_status(vdev);
>>>>>>>>> -    if (!(status & VIRTIO_CONFIG_S_FEATURES_OK)) {
>>>>>>>>> -        NL_SET_ERR_MSG_MOD(extack, "Features negotiation not
>>>>>>>>> completed");
>>>>>>>>> -        err = -EAGAIN;
>>>>>>>>> -        goto out;
>>>>>>>>> -    }
>>>>>>>>> -
>>>>>>>>>       hdr = genlmsg_put(msg, portid, seq, &vdpa_nl_family, flags,
>>>>>>>>>                 VDPA_CMD_DEV_CONFIG_GET);
>>>>>>>>>       if (!hdr) {
>>>>>>>>> -- 
>>>>>>>>> 2.31.1



* Re: [PATCH V3 5/6] vDPA: answer num of queue pairs = 1 to userspace when VIRTIO_NET_F_MQ == 0
  2022-07-28  6:41                                             ` Michael S. Tsirkin
@ 2022-08-01  4:50                                               ` Jason Wang
  0 siblings, 0 replies; 113+ messages in thread
From: Jason Wang @ 2022-08-01  4:50 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Zhu, Lingshan, Parav Pandit, virtualization, netdev, xieyongji,
	gautam.dawar

On Thu, Jul 28, 2022 at 2:41 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Thu, Jul 28, 2022 at 01:53:51PM +0800, Jason Wang wrote:
> > On Thu, Jul 28, 2022 at 11:47 AM Zhu, Lingshan <lingshan.zhu@intel.com> wrote:
> > >
> > >
> > >
> > > On 7/28/2022 9:21 AM, Jason Wang wrote:
> > > > On Wed, Jul 27, 2022 at 11:45 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > >> On Wed, Jul 27, 2022 at 05:50:59PM +0800, Jason Wang wrote:
> > > >>> On Wed, Jul 27, 2022 at 5:03 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > >>>> On Wed, Jul 27, 2022 at 02:54:13PM +0800, Jason Wang wrote:
> > > >>>>> On Wed, Jul 27, 2022 at 2:01 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > >>>>>> On Wed, Jul 27, 2022 at 03:47:35AM +0000, Parav Pandit wrote:
> > > >>>>>>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
> > > >>>>>>>> Sent: Tuesday, July 26, 2022 10:53 PM
> > > >>>>>>>>
> > > >>>>>>>> On 7/27/2022 10:17 AM, Parav Pandit wrote:
> > > >>>>>>>>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
> > > >>>>>>>>>> Sent: Tuesday, July 26, 2022 10:15 PM
> > > >>>>>>>>>>
> > > >>>>>>>>>> On 7/26/2022 11:56 PM, Parav Pandit wrote:
> > > >>>>>>>>>>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
> > > >>>>>>>>>>>> Sent: Tuesday, July 12, 2022 11:46 PM
> > > >>>>>>>>>>>>> When the user space which invokes netlink commands, detects that
> > > >>>>>>>>>> _MQ
> > > >>>>>>>>>>>> is not supported, hence it takes max_queue_pair = 1 by itself.
> > > >>>>>>>>>>>> I think the kernel module have all necessary information and it is
> > > >>>>>>>>>>>> the only one which have precise information of a device, so it
> > > >>>>>>>>>>>> should answer precisely than let the user space guess. The kernel
> > > >>>>>>>>>>>> module should be reliable than stay silent, leave the question to
> > > >>>>>>>>>>>> the user space
> > > >>>>>>>>>> tool.
> > > >>>>>>>>>>> Kernel is reliable. It doesn’t expose a config space field if the
> > > >>>>>>>>>>> field doesn’t
> > > >>>>>>>>>> exist regardless of field should have default or no default.
> > > >>>>>>>>>> so when you know it is one queue pair, you should answer one, not try
> > > >>>>>>>>>> to guess.
> > > >>>>>>>>>>> User space should not guess either. User space gets to see if _MQ
> > > >>>>>>>>>> present/not present. If _MQ present than get reliable data from kernel.
> > > >>>>>>>>>>> If _MQ not present, it means this device has one VQ pair.
> > > >>>>>>>>>> it is still a guess, right? And all user space tools implemented this
> > > >>>>>>>>>> feature need to guess
> > > >>>>>>>>> No. it is not a guess.
> > > >>>>>>>>> It is explicitly checking the _MQ feature and deriving the value.
> > > >>>>>>>>> The code you proposed will be present in the user space.
> > > >>>>>>>>> It will be uniform for _MQ and 10 other features that are present now and
> > > >>>>>>>> in the future.
> > > >>>>>>>> MQ and other features like RSS are different. If there is no _RSS_XX, there
> > > >>>>>>>> are no attributes like max_rss_key_size, and there is not a default value.
> > > >>>>>>>> But for MQ, we know it has to be 1 without _MQ.
> > > >>>>>>> "we" = user space.
> > > >>>>>>> To keep the consistency among all the config space fields.
> > > >>>>>> Actually I looked and the code some more and I'm puzzled:
> > > >>>>>>
> > > >>>>>>
> > > >>>>>>          struct virtio_net_config config = {};
> > > >>>>>>          u64 features;
> > > >>>>>>          u16 val_u16;
> > > >>>>>>
> > > >>>>>>          vdpa_get_config_unlocked(vdev, 0, &config, sizeof(config));
> > > >>>>>>
> > > >>>>>>          if (nla_put(msg, VDPA_ATTR_DEV_NET_CFG_MACADDR, sizeof(config.mac),
> > > >>>>>>                      config.mac))
> > > >>>>>>                  return -EMSGSIZE;
> > > >>>>>>
> > > >>>>>>
> > > >>>>>> Mac returned even without VIRTIO_NET_F_MAC
> > > >>>>>>
> > > >>>>>>
> > > >>>>>>          val_u16 = le16_to_cpu(config.status);
> > > >>>>>>          if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_STATUS, val_u16))
> > > >>>>>>                  return -EMSGSIZE;
> > > >>>>>>
> > > >>>>>>
> > > >>>>>> status returned even without VIRTIO_NET_F_STATUS
> > > >>>>>>
> > > >>>>>>          val_u16 = le16_to_cpu(config.mtu);
> > > >>>>>>          if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MTU, val_u16))
> > > >>>>>>                  return -EMSGSIZE;
> > > >>>>>>
> > > >>>>>>
> > > >>>>>> MTU returned even without VIRTIO_NET_F_MTU
> > > >>>>>>
> > > >>>>>>
> > > >>>>>> What's going on here?
> > > >>>>> Probably too late to fix, but this should be fine as long as all
> > > >>>>> parents support STATUS/MTU/MAC.
> > > >>>> Why is this too late to fix.
> > > >>> If we make this conditional on the features. This may break the
> > > >>> userspace that always expects VDPA_ATTR_DEV_NET_CFG_MTU?
> > > >>>
> > > >>> Thanks
> > > >> Well only on devices without MTU. I'm saying said userspace
> > > >> was reading trash on such devices anyway.
> > > > It depends on the parent actually. For example, mlx5 query the lower
> > > > mtu unconditionally:
> > > >
> > > >          err = query_mtu(mdev, &mtu);
> > > >          if (err)
> > > >                  goto err_alloc;
> > > >
> > > >          ndev->config.mtu = cpu_to_mlx5vdpa16(mvdev, mtu);
> > > >
> > > > Supporting MTU features seems to be a must for real hardware.
> > > > Otherwise the driver may not work correctly.
> > > >
> > > >> We don't generally maintain bug for bug compatiblity on a whim,
> > > >> only if userspace is actually known to break if we fix a bug.
> > > >   So I think it should be fine to make this conditional then we should
> > > > have a consistent handling of other fields like MQ.
> > > For some fields that have a default value, like MQ =1, we can return the
> > > default value.
> > > For other fields without a default value, like MAC, we return nothing.
> > >
> > > Does this sounds good? So, for MTU, if without _F_MTU, I think we can
> > > return 1500 by default.
> >
> > Or we can just read MTU from the device.
> >
> > But It looks to me Michael wants it conditional.
> >
> > Thanks
>
> I'm fine either way but let's keep it consistent. And I think
> Parav wants it conditional.

Parav, what's your opinion here?

Michael spotted some inconsistencies, so I think we should either

1) make all conditional, so we should change both MTU and MAC

or

2) make them unconditional, so we should only change MQ

Thanks
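
For reference, option 1) could look roughly like the following userspace sketch. The report struct and function are hypothetical and not the netlink fill code; the feature bit numbers are the virtio-net ones. Reporting max_vqp as 1 when _MQ is absent follows the title of this patch and is itself one of the points under discussion.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Feature bit numbers from the virtio-net specification. */
#define DEMO_NET_F_MTU (1ULL << 3)
#define DEMO_NET_F_MAC (1ULL << 5)
#define DEMO_NET_F_MQ  (1ULL << 22)

/* Hypothetical report structure; not a kernel or netlink type. */
struct demo_net_report {
    bool     has_mac;  /* would the MAC attribute be emitted? */
    bool     has_mtu;  /* would the MTU attribute be emitted? */
    uint16_t mtu;      /* meaningful only when has_mtu is set */
    uint16_t max_vqp;  /* always reported; 1 when _MQ is not offered */
};

/* Gate every field on its feature bit instead of reporting it blindly. */
void demo_fill_report(uint64_t features, uint16_t cfg_mtu,
                      uint16_t cfg_max_vqp, struct demo_net_report *out)
{
    assert(out);
    out->has_mac = (features & DEMO_NET_F_MAC) != 0;
    out->has_mtu = (features & DEMO_NET_F_MTU) != 0;
    out->mtu = out->has_mtu ? cfg_mtu : 0;
    out->max_vqp = (features & DEMO_NET_F_MQ) ? cfg_max_vqp : 1;
}
```

Option 2) would instead drop the has_* gating and keep only the max_vqp line.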

>
> > >
> > > Thanks,
> > > Zhu Lingshan
> > > >
> > > > Thanks
> > > >
> > > >>
> > > >>>>> I wonder if we can add a check in the core and fail the device
> > > >>>>> registration in this case.
> > > >>>>>
> > > >>>>> Thanks
> > > >>>>>
> > > >>>>>>
> > > >>>>>> --
> > > >>>>>> MST
> > > >>>>>>
> > >
>



* Re: [PATCH V3 6/6] vDPA: fix 'cast to restricted le16' warnings in vdpa.c
  2022-08-01  4:33                 ` Jason Wang
@ 2022-08-01  6:25                   ` Michael S. Tsirkin
  0 siblings, 0 replies; 113+ messages in thread
From: Michael S. Tsirkin @ 2022-08-01  6:25 UTC (permalink / raw)
  To: Jason Wang
  Cc: Zhu, Lingshan, virtualization, netdev, parav, xieyongji, gautam.dawar

On Mon, Aug 01, 2022 at 12:33:44PM +0800, Jason Wang wrote:
> 
> On 2022/7/29 17:39, Michael S. Tsirkin wrote:
> > On Fri, Jul 29, 2022 at 05:35:09PM +0800, Zhu, Lingshan wrote:
> > > 
> > > On 7/29/2022 5:23 PM, Michael S. Tsirkin wrote:
> > > > On Fri, Jul 29, 2022 at 05:20:17PM +0800, Zhu, Lingshan wrote:
> > > > > On 7/29/2022 5:17 PM, Michael S. Tsirkin wrote:
> > > > > > On Fri, Jul 29, 2022 at 05:07:11PM +0800, Zhu, Lingshan wrote:
> > > > > > > On 7/29/2022 4:53 PM, Michael S. Tsirkin wrote:
> > > > > > > > On Fri, Jul 01, 2022 at 09:28:26PM +0800, Zhu Lingshan wrote:
> > > > > > > > > > This commit fixes sparse warnings: cast to restricted __le16
> > > > > > > > > in function vdpa_dev_net_config_fill() and
> > > > > > > > > vdpa_fill_stats_rec()
> > > > > > > > > 
> > > > > > > > > Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
> > > > > > > > > ---
> > > > > > > > >      drivers/vdpa/vdpa.c | 6 +++---
> > > > > > > > >      1 file changed, 3 insertions(+), 3 deletions(-)
> > > > > > > > > 
> > > > > > > > > diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c
> > > > > > > > > index 846dd37f3549..ed49fe46a79e 100644
> > > > > > > > > --- a/drivers/vdpa/vdpa.c
> > > > > > > > > +++ b/drivers/vdpa/vdpa.c
> > > > > > > > > @@ -825,11 +825,11 @@ static int vdpa_dev_net_config_fill(struct vdpa_device *vdev, struct sk_buff *ms
> > > > > > > > >      		    config.mac))
> > > > > > > > >      		return -EMSGSIZE;
> > > > > > > > > -	val_u16 = le16_to_cpu(config.status);
> > > > > > > > > +	val_u16 = __virtio16_to_cpu(true, config.status);
> > > > > > > > >      	if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_STATUS, val_u16))
> > > > > > > > >      		return -EMSGSIZE;
> > > > > > > > > -	val_u16 = le16_to_cpu(config.mtu);
> > > > > > > > > +	val_u16 = __virtio16_to_cpu(true, config.mtu);
> > > > > > > > >      	if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MTU, val_u16))
> > > > > > > > >      		return -EMSGSIZE;
> > > > > > > > Wrong on BE platforms with legacy interface, isn't it?
> > > > > > > > We generally don't handle legacy properly in VDPA so it's
> > > > > > > > not a huge deal, but maybe add a comment at least?
> > > > > > > Sure, I can add a comment here: this is for modern devices only.
> > > > > > > 
> > > > > > > Thanks,
> > > > > > > Zhu Lingshan
> > > > > > Hmm. what "this" is for modern devices only here?
> > > > > this cast, for LE modern devices.
> > > > I think status existed in legacy for sure, and it's possible that
> > > > some legacy devices backported mtu and max_virtqueue_pairs otherwise
> > > > we would have these fields as __le not as __virtio, right?
> > > yes, that's the reason why it is virtio_16 than just le16.
> > > 
> > > I may find a better solution to detect whether it is LE, or BE without a
> > > virtio_dev structure.
> > > > Check whether vdpa_device->get_device_features() has VIRTIO_F_VERSION_1. If
> > > the device offers _F_VERSION_1, then it is a LE device,
> > > or it is a BE device, then we use __virtio16_to_cpu(false, config.status).
> > > 
> > > Does this look good?
> > > No, since the question is whether it can be a legacy driver with a
> > > transitional device.  I don't have a good idea yet. vhost has
> > > VHOST_SET_VRING_ENDIAN and maybe we need something like this for config as well?
> 
> 
> Not sure, and even if we had this, the query could happen before
> VHOST_SET_VRING_ENDIAN.
> 
> Actually, the patch should be fine itself, since the issue exist even before
> the patch (which assumes a le).
> 
> Thanks


I agree, let's just add a TODO comment.
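
For reference, a minimal user-space sketch of the two byte-order readings
discussed above. This is not code from the patch: sketch_virtio16_to_cpu()
is a hypothetical stand-in that only mirrors the (little_endian, value)
shape of the kernel's __virtio16_to_cpu(), and it takes the raw config
bytes instead of a typed __virtio16 field.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the kernel helper, for illustration only.
 * little_endian == true:  read the field as LE (modern, VERSION_1 devices).
 * little_endian == false: read the field as BE (what a legacy device
 *                         paired with a big-endian guest would expose).
 */
static uint16_t sketch_virtio16_to_cpu(bool little_endian,
				       const uint8_t raw[2])
{
	if (little_endian)
		return (uint16_t)(raw[0] | (raw[1] << 8));

	return (uint16_t)((raw[0] << 8) | raw[1]);
}
```

With the config bytes dc 05, the LE reading yields an MTU of 1500 and the
BE reading yields 0xdc05, so hardcoding "true" is only correct for LE
(modern) devices, which is what the TODO comment would record.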

> 
> > 
> > > > > > > > > @@ -911,7 +911,7 @@ static int vdpa_fill_stats_rec(struct vdpa_device *vdev, struct sk_buff *msg,
> > > > > > > > >      	}
> > > > > > > > >      	vdpa_get_config_unlocked(vdev, 0, &config, sizeof(config));
> > > > > > > > > -	max_vqp = le16_to_cpu(config.max_virtqueue_pairs);
> > > > > > > > > +	max_vqp = __virtio16_to_cpu(true, config.max_virtqueue_pairs);
> > > > > > > > >      	if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MAX_VQP, max_vqp))
> > > > > > > > >      		return -EMSGSIZE;
> > > > > > > > > -- 
> > > > > > > > > 2.31.1


^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [PATCH V3 4/6] vDPA: !FEATURES_OK should not block querying device config space
  2022-08-01  4:44                   ` Jason Wang
@ 2022-08-01 22:53                     ` Si-Wei Liu
  2022-08-01 22:58                       ` Si-Wei Liu
  0 siblings, 1 reply; 113+ messages in thread
From: Si-Wei Liu @ 2022-08-01 22:53 UTC (permalink / raw)
  To: Jason Wang, Zhu, Lingshan, Parav Pandit, mst, Eli Cohen
  Cc: netdev, xieyongji, gautam.dawar, virtualization



On 7/31/2022 9:44 PM, Jason Wang wrote:
>
> On 2022/7/30 04:55, Si-Wei Liu wrote:
>>
>>
>> On 7/28/2022 7:04 PM, Zhu, Lingshan wrote:
>>>
>>>
>>> On 7/29/2022 5:48 AM, Si-Wei Liu wrote:
>>>>
>>>>
>>>> On 7/27/2022 7:43 PM, Zhu, Lingshan wrote:
>>>>>
>>>>>
>>>>> On 7/28/2022 8:56 AM, Si-Wei Liu wrote:
>>>>>>
>>>>>>
>>>>>> On 7/27/2022 4:47 AM, Zhu, Lingshan wrote:
>>>>>>>
>>>>>>>
>>>>>>> On 7/27/2022 5:43 PM, Si-Wei Liu wrote:
>>>>>>>> Sorry to chime in late in the game. For some reason I couldn't 
>>>>>>>> get to most emails for this discussion (I only subscribed to 
>>>>>>>> the virtualization list), while I was taking off amongst the 
>>>>>>>> past few weeks.
>>>>>>>>
>>>>>>>> It looks to me this patch is incomplete. Noted down the way in 
>>>>>>>> vdpa_dev_net_config_fill(), we have the following:
>>>>>>>>          features = vdev->config->get_driver_features(vdev);
>>>>>>>>          if (nla_put_u64_64bit(msg, 
>>>>>>>> VDPA_ATTR_DEV_NEGOTIATED_FEATURES, features,
>>>>>>>>                                VDPA_ATTR_PAD))
>>>>>>>>                  return -EMSGSIZE;
>>>>>>>>
>>>>>>>> Making call to .get_driver_features() doesn't make sense when 
>>>>>>>> feature negotiation isn't complete. Neither should present 
>>>>>>>> negotiated_features to userspace before negotiation is done.
>>>>>>>>
>>>>>>>> Similarly, max_vqp through vdpa_dev_net_mq_config_fill() 
>>>>>>>> probably should not show before negotiation is done - it 
>>>>>>>> depends on driver features negotiated.
>>>>>>> I have another patch in this series introduces device_features 
>>>>>>> and will report device_features to the userspace even features 
>>>>>>> negotiation not done. Because the spec says we should allow 
>>>>>>> driver access the config space before FEATURES_OK.
>>>>>> The config space can be accessed by guest before features_ok 
>>>>>> doesn't necessarily mean the value is valid. You may want to 
>>>>>> double check with Michael for what he quoted earlier:
>>>>> that's why I proposed to fix these issues, e.g., if no _F_MAC, 
>>>>> vDPA kernel should not return a mac to the userspace, there is not 
>>>>> a default value for mac.
>>>> Then please show us the code, as I can only comment based on your 
>>>> latest (v4) patch and it was not there.. To be honest, I don't 
>>>> understand the motivation and the use cases you have, is it for 
>>>> debugging/monitoring or there's really a use case for live 
>>>> migration? For the former, you can do a direct dump on all config 
>>>> space fields regardless of endianess and feature negotiation 
>>>> without having to worry about validity (meaningful to present to 
>>>> admin user). To me these are conflict asks that is impossible to 
>>>> mix in exact one command.
>>> This bug just has been revealed two days, and you will see the patch 
>>> soon.
>>>
>>> There are something to clarify:
>>> 1) we need to read the device features, or how can you pick a proper 
>>> LM destination
>
>
> So it's probably not very efficient to use this, the manager layer 
> should have the knowledge about the compatibility before doing 
> migration other than try-and-fail.
>
> And it's the task of the management to gather the nodes whose devices 
> could be live migrated to each other as something like "cluster" which 
> we've already used in the case of cpuflags.
>
> 1) during node bootstrap, the capability of each node and devices was 
> reported to management layer
> 2) management layer decide the cluster and make sure the migration can 
> only done among the nodes insides the cluster
> 3) before migration, the vDPA needs to be provisioned on the destination
>
>
>>> 2) vdpa dev config show can show both device features and driver 
>>> features, there just need a patch for iproute2
>>> 3) To process information like MQ, we don't just dump the config 
>>> space, MST has explained before
>> So, it's for live migration... Then why not export those config 
>> parameters specified for vdpa creation (as well as device feature 
>> bits) to the output of "vdpa dev show" command? That's where device 
>> side config lives and is static across vdpa's life cycle. "vdpa dev 
>> config show" is mostly for dynamic driver side config, and the 
>> validity is subject to feature negotiation. I suppose this should 
>> suit your need of LM, e.g.
>
>
> I think so.
>
>
>>
>> $ vdpa dev add name vdpa1 mgmtdev pci/0000:41:04.2 max_vqp 7 mtu 2000
>> $ vdpa dev show vdpa1
>> vdpa1: type network mgmtdev pci/0000:41:04.2 vendor_id 5555 max_vqs 
>> 15 max_vq_size 256
>>   max_vqp 7 mtu 2000
>>   dev_features CSUM GUEST_CSUM MTU HOST_TSO4 HOST_TSO6 STATUS CTRL_VQ 
>> MQ CTRL_MAC_ADDR VERSION_1 RING_PACKED
>
>
> Note that the mgmt should know this destination have those 
> capability/features before the provisioning.
Yes, mgmt software would have to check the above from the source.

>
>
>>
>> For it to work, you'd want to pass "struct vdpa_dev_set_config" to 
>> _vdpa_register_device() during registration, and get it saved there 
>> in "struct vdpa_device". Then in vdpa_dev_fill() show each field 
>> conditionally subject to "struct vdpa_dev_set_config.mask".
>>
>> Thanks,
>> -Siwei
>
>
> Thanks
>
>
>>>
>>> Thanks
>>> Zhu Lingshan
>>>>
>>>>>>> Nope:
>>>>>>>
>>>>>>> 2.5.1  Driver Requirements: Device Configuration Space
>>>>>>>
>>>>>>> ...
>>>>>>>
>>>>>>> For optional configuration space fields, the driver MUST check 
>>>>>>> that the corresponding feature is offered
>>>>>>> before accessing that part of the configuration space.
>>>>>>
>>>>>> and how many driver bugs taking wrong assumption of the validity 
>>>>>> of config space field without features_ok. I am not sure what use 
>>>>>> case you want to expose config register values for before 
>>>>>> features_ok, if it's mostly for live migration I guess it's 
>>>>>> probably heading a wrong direction.
>>>>>>
>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Last but not the least, this "vdpa dev config" command was not 
>>>>>>>> designed to display the real config space register values in 
>>>>>>>> the first place. Quoting the vdpa-dev(8) man page:
>>>>>>>>
>>>>>>>>> vdpa dev config show - Show configuration of specific device 
>>>>>>>>> or all devices.
>>>>>>>>> DEV - specifies the vdpa device to show its configuration. If 
>>>>>>>>> this argument is omitted all devices configuration is listed.
>>>>>>>> It doesn't say anything about configuration space or register 
>>>>>>>> values in config space. As long as it can convey the config 
>>>>>>>> attribute when instantiating vDPA device instance, and more 
>>>>>>>> importantly, the config can be easily imported from or exported 
>>>>>>>> to userspace tools when trying to reconstruct vdpa instance 
>>>>>>>> intact on destination host for live migration, IMHO in my 
>>>>>>>> personal interpretation it doesn't matter what the config space 
>>>>>>>> may present. It may be worth while adding a new debug command 
>>>>>>>> to expose the real register value, but that's another story.
>>>>>>> I am not sure getting your points. vDPA now reports device 
>>>>>>> feature bits(device_features) and negotiated feature 
>>>>>>> bits(driver_features), and yes, the drivers features can be a 
>>>>>>> subset of the device features; and the vDPA device features can 
>>>>>>> be a subset of the management device features.
>>>>>> What I said is after unblocking the conditional check, you'd have 
>>>>>> to handle the case for each of the vdpa attribute when feature 
>>>>>> negotiation is not yet done: basically the register values you 
>>>>>> got from config space via the vdpa_get_config_unlocked() call is 
>>>>>> not considered to be valid before features_ok (per-spec). 
>>>>>> Although in some case you may get sane value, such behavior is 
>>>>>> generally undefined. If you desire to show just the 
>>>>>> device_features alone without any config space field, which the 
>>>>>> device had advertised *before feature negotiation is complete*, 
>>>>>> that'll be fine. But looks to me this is not how patch has been 
>>>>>> implemented. Probably need some more work?
>>>>> They are driver_features(negotiated) and the device_features(which 
>>>>> comes with the device), and the config space fields that depend on 
>>>>> them. In this series, we report both to the userspace.
>>>> I fail to understand what you want to present from your 
>>>> description. May be worth showing some example outputs that at 
>>>> least include the following cases: 1) when device offers features 
>>>> but not yet acknowledge by guest 2) when guest acknowledged 
>>>> features and device is yet to accept 3) after guest feature 
>>>> negotiation is completed (agreed upon between guest and device).
>>> Only two feature sets: 1) what the device has. (2) what is negotiated
>>>>
>>>> Thanks,
>>>> -Siwei
>>>>>>
>>>>>> Regards,
>>>>>> -Siwei
>>>>>>
>>>>>>>>
>>>>>>>> Having said, please consider to drop the Fixes tag, as appears 
>>>>>>>> to me you're proposing a new feature rather than fixing a real 
>>>>>>>> issue.
>>>>>>> it's a new feature to report the device feature bits than only 
>>>>>>> negotiated features, however this patch is a must, or it will 
>>>>>>> block the device feature bits reporting. but I agree, the fix 
>>>>>>> tag is not a must.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> -Siwei
>>>>>>>>
>>>>>>>> On 7/1/2022 3:12 PM, Parav Pandit via Virtualization wrote:
>>>>>>>>>> From: Zhu Lingshan<lingshan.zhu@intel.com>
>>>>>>>>>> Sent: Friday, July 1, 2022 9:28 AM
>>>>>>>>>>
>>>>>>>>>> Users may want to query the config space of a vDPA device, to 
>>>>>>>>>> choose a
>>>>>>>>>> appropriate one for a certain guest. This means the users 
>>>>>>>>>> need to read the
>>>>>>>>>> config space before FEATURES_OK, and the existence of config 
>>>>>>>>>> space
>>>>>>>>>> contents does not depend on FEATURES_OK.
>>>>>>>>>>
>>>>>>>>>> The spec says:
>>>>>>>>>> The device MUST allow reading of any device-specific 
>>>>>>>>>> configuration field
>>>>>>>>>> before FEATURES_OK is set by the driver. This includes fields 
>>>>>>>>>> which are
>>>>>>>>>> conditional on feature bits, as long as those feature bits 
>>>>>>>>>> are offered by the
>>>>>>>>>> device.
>>>>>>>>>>
>>>>>>>>>> Fixes: 30ef7a8ac8a07 (vdpa: Read device configuration only if 
>>>>>>>>>> FEATURES_OK)
>>>>>>>>> Fix is fine, but fixes tag needs correction described below.
>>>>>>>>>
>>>>>>>>> Above commit id is 13 letters should be 12.
>>>>>>>>> And
>>>>>>>>> It should be in format
>>>>>>>>> Fixes: 30ef7a8ac8a0 ("vdpa: Read device configuration only if 
>>>>>>>>> FEATURES_OK")
>>>>>>>>>
>>>>>>>>> Please use checkpatch.pl script before posting the patches to 
>>>>>>>>> catch these errors.
>>>>>>>>> There is a bot that looks at the fixes tag and identifies the 
>>>>>>>>> right kernel version to apply this fix.
>>>>>>>>>
>>>>>>>>>> Signed-off-by: Zhu Lingshan<lingshan.zhu@intel.com>
>>>>>>>>>> ---
>>>>>>>>>>   drivers/vdpa/vdpa.c | 8 --------
>>>>>>>>>>   1 file changed, 8 deletions(-)
>>>>>>>>>>
>>>>>>>>>> diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c index
>>>>>>>>>> 9b0e39b2f022..d76b22b2f7ae 100644
>>>>>>>>>> --- a/drivers/vdpa/vdpa.c
>>>>>>>>>> +++ b/drivers/vdpa/vdpa.c
>>>>>>>>>> @@ -851,17 +851,9 @@ vdpa_dev_config_fill(struct vdpa_device 
>>>>>>>>>> *vdev,
>>>>>>>>>> struct sk_buff *msg, u32 portid,  {
>>>>>>>>>>       u32 device_id;
>>>>>>>>>>       void *hdr;
>>>>>>>>>> -    u8 status;
>>>>>>>>>>       int err;
>>>>>>>>>>
>>>>>>>>>>       down_read(&vdev->cf_lock);
>>>>>>>>>> -    status = vdev->config->get_status(vdev);
>>>>>>>>>> -    if (!(status & VIRTIO_CONFIG_S_FEATURES_OK)) {
>>>>>>>>>> -        NL_SET_ERR_MSG_MOD(extack, "Features negotiation not
>>>>>>>>>> completed");
>>>>>>>>>> -        err = -EAGAIN;
>>>>>>>>>> -        goto out;
>>>>>>>>>> -    }
>>>>>>>>>> -
>>>>>>>>>>       hdr = genlmsg_put(msg, portid, seq, &vdpa_nl_family, 
>>>>>>>>>> flags,
>>>>>>>>>>                 VDPA_CMD_DEV_CONFIG_GET);
>>>>>>>>>>       if (!hdr) {
>>>>>>>>>> -- 
>>>>>>>>>> 2.31.1
>>>>>>>>> _______________________________________________
>>>>>>>>> Virtualization mailing list
>>>>>>>>> Virtualization@lists.linux-foundation.org
>>>>>>>>> https://lists.linuxfoundation.org/mailman/listinfo/virtualization
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>


^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [PATCH V3 4/6] vDPA: !FEATURES_OK should not block querying device config space
  2022-08-01 22:53                     ` Si-Wei Liu
@ 2022-08-01 22:58                       ` Si-Wei Liu
  2022-08-02  6:33                         ` Jason Wang
  0 siblings, 1 reply; 113+ messages in thread
From: Si-Wei Liu @ 2022-08-01 22:58 UTC (permalink / raw)
  To: Jason Wang, Zhu, Lingshan, Parav Pandit, mst, Eli Cohen
  Cc: netdev, xieyongji, gautam.dawar, virtualization



On 8/1/2022 3:53 PM, Si-Wei Liu wrote:
>
>
> On 7/31/2022 9:44 PM, Jason Wang wrote:
>>
>> On 2022/7/30 04:55, Si-Wei Liu wrote:
>>>
>>>
>>> On 7/28/2022 7:04 PM, Zhu, Lingshan wrote:
>>>>
>>>>
>>>> On 7/29/2022 5:48 AM, Si-Wei Liu wrote:
>>>>>
>>>>>
>>>>> On 7/27/2022 7:43 PM, Zhu, Lingshan wrote:
>>>>>>
>>>>>>
>>>>>> On 7/28/2022 8:56 AM, Si-Wei Liu wrote:
>>>>>>>
>>>>>>>
>>>>>>> On 7/27/2022 4:47 AM, Zhu, Lingshan wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> On 7/27/2022 5:43 PM, Si-Wei Liu wrote:
>>>>>>>>> Sorry to chime in late in the game. For some reason I couldn't 
>>>>>>>>> get to most emails for this discussion (I only subscribed to 
>>>>>>>>> the virtualization list), while I was taking off amongst the 
>>>>>>>>> past few weeks.
>>>>>>>>>
>>>>>>>>> It looks to me this patch is incomplete. Noted down the way in 
>>>>>>>>> vdpa_dev_net_config_fill(), we have the following:
>>>>>>>>>          features = vdev->config->get_driver_features(vdev);
>>>>>>>>>          if (nla_put_u64_64bit(msg, 
>>>>>>>>> VDPA_ATTR_DEV_NEGOTIATED_FEATURES, features,
>>>>>>>>>                                VDPA_ATTR_PAD))
>>>>>>>>>                  return -EMSGSIZE;
>>>>>>>>>
>>>>>>>>> Making call to .get_driver_features() doesn't make sense when 
>>>>>>>>> feature negotiation isn't complete. Neither should present 
>>>>>>>>> negotiated_features to userspace before negotiation is done.
>>>>>>>>>
>>>>>>>>> Similarly, max_vqp through vdpa_dev_net_mq_config_fill() 
>>>>>>>>> probably should not show before negotiation is done - it 
>>>>>>>>> depends on driver features negotiated.
>>>>>>>> I have another patch in this series introduces device_features 
>>>>>>>> and will report device_features to the userspace even features 
>>>>>>>> negotiation not done. Because the spec says we should allow 
>>>>>>>> driver access the config space before FEATURES_OK.
>>>>>>> The config space can be accessed by guest before features_ok 
>>>>>>> doesn't necessarily mean the value is valid. You may want to 
>>>>>>> double check with Michael for what he quoted earlier:
>>>>>> that's why I proposed to fix these issues, e.g., if no _F_MAC, 
>>>>>> vDPA kernel should not return a mac to the userspace, there is 
>>>>>> not a default value for mac.
>>>>> Then please show us the code, as I can only comment based on your 
>>>>> latest (v4) patch and it was not there.. To be honest, I don't 
>>>>> understand the motivation and the use cases you have, is it for 
>>>>> debugging/monitoring or there's really a use case for live 
>>>>> migration? For the former, you can do a direct dump on all config 
>>>>> space fields regardless of endianess and feature negotiation 
>>>>> without having to worry about validity (meaningful to present to 
>>>>> admin user). To me these are conflict asks that is impossible to 
>>>>> mix in exact one command.
>>>> This bug just has been revealed two days, and you will see the 
>>>> patch soon.
>>>>
>>>> There are something to clarify:
>>>> 1) we need to read the device features, or how can you pick a 
>>>> proper LM destination
>>
>>
>> So it's probably not very efficient to use this, the manager layer 
>> should have the knowledge about the compatibility before doing 
>> migration other than try-and-fail.
>>
>> And it's the task of the management to gather the nodes whose devices 
>> could be live migrated to each other as something like "cluster" 
>> which we've already used in the case of cpuflags.
>>
>> 1) during node bootstrap, the capability of each node and devices was 
>> reported to management layer
>> 2) management layer decide the cluster and make sure the migration 
>> can only done among the nodes insides the cluster
>> 3) before migration, the vDPA needs to be provisioned on the destination
>>
>>
>>>> 2) vdpa dev config show can show both device features and driver 
>>>> features, there just need a patch for iproute2
>>>> 3) To process information like MQ, we don't just dump the config 
>>>> space, MST has explained before
>>> So, it's for live migration... Then why not export those config 
>>> parameters specified for vdpa creation (as well as device feature 
>>> bits) to the output of "vdpa dev show" command? That's where device 
>>> side config lives and is static across vdpa's life cycle. "vdpa dev 
>>> config show" is mostly for dynamic driver side config, and the 
>>> validity is subject to feature negotiation. I suppose this should 
>>> suit your need of LM, e.g.
>>
>>
>> I think so.
>>
>>
>>>
>>> $ vdpa dev add name vdpa1 mgmtdev pci/0000:41:04.2 max_vqp 7 mtu 2000
>>> $ vdpa dev show vdpa1
>>> vdpa1: type network mgmtdev pci/0000:41:04.2 vendor_id 5555 max_vqs 
>>> 15 max_vq_size 256
>>>   max_vqp 7 mtu 2000
>>>   dev_features CSUM GUEST_CSUM MTU HOST_TSO4 HOST_TSO6 STATUS 
>>> CTRL_VQ MQ CTRL_MAC_ADDR VERSION_1 RING_PACKED
>>
>>
>> Note that the mgmt should know this destination have those 
>> capability/features before the provisioning.
> Yes, mgmt software should have to check the above from source.

On the destination, mgmt software can run the command below to check the
vdpa mgmtdev's capabilities/features:

$ vdpa mgmtdev show pci/0000:41:04.3
pci/0000:41:04.3:
   supported_classes net
   max_supported_vqs 257
   dev_features CSUM GUEST_CSUM MTU HOST_TSO4 HOST_TSO6 STATUS CTRL_VQ 
MQ CTRL_MAC_ADDR VERSION_1 RING_PACKED
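
As a rough sketch (not part of any patch here) of the comparison mgmt
software could make once it has both feature sets as 64-bit masks: the
destination is a candidate only if it offers every feature bit the source
device has. The sketch_* names are hypothetical; the bit positions are the
ones defined by the virtio spec.

```c
#include <stdbool.h>
#include <stdint.h>

/* Feature bit positions as defined by the virtio spec. */
#define SKETCH_VIRTIO_NET_F_MTU		3
#define SKETCH_VIRTIO_NET_F_MQ		22
#define SKETCH_VIRTIO_F_VERSION_1	32
#define SKETCH_VIRTIO_F_RING_PACKED	34

/* Feature bits present on the source but absent on the destination. */
static uint64_t sketch_missing_features(uint64_t src_features,
					uint64_t dst_features)
{
	return src_features & ~dst_features;
}

/* True when the destination offers at least everything the source has. */
static bool sketch_dest_covers_source(uint64_t src_features,
				      uint64_t dst_features)
{
	return sketch_missing_features(src_features, dst_features) == 0;
}
```

A destination mgmtdev offering extra bits (e.g. RING_PACKED) still passes;
one lacking a bit the source device has (e.g. MQ) does not.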
>
>>
>>
>>>
>>> For it to work, you'd want to pass "struct vdpa_dev_set_config" to 
>>> _vdpa_register_device() during registration, and get it saved there 
>>> in "struct vdpa_device". Then in vdpa_dev_fill() show each field 
>>> conditionally subject to "struct vdpa_dev_set_config.mask".
>>>
>>> Thanks,
>>> -Siwei
>>
>>
>> Thanks
>>
>>
>>>>
>>>> Thanks
>>>> Zhu Lingshan
>>>>>
>>>>>>>> Nope:
>>>>>>>>
>>>>>>>> 2.5.1  Driver Requirements: Device Configuration Space
>>>>>>>>
>>>>>>>> ...
>>>>>>>>
>>>>>>>> For optional configuration space fields, the driver MUST check 
>>>>>>>> that the corresponding feature is offered
>>>>>>>> before accessing that part of the configuration space.
>>>>>>>
>>>>>>> and how many driver bugs taking wrong assumption of the validity 
>>>>>>> of config space field without features_ok. I am not sure what 
>>>>>>> use case you want to expose config register values for before 
>>>>>>> features_ok, if it's mostly for live migration I guess it's 
>>>>>>> probably heading a wrong direction.
>>>>>>>
>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Last but not the least, this "vdpa dev config" command was not 
>>>>>>>>> designed to display the real config space register values in 
>>>>>>>>> the first place. Quoting the vdpa-dev(8) man page:
>>>>>>>>>
>>>>>>>>>> vdpa dev config show - Show configuration of specific device 
>>>>>>>>>> or all devices.
>>>>>>>>>> DEV - specifies the vdpa device to show its configuration. If 
>>>>>>>>>> this argument is omitted all devices configuration is listed.
>>>>>>>>> It doesn't say anything about configuration space or register 
>>>>>>>>> values in config space. As long as it can convey the config 
>>>>>>>>> attribute when instantiating vDPA device instance, and more 
>>>>>>>>> importantly, the config can be easily imported from or 
>>>>>>>>> exported to userspace tools when trying to reconstruct vdpa 
>>>>>>>>> instance intact on destination host for live migration, IMHO 
>>>>>>>>> in my personal interpretation it doesn't matter what the 
>>>>>>>>> config space may present. It may be worth while adding a new 
>>>>>>>>> debug command to expose the real register value, but that's 
>>>>>>>>> another story.
>>>>>>>> I am not sure getting your points. vDPA now reports device 
>>>>>>>> feature bits(device_features) and negotiated feature 
>>>>>>>> bits(driver_features), and yes, the drivers features can be a 
>>>>>>>> subset of the device features; and the vDPA device features can 
>>>>>>>> be a subset of the management device features.
>>>>>>> What I said is after unblocking the conditional check, you'd 
>>>>>>> have to handle the case for each of the vdpa attribute when 
>>>>>>> feature negotiation is not yet done: basically the register 
>>>>>>> values you got from config space via the 
>>>>>>> vdpa_get_config_unlocked() call is not considered to be valid 
>>>>>>> before features_ok (per-spec). Although in some case you may get 
>>>>>>> sane value, such behavior is generally undefined. If you desire 
>>>>>>> to show just the device_features alone without any config space 
>>>>>>> field, which the device had advertised *before feature 
>>>>>>> negotiation is complete*, that'll be fine. But looks to me this 
>>>>>>> is not how patch has been implemented. Probably need some more 
>>>>>>> work?
>>>>>> They are driver_features(negotiated) and the 
>>>>>> device_features(which comes with the device), and the config 
>>>>>> space fields that depend on them. In this series, we report both 
>>>>>> to the userspace.
>>>>> I fail to understand what you want to present from your 
>>>>> description. May be worth showing some example outputs that at 
>>>>> least include the following cases: 1) when device offers features 
>>>>> but not yet acknowledge by guest 2) when guest acknowledged 
>>>>> features and device is yet to accept 3) after guest feature 
>>>>> negotiation is completed (agreed upon between guest and device).
>>>> Only two feature sets: 1) what the device has. (2) what is negotiated
>>>>>
>>>>> Thanks,
>>>>> -Siwei
>>>>>>>
>>>>>>> Regards,
>>>>>>> -Siwei
>>>>>>>
>>>>>>>>>
>>>>>>>>> Having said, please consider to drop the Fixes tag, as appears 
>>>>>>>>> to me you're proposing a new feature rather than fixing a real 
>>>>>>>>> issue.
>>>>>>>> it's a new feature to report the device feature bits than only 
>>>>>>>> negotiated features, however this patch is a must, or it will 
>>>>>>>> block the device feature bits reporting. but I agree, the fix 
>>>>>>>> tag is not a must.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> -Siwei
>>>>>>>>>
>>>>>>>>> On 7/1/2022 3:12 PM, Parav Pandit via Virtualization wrote:
>>>>>>>>>>> From: Zhu Lingshan<lingshan.zhu@intel.com>
>>>>>>>>>>> Sent: Friday, July 1, 2022 9:28 AM
>>>>>>>>>>>
>>>>>>>>>>> Users may want to query the config space of a vDPA device, 
>>>>>>>>>>> to choose a
>>>>>>>>>>> appropriate one for a certain guest. This means the users 
>>>>>>>>>>> need to read the
>>>>>>>>>>> config space before FEATURES_OK, and the existence of config 
>>>>>>>>>>> space
>>>>>>>>>>> contents does not depend on FEATURES_OK.
>>>>>>>>>>>
>>>>>>>>>>> The spec says:
>>>>>>>>>>> The device MUST allow reading of any device-specific 
>>>>>>>>>>> configuration field
>>>>>>>>>>> before FEATURES_OK is set by the driver. This includes 
>>>>>>>>>>> fields which are
>>>>>>>>>>> conditional on feature bits, as long as those feature bits 
>>>>>>>>>>> are offered by the
>>>>>>>>>>> device.
>>>>>>>>>>>
>>>>>>>>>>> Fixes: 30ef7a8ac8a07 (vdpa: Read device configuration only 
>>>>>>>>>>> if FEATURES_OK)
>>>>>>>>>> Fix is fine, but fixes tag needs correction described below.
>>>>>>>>>>
>>>>>>>>>> Above commit id is 13 letters should be 12.
>>>>>>>>>> And
>>>>>>>>>> It should be in format
>>>>>>>>>> Fixes: 30ef7a8ac8a0 ("vdpa: Read device configuration only if 
>>>>>>>>>> FEATURES_OK")
>>>>>>>>>>
>>>>>>>>>> Please use checkpatch.pl script before posting the patches to 
>>>>>>>>>> catch these errors.
>>>>>>>>>> There is a bot that looks at the fixes tag and identifies the 
>>>>>>>>>> right kernel version to apply this fix.
>>>>>>>>>>
>>>>>>>>>>> Signed-off-by: Zhu Lingshan<lingshan.zhu@intel.com>
>>>>>>>>>>> ---
>>>>>>>>>>>   drivers/vdpa/vdpa.c | 8 --------
>>>>>>>>>>>   1 file changed, 8 deletions(-)
>>>>>>>>>>>
>>>>>>>>>>> diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c index
>>>>>>>>>>> 9b0e39b2f022..d76b22b2f7ae 100644
>>>>>>>>>>> --- a/drivers/vdpa/vdpa.c
>>>>>>>>>>> +++ b/drivers/vdpa/vdpa.c
>>>>>>>>>>> @@ -851,17 +851,9 @@ vdpa_dev_config_fill(struct vdpa_device 
>>>>>>>>>>> *vdev,
>>>>>>>>>>> struct sk_buff *msg, u32 portid,  {
>>>>>>>>>>>       u32 device_id;
>>>>>>>>>>>       void *hdr;
>>>>>>>>>>> -    u8 status;
>>>>>>>>>>>       int err;
>>>>>>>>>>>
>>>>>>>>>>>       down_read(&vdev->cf_lock);
>>>>>>>>>>> -    status = vdev->config->get_status(vdev);
>>>>>>>>>>> -    if (!(status & VIRTIO_CONFIG_S_FEATURES_OK)) {
>>>>>>>>>>> -        NL_SET_ERR_MSG_MOD(extack, "Features negotiation not
>>>>>>>>>>> completed");
>>>>>>>>>>> -        err = -EAGAIN;
>>>>>>>>>>> -        goto out;
>>>>>>>>>>> -    }
>>>>>>>>>>> -
>>>>>>>>>>>       hdr = genlmsg_put(msg, portid, seq, &vdpa_nl_family, 
>>>>>>>>>>> flags,
>>>>>>>>>>>                 VDPA_CMD_DEV_CONFIG_GET);
>>>>>>>>>>>       if (!hdr) {
>>>>>>>>>>> -- 
>>>>>>>>>>> 2.31.1
>>>>>>>>>> _______________________________________________
>>>>>>>>>> Virtualization mailing list
>>>>>>>>>> Virtualization@lists.linux-foundation.org
>>>>>>>>>> https://lists.linuxfoundation.org/mailman/listinfo/virtualization
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>


^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [PATCH V3 4/6] vDPA: !FEATURES_OK should not block querying device config space
  2022-08-01 22:58                       ` Si-Wei Liu
@ 2022-08-02  6:33                         ` Jason Wang
  2022-08-03  1:26                           ` Si-Wei Liu
  0 siblings, 1 reply; 113+ messages in thread
From: Jason Wang @ 2022-08-02  6:33 UTC (permalink / raw)
  To: Si-Wei Liu
  Cc: Zhu, Lingshan, Parav Pandit, mst, Eli Cohen, netdev, xieyongji,
	gautam.dawar, virtualization

On Tue, Aug 2, 2022 at 6:58 AM Si-Wei Liu <si-wei.liu@oracle.com> wrote:
>
>
>
> On 8/1/2022 3:53 PM, Si-Wei Liu wrote:
> >
> >
> > On 7/31/2022 9:44 PM, Jason Wang wrote:
> >>
> >> On 2022/7/30 04:55, Si-Wei Liu wrote:
> >>>
> >>>
> >>> On 7/28/2022 7:04 PM, Zhu, Lingshan wrote:
> >>>>
> >>>>
> >>>> On 7/29/2022 5:48 AM, Si-Wei Liu wrote:
> >>>>>
> >>>>>
> >>>>> On 7/27/2022 7:43 PM, Zhu, Lingshan wrote:
> >>>>>>
> >>>>>>
> >>>>>> On 7/28/2022 8:56 AM, Si-Wei Liu wrote:
> >>>>>>>
> >>>>>>>
> >>>>>>> On 7/27/2022 4:47 AM, Zhu, Lingshan wrote:
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On 7/27/2022 5:43 PM, Si-Wei Liu wrote:
> >>>>>>>>> Sorry to chime in late in the game. For some reason I couldn't
> >>>>>>>>> get to most emails for this discussion (I only subscribed to
> >>>>>>>>> the virtualization list), while I was taking off amongst the
> >>>>>>>>> past few weeks.
> >>>>>>>>>
> >>>>>>>>> It looks to me this patch is incomplete. Noted down the way in
> >>>>>>>>> vdpa_dev_net_config_fill(), we have the following:
> >>>>>>>>>          features = vdev->config->get_driver_features(vdev);
> >>>>>>>>>          if (nla_put_u64_64bit(msg,
> >>>>>>>>> VDPA_ATTR_DEV_NEGOTIATED_FEATURES, features,
> >>>>>>>>>                                VDPA_ATTR_PAD))
> >>>>>>>>>                  return -EMSGSIZE;
> >>>>>>>>>
> >>>>>>>>> Making call to .get_driver_features() doesn't make sense when
> >>>>>>>>> feature negotiation isn't complete. Neither should present
> >>>>>>>>> negotiated_features to userspace before negotiation is done.
> >>>>>>>>>
> >>>>>>>>> Similarly, max_vqp through vdpa_dev_net_mq_config_fill()
> >>>>>>>>> probably should not show before negotiation is done - it
> >>>>>>>>> depends on driver features negotiated.
> >>>>>>>> I have another patch in this series introduces device_features
> >>>>>>>> and will report device_features to the userspace even features
> >>>>>>>> negotiation not done. Because the spec says we should allow
> >>>>>>>> driver access the config space before FEATURES_OK.
> >>>>>>> The config space can be accessed by guest before features_ok
> >>>>>>> doesn't necessarily mean the value is valid. You may want to
> >>>>>>> double check with Michael for what he quoted earlier:
> >>>>>> that's why I proposed to fix these issues, e.g., if no _F_MAC,
> >>>>>> vDPA kernel should not return a mac to the userspace, there is
> >>>>>> not a default value for mac.
> >>>>> Then please show us the code, as I can only comment based on your
> >>>>> latest (v4) patch and it was not there.. To be honest, I don't
> >>>>> understand the motivation and the use cases you have, is it for
> >>>>> debugging/monitoring or there's really a use case for live
> >>>>> migration? For the former, you can do a direct dump on all config
> >>>>> space fields regardless of endianess and feature negotiation
> >>>>> without having to worry about validity (meaningful to present to
> >>>>> admin user). To me these are conflict asks that is impossible to
> >>>>> mix in exact one command.
> >>>> This bug just has been revealed two days, and you will see the
> >>>> patch soon.
> >>>>
> >>>> There are something to clarify:
> >>>> 1) we need to read the device features, or how can you pick a
> >>>> proper LM destination
> >>
> >>
> >> So it's probably not very efficient to use this, the manager layer
> >> should have the knowledge about the compatibility before doing
> >> migration other than try-and-fail.
> >>
> >> And it's the task of the management to gather the nodes whose devices
> >> could be live migrated to each other as something like "cluster"
> >> which we've already used in the case of cpuflags.
> >>
> >> 1) during node bootstrap, the capability of each node and devices was
> >> reported to management layer
> >> 2) management layer decide the cluster and make sure the migration
> >> can only done among the nodes insides the cluster
> >> 3) before migration, the vDPA needs to be provisioned on the destination
> >>
> >>
> >>>> 2) vdpa dev config show can show both device features and driver
> >>>> features, there just need a patch for iproute2
> >>>> 3) To process information like MQ, we don't just dump the config
> >>>> space, MST has explained before
> >>> So, it's for live migration... Then why not export those config
> >>> parameters specified for vdpa creation (as well as device feature
> >>> bits) to the output of "vdpa dev show" command? That's where device
> >>> side config lives and is static across vdpa's life cycle. "vdpa dev
> >>> config show" is mostly for dynamic driver side config, and the
> >>> validity is subject to feature negotiation. I suppose this should
> >>> suit your need of LM, e.g.
> >>
> >>
> >> I think so.
> >>
> >>
> >>>
> >>> $ vdpa dev add name vdpa1 mgmtdev pci/0000:41:04.2 max_vqp 7 mtu 2000
> >>> $ vdpa dev show vdpa1
> >>> vdpa1: type network mgmtdev pci/0000:41:04.2 vendor_id 5555 max_vqs
> >>> 15 max_vq_size 256
> >>>   max_vqp 7 mtu 2000
> >>>   dev_features CSUM GUEST_CSUM MTU HOST_TSO4 HOST_TSO6 STATUS
> >>> CTRL_VQ MQ CTRL_MAC_ADDR VERSION_1 RING_PACKED
> >>
> >>
> >> Note that the mgmt should know this destination have those
> >> capability/features before the provisioning.
> > Yes, mgmt software should have to check the above from source.
>
> On destination mgmt software can run below to check vdpa mgmtdev's
> capability/features:
>
> $ vdpa mgmtdev show pci/0000:41:04.3
> pci/0000:41:04.3:
>    supported_classes net
>    max_supported_vqs 257
>    dev_features CSUM GUEST_CSUM MTU HOST_TSO4 HOST_TSO6 STATUS CTRL_VQ
> MQ CTRL_MAC_ADDR VERSION_1 RING_PACKED

Right, and this is probably better done at node bootstrap, so that the
management layer knows about the cluster up front.
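
A minimal sketch of that bootstrap-time step, assuming the management
layer has already collected each node's mgmtdev feature bits as a 64-bit
mask (for example by parsing the dev_features line of "vdpa mgmtdev
show"). The function name and the array layout are hypothetical, not part
of the kernel or iproute2. The idea is only that the feature set that is
safe to provision anywhere in a cluster is the bitwise AND of the
per-node masks:

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper for a management layer (not a kernel or iproute2 API).
 * node_features[i] holds the device feature bits reported by node i.
 * Returns the features that every node in the cluster offers; an empty
 * cluster yields 0, i.e. nothing can be assumed.
 */
uint64_t cluster_common_features(const uint64_t *node_features, size_t n)
{
	uint64_t common;
	size_t i;

	if (n == 0)
		return 0;

	common = node_features[0];
	for (i = 1; i < n; i++)
		common &= node_features[i];

	return common;
}
```

A vDPA device provisioned with a subset of the returned mask could then
be recreated on any node of that cluster.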

Thanks

> >
> >>
> >>
> >>>
> >>> For it to work, you'd want to pass "struct vdpa_dev_set_config" to
> >>> _vdpa_register_device() during registration, and get it saved there
> >>> in "struct vdpa_device". Then in vdpa_dev_fill() show each field
> >>> conditionally subject to "struct vdpa_dev_set_config.mask".
> >>>
> >>> Thanks,
> >>> -Siwei
> >>
> >>
> >> Thanks
> >>
> >>
> >>>>
> >>>> Thanks
> >>>> Zhu Lingshan
> >>>>>
> >>>>>>>> Nope:
> >>>>>>>>
> >>>>>>>> 2.5.1  Driver Requirements: Device Configuration Space
> >>>>>>>>
> >>>>>>>> ...
> >>>>>>>>
> >>>>>>>> For optional configuration space fields, the driver MUST check
> >>>>>>>> that the corresponding feature is offered
> >>>>>>>> before accessing that part of the configuration space.
> >>>>>>>
> >>>>>>> and how many driver bugs taking wrong assumption of the validity
> >>>>>>> of config space field without features_ok. I am not sure what
> >>>>>>> use case you want to expose config register values for before
> >>>>>>> features_ok, if it's mostly for live migration I guess it's
> >>>>>>> probably heading a wrong direction.
> >>>>>>>
> >>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> Last but not the least, this "vdpa dev config" command was not
> >>>>>>>>> designed to display the real config space register values in
> >>>>>>>>> the first place. Quoting the vdpa-dev(8) man page:
> >>>>>>>>>
> >>>>>>>>>> vdpa dev config show - Show configuration of specific device
> >>>>>>>>>> or all devices.
> >>>>>>>>>> DEV - specifies the vdpa device to show its configuration. If
> >>>>>>>>>> this argument is omitted all devices configuration is listed.
> >>>>>>>>> It doesn't say anything about configuration space or register
> >>>>>>>>> values in config space. As long as it can convey the config
> >>>>>>>>> attribute when instantiating vDPA device instance, and more
> >>>>>>>>> importantly, the config can be easily imported from or
> >>>>>>>>> exported to userspace tools when trying to reconstruct vdpa
> >>>>>>>>> instance intact on destination host for live migration, IMHO
> >>>>>>>>> in my personal interpretation it doesn't matter what the
> >>>>>>>>> config space may present. It may be worth while adding a new
> >>>>>>>>> debug command to expose the real register value, but that's
> >>>>>>>>> another story.
> >>>>>>>> I am not sure getting your points. vDPA now reports device
> >>>>>>>> feature bits(device_features) and negotiated feature
> >>>>>>>> bits(driver_features), and yes, the drivers features can be a
> >>>>>>>> subset of the device features; and the vDPA device features can
> >>>>>>>> be a subset of the management device features.
> >>>>>>> What I said is after unblocking the conditional check, you'd
> >>>>>>> have to handle the case for each of the vdpa attribute when
> >>>>>>> feature negotiation is not yet done: basically the register
> >>>>>>> values you got from config space via the
> >>>>>>> vdpa_get_config_unlocked() call is not considered to be valid
> >>>>>>> before features_ok (per-spec). Although in some case you may get
> >>>>>>> sane value, such behavior is generally undefined. If you desire
> >>>>>>> to show just the device_features alone without any config space
> >>>>>>> field, which the device had advertised *before feature
> >>>>>>> negotiation is complete*, that'll be fine. But looks to me this
> >>>>>>> is not how patch has been implemented. Probably need some more
> >>>>>>> work?
> >>>>>> They are driver_features(negotiated) and the
> >>>>>> device_features(which comes with the device), and the config
> >>>>>> space fields that depend on them. In this series, we report both
> >>>>>> to the userspace.
> >>>>> I fail to understand what you want to present from your
> >>>>> description. May be worth showing some example outputs that at
> >>>>> least include the following cases: 1) when device offers features
> >>>>> but not yet acknowledge by guest 2) when guest acknowledged
> >>>>> features and device is yet to accept 3) after guest feature
> >>>>> negotiation is completed (agreed upon between guest and device).
> >>>> Only two feature sets: 1) what the device has. (2) what is negotiated
> >>>>>
> >>>>> Thanks,
> >>>>> -Siwei
> >>>>>>>
> >>>>>>> Regards,
> >>>>>>> -Siwei
> >>>>>>>
> >>>>>>>>>
> >>>>>>>>> Having said, please consider to drop the Fixes tag, as appears
> >>>>>>>>> to me you're proposing a new feature rather than fixing a real
> >>>>>>>>> issue.
> >>>>>>>> it's a new feature to report the device feature bits than only
> >>>>>>>> negotiated features, however this patch is a must, or it will
> >>>>>>>> block the device feature bits reporting. but I agree, the fix
> >>>>>>>> tag is not a must.
> >>>>>>>>>
> >>>>>>>>> Thanks,
> >>>>>>>>> -Siwei
> >>>>>>>>>
> >>>>>>>>> On 7/1/2022 3:12 PM, Parav Pandit via Virtualization wrote:
> >>>>>>>>>>> From: Zhu Lingshan<lingshan.zhu@intel.com>
> >>>>>>>>>>> Sent: Friday, July 1, 2022 9:28 AM
> >>>>>>>>>>>
> >>>>>>>>>>> Users may want to query the config space of a vDPA device,
> >>>>>>>>>>> to choose a
> >>>>>>>>>>> appropriate one for a certain guest. This means the users
> >>>>>>>>>>> need to read the
> >>>>>>>>>>> config space before FEATURES_OK, and the existence of config
> >>>>>>>>>>> space
> >>>>>>>>>>> contents does not depend on FEATURES_OK.
> >>>>>>>>>>>
> >>>>>>>>>>> The spec says:
> >>>>>>>>>>> The device MUST allow reading of any device-specific
> >>>>>>>>>>> configuration field
> >>>>>>>>>>> before FEATURES_OK is set by the driver. This includes
> >>>>>>>>>>> fields which are
> >>>>>>>>>>> conditional on feature bits, as long as those feature bits
> >>>>>>>>>>> are offered by the
> >>>>>>>>>>> device.
> >>>>>>>>>>>
> >>>>>>>>>>> Fixes: 30ef7a8ac8a07 (vdpa: Read device configuration only
> >>>>>>>>>>> if FEATURES_OK)
> >>>>>>>>>> Fix is fine, but fixes tag needs correction described below.
> >>>>>>>>>>
> >>>>>>>>>> Above commit id is 13 letters should be 12.
> >>>>>>>>>> And
> >>>>>>>>>> It should be in format
> >>>>>>>>>> Fixes: 30ef7a8ac8a0 ("vdpa: Read device configuration only if
> >>>>>>>>>> FEATURES_OK")
> >>>>>>>>>>
> >>>>>>>>>> Please use checkpatch.pl script before posting the patches to
> >>>>>>>>>> catch these errors.
> >>>>>>>>>> There is a bot that looks at the fixes tag and identifies the
> >>>>>>>>>> right kernel version to apply this fix.
> >>>>>>>>>>
> >>>>>>>>>>> Signed-off-by: Zhu Lingshan<lingshan.zhu@intel.com>
> >>>>>>>>>>> ---
> >>>>>>>>>>>   drivers/vdpa/vdpa.c | 8 --------
> >>>>>>>>>>>   1 file changed, 8 deletions(-)
> >>>>>>>>>>>
> >>>>>>>>>>> diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c index
> >>>>>>>>>>> 9b0e39b2f022..d76b22b2f7ae 100644
> >>>>>>>>>>> --- a/drivers/vdpa/vdpa.c
> >>>>>>>>>>> +++ b/drivers/vdpa/vdpa.c
> >>>>>>>>>>> @@ -851,17 +851,9 @@ vdpa_dev_config_fill(struct vdpa_device
> >>>>>>>>>>> *vdev,
> >>>>>>>>>>> struct sk_buff *msg, u32 portid,  {
> >>>>>>>>>>>       u32 device_id;
> >>>>>>>>>>>       void *hdr;
> >>>>>>>>>>> -    u8 status;
> >>>>>>>>>>>       int err;
> >>>>>>>>>>>
> >>>>>>>>>>>       down_read(&vdev->cf_lock);
> >>>>>>>>>>> -    status = vdev->config->get_status(vdev);
> >>>>>>>>>>> -    if (!(status & VIRTIO_CONFIG_S_FEATURES_OK)) {
> >>>>>>>>>>> -        NL_SET_ERR_MSG_MOD(extack, "Features negotiation not
> >>>>>>>>>>> completed");
> >>>>>>>>>>> -        err = -EAGAIN;
> >>>>>>>>>>> -        goto out;
> >>>>>>>>>>> -    }
> >>>>>>>>>>> -
> >>>>>>>>>>>       hdr = genlmsg_put(msg, portid, seq, &vdpa_nl_family,
> >>>>>>>>>>> flags,
> >>>>>>>>>>>                 VDPA_CMD_DEV_CONFIG_GET);
> >>>>>>>>>>>       if (!hdr) {
> >>>>>>>>>>> --
> >>>>>>>>>>> 2.31.1
> >>>>>>>>>> _______________________________________________
> >>>>>>>>>> Virtualization mailing list
> >>>>>>>>>> Virtualization@lists.linux-foundation.org
> >>>>>>>>>> https://lists.linuxfoundation.org/mailman/listinfo/virtualization
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>
> >>>>
> >>>
> >>
> >
>


^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [PATCH V3 4/6] vDPA: !FEATURES_OK should not block querying device config space
  2022-08-02  6:33                         ` Jason Wang
@ 2022-08-03  1:26                           ` Si-Wei Liu
  2022-08-03  2:30                             ` Zhu, Lingshan
  0 siblings, 1 reply; 113+ messages in thread
From: Si-Wei Liu @ 2022-08-03  1:26 UTC (permalink / raw)
  To: Jason Wang
  Cc: Zhu, Lingshan, Parav Pandit, mst, Eli Cohen, netdev, xieyongji,
	gautam.dawar, virtualization



On 8/1/2022 11:33 PM, Jason Wang wrote:
> On Tue, Aug 2, 2022 at 6:58 AM Si-Wei Liu <si-wei.liu@oracle.com> wrote:
>>
>>
>> On 8/1/2022 3:53 PM, Si-Wei Liu wrote:
>>>
>>> On 7/31/2022 9:44 PM, Jason Wang wrote:
>>>> On 2022/7/30 04:55, Si-Wei Liu wrote:
>>>>>
>>>>> On 7/28/2022 7:04 PM, Zhu, Lingshan wrote:
>>>>>>
>>>>>> On 7/29/2022 5:48 AM, Si-Wei Liu wrote:
>>>>>>>
>>>>>>> On 7/27/2022 7:43 PM, Zhu, Lingshan wrote:
>>>>>>>>
>>>>>>>> On 7/28/2022 8:56 AM, Si-Wei Liu wrote:
>>>>>>>>>
>>>>>>>>> On 7/27/2022 4:47 AM, Zhu, Lingshan wrote:
>>>>>>>>>>
>>>>>>>>>> On 7/27/2022 5:43 PM, Si-Wei Liu wrote:
>>>>>>>>>>> Sorry to chime in late in the game. For some reason I couldn't
>>>>>>>>>>> get to most emails for this discussion (I only subscribed to
>>>>>>>>>>> the virtualization list), while I was taking off amongst the
>>>>>>>>>>> past few weeks.
>>>>>>>>>>>
>>>>>>>>>>> It looks to me this patch is incomplete. Noted down the way in
>>>>>>>>>>> vdpa_dev_net_config_fill(), we have the following:
>>>>>>>>>>>           features = vdev->config->get_driver_features(vdev);
>>>>>>>>>>>           if (nla_put_u64_64bit(msg,
>>>>>>>>>>> VDPA_ATTR_DEV_NEGOTIATED_FEATURES, features,
>>>>>>>>>>>                                 VDPA_ATTR_PAD))
>>>>>>>>>>>                   return -EMSGSIZE;
>>>>>>>>>>>
>>>>>>>>>>> Making call to .get_driver_features() doesn't make sense when
>>>>>>>>>>> feature negotiation isn't complete. Neither should present
>>>>>>>>>>> negotiated_features to userspace before negotiation is done.
>>>>>>>>>>>
>>>>>>>>>>> Similarly, max_vqp through vdpa_dev_net_mq_config_fill()
>>>>>>>>>>> probably should not show before negotiation is done - it
>>>>>>>>>>> depends on driver features negotiated.
>>>>>>>>>> I have another patch in this series introduces device_features
>>>>>>>>>> and will report device_features to the userspace even features
>>>>>>>>>> negotiation not done. Because the spec says we should allow
>>>>>>>>>> driver access the config space before FEATURES_OK.
>>>>>>>>> The config space can be accessed by guest before features_ok
>>>>>>>>> doesn't necessarily mean the value is valid. You may want to
>>>>>>>>> double check with Michael for what he quoted earlier:
>>>>>>>> that's why I proposed to fix these issues, e.g., if no _F_MAC,
>>>>>>>> vDPA kernel should not return a mac to the userspace, there is
>>>>>>>> not a default value for mac.
>>>>>>> Then please show us the code, as I can only comment based on your
>>>>>>> latest (v4) patch and it was not there.. To be honest, I don't
>>>>>>> understand the motivation and the use cases you have, is it for
>>>>>>> debugging/monitoring or there's really a use case for live
>>>>>>> migration? For the former, you can do a direct dump on all config
>>>>>>> space fields regardless of endianness and feature negotiation
>>>>>>> without having to worry about validity (meaningful to present to
>>>>>>> admin user). To me these are conflict asks that is impossible to
>>>>>>> mix in exact one command.
>>>>>> This bug just has been revealed two days, and you will see the
>>>>>> patch soon.
>>>>>>
>>>>>> There are something to clarify:
>>>>>> 1) we need to read the device features, or how can you pick a
>>>>>> proper LM destination
>>>>
>>>> So it's probably not very efficient to use this, the manager layer
>>>> should have the knowledge about the compatibility before doing
>>>> migration other than try-and-fail.
>>>>
>>>> And it's the task of the management to gather the nodes whose devices
>>>> could be live migrated to each other as something like "cluster"
>>>> which we've already used in the case of cpuflags.
>>>>
>>>> 1) during node bootstrap, the capability of each node and devices was
>>>> reported to management layer
>>>> 2) management layer decide the cluster and make sure the migration
>>>> can only done among the nodes insides the cluster
>>>> 3) before migration, the vDPA needs to be provisioned on the destination
>>>>
>>>>
>>>>>> 2) vdpa dev config show can show both device features and driver
>>>>>> features, there just need a patch for iproute2
>>>>>> 3) To process information like MQ, we don't just dump the config
>>>>>> space, MST has explained before
>>>>> So, it's for live migration... Then why not export those config
>>>>> parameters specified for vdpa creation (as well as device feature
>>>>> bits) to the output of "vdpa dev show" command? That's where device
>>>>> side config lives and is static across vdpa's life cycle. "vdpa dev
>>>>> config show" is mostly for dynamic driver side config, and the
>>>>> validity is subject to feature negotiation. I suppose this should
>>>>> suit your need of LM, e.g.
>>>>
>>>> I think so.
>>>>
>>>>
>>>>> $ vdpa dev add name vdpa1 mgmtdev pci/0000:41:04.2 max_vqp 7 mtu 2000
>>>>> $ vdpa dev show vdpa1
>>>>> vdpa1: type network mgmtdev pci/0000:41:04.2 vendor_id 5555 max_vqs
>>>>> 15 max_vq_size 256
>>>>>    max_vqp 7 mtu 2000
>>>>>    dev_features CSUM GUEST_CSUM MTU HOST_TSO4 HOST_TSO6 STATUS
>>>>> CTRL_VQ MQ CTRL_MAC_ADDR VERSION_1 RING_PACKED
>>>>
>>>> Note that the mgmt should know this destination have those
>>>> capability/features before the provisioning.
>>> Yes, mgmt software should have to check the above from source.
>> On destination mgmt software can run below to check vdpa mgmtdev's
>> capability/features:
>>
>> $ vdpa mgmtdev show pci/0000:41:04.3
>> pci/0000:41:04.3:
>>     supported_classes net
>>     max_supported_vqs 257
>>     dev_features CSUM GUEST_CSUM MTU HOST_TSO4 HOST_TSO6 STATUS CTRL_VQ
>> MQ CTRL_MAC_ADDR VERSION_1 RING_PACKED
> Right and this is probably better to be done at node bootstrapping for
> the management to know about the cluster.
Exactly. That's what mgmt software is typically supposed to do.

Thanks,
-Siwei

>
> Thanks
>
>>>>
>>>>> For it to work, you'd want to pass "struct vdpa_dev_set_config" to
>>>>> _vdpa_register_device() during registration, and get it saved there
>>>>> in "struct vdpa_device". Then in vdpa_dev_fill() show each field
>>>>> conditionally subject to "struct vdpa_dev_set_config.mask".
>>>>>
>>>>> Thanks,
>>>>> -Siwei
>>>>
>>>> Thanks
>>>>
>>>>
>>>>>> Thanks
>>>>>> Zhu Lingshan
>>>>>>>>>> Nope:
>>>>>>>>>>
>>>>>>>>>> 2.5.1  Driver Requirements: Device Configuration Space
>>>>>>>>>>
>>>>>>>>>> ...
>>>>>>>>>>
>>>>>>>>>> For optional configuration space fields, the driver MUST check
>>>>>>>>>> that the corresponding feature is offered
>>>>>>>>>> before accessing that part of the configuration space.
>>>>>>>>> and how many driver bugs taking wrong assumption of the validity
>>>>>>>>> of config space field without features_ok. I am not sure what
>>>>>>>>> use case you want to expose config register values for before
>>>>>>>>> features_ok, if it's mostly for live migration I guess it's
>>>>>>>>> probably heading a wrong direction.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Last but not the least, this "vdpa dev config" command was not
>>>>>>>>>>> designed to display the real config space register values in
>>>>>>>>>>> the first place. Quoting the vdpa-dev(8) man page:
>>>>>>>>>>>
>>>>>>>>>>>> vdpa dev config show - Show configuration of specific device
>>>>>>>>>>>> or all devices.
>>>>>>>>>>>> DEV - specifies the vdpa device to show its configuration. If
>>>>>>>>>>>> this argument is omitted all devices configuration is listed.
>>>>>>>>>>> It doesn't say anything about configuration space or register
>>>>>>>>>>> values in config space. As long as it can convey the config
>>>>>>>>>>> attribute when instantiating vDPA device instance, and more
>>>>>>>>>>> importantly, the config can be easily imported from or
>>>>>>>>>>> exported to userspace tools when trying to reconstruct vdpa
>>>>>>>>>>> instance intact on destination host for live migration, IMHO
>>>>>>>>>>> in my personal interpretation it doesn't matter what the
>>>>>>>>>>> config space may present. It may be worth while adding a new
>>>>>>>>>>> debug command to expose the real register value, but that's
>>>>>>>>>>> another story.
>>>>>>>>>> I am not sure getting your points. vDPA now reports device
>>>>>>>>>> feature bits(device_features) and negotiated feature
>>>>>>>>>> bits(driver_features), and yes, the drivers features can be a
>>>>>>>>>> subset of the device features; and the vDPA device features can
>>>>>>>>>> be a subset of the management device features.
>>>>>>>>> What I said is after unblocking the conditional check, you'd
>>>>>>>>> have to handle the case for each of the vdpa attribute when
>>>>>>>>> feature negotiation is not yet done: basically the register
>>>>>>>>> values you got from config space via the
>>>>>>>>> vdpa_get_config_unlocked() call is not considered to be valid
>>>>>>>>> before features_ok (per-spec). Although in some case you may get
>>>>>>>>> sane value, such behavior is generally undefined. If you desire
>>>>>>>>> to show just the device_features alone without any config space
>>>>>>>>> field, which the device had advertised *before feature
>>>>>>>>> negotiation is complete*, that'll be fine. But looks to me this
>>>>>>>>> is not how patch has been implemented. Probably need some more
>>>>>>>>> work?
>>>>>>>> They are driver_features(negotiated) and the
>>>>>>>> device_features(which comes with the device), and the config
>>>>>>>> space fields that depend on them. In this series, we report both
>>>>>>>> to the userspace.
>>>>>>> I fail to understand what you want to present from your
>>>>>>> description. May be worth showing some example outputs that at
>>>>>>> least include the following cases: 1) when device offers features
>>>>>>> but not yet acknowledge by guest 2) when guest acknowledged
>>>>>>> features and device is yet to accept 3) after guest feature
>>>>>>> negotiation is completed (agreed upon between guest and device).
>>>>>> Only two feature sets: 1) what the device has. (2) what is negotiated
>>>>>>> Thanks,
>>>>>>> -Siwei
>>>>>>>>> Regards,
>>>>>>>>> -Siwei
>>>>>>>>>
>>>>>>>>>>> Having said, please consider to drop the Fixes tag, as appears
>>>>>>>>>>> to me you're proposing a new feature rather than fixing a real
>>>>>>>>>>> issue.
>>>>>>>>>> it's a new feature to report the device feature bits than only
>>>>>>>>>> negotiated features, however this patch is a must, or it will
>>>>>>>>>> block the device feature bits reporting. but I agree, the fix
>>>>>>>>>> tag is not a must.
>>>>>>>>>>> Thanks,
>>>>>>>>>>> -Siwei
>>>>>>>>>>>
>>>>>>>>>>> On 7/1/2022 3:12 PM, Parav Pandit via Virtualization wrote:
>>>>>>>>>>>>> From: Zhu Lingshan<lingshan.zhu@intel.com>
>>>>>>>>>>>>> Sent: Friday, July 1, 2022 9:28 AM
>>>>>>>>>>>>>
>>>>>>>>>>>>> Users may want to query the config space of a vDPA device,
>>>>>>>>>>>>> to choose a
>>>>>>>>>>>>> appropriate one for a certain guest. This means the users
>>>>>>>>>>>>> need to read the
>>>>>>>>>>>>> config space before FEATURES_OK, and the existence of config
>>>>>>>>>>>>> space
>>>>>>>>>>>>> contents does not depend on FEATURES_OK.
>>>>>>>>>>>>>
>>>>>>>>>>>>> The spec says:
>>>>>>>>>>>>> The device MUST allow reading of any device-specific
>>>>>>>>>>>>> configuration field
>>>>>>>>>>>>> before FEATURES_OK is set by the driver. This includes
>>>>>>>>>>>>> fields which are
>>>>>>>>>>>>> conditional on feature bits, as long as those feature bits
>>>>>>>>>>>>> are offered by the
>>>>>>>>>>>>> device.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Fixes: 30ef7a8ac8a07 (vdpa: Read device configuration only
>>>>>>>>>>>>> if FEATURES_OK)
>>>>>>>>>>>> Fix is fine, but fixes tag needs correction described below.
>>>>>>>>>>>>
>>>>>>>>>>>> Above commit id is 13 letters should be 12.
>>>>>>>>>>>> And
>>>>>>>>>>>> It should be in format
>>>>>>>>>>>> Fixes: 30ef7a8ac8a0 ("vdpa: Read device configuration only if
>>>>>>>>>>>> FEATURES_OK")
>>>>>>>>>>>>
>>>>>>>>>>>> Please use checkpatch.pl script before posting the patches to
>>>>>>>>>>>> catch these errors.
>>>>>>>>>>>> There is a bot that looks at the fixes tag and identifies the
>>>>>>>>>>>> right kernel version to apply this fix.
>>>>>>>>>>>>
>>>>>>>>>>>>> Signed-off-by: Zhu Lingshan<lingshan.zhu@intel.com>
>>>>>>>>>>>>> ---
>>>>>>>>>>>>>    drivers/vdpa/vdpa.c | 8 --------
>>>>>>>>>>>>>    1 file changed, 8 deletions(-)
>>>>>>>>>>>>>
>>>>>>>>>>>>> diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c index
>>>>>>>>>>>>> 9b0e39b2f022..d76b22b2f7ae 100644
>>>>>>>>>>>>> --- a/drivers/vdpa/vdpa.c
>>>>>>>>>>>>> +++ b/drivers/vdpa/vdpa.c
>>>>>>>>>>>>> @@ -851,17 +851,9 @@ vdpa_dev_config_fill(struct vdpa_device
>>>>>>>>>>>>> *vdev,
>>>>>>>>>>>>> struct sk_buff *msg, u32 portid,  {
>>>>>>>>>>>>>        u32 device_id;
>>>>>>>>>>>>>        void *hdr;
>>>>>>>>>>>>> -    u8 status;
>>>>>>>>>>>>>        int err;
>>>>>>>>>>>>>
>>>>>>>>>>>>>        down_read(&vdev->cf_lock);
>>>>>>>>>>>>> -    status = vdev->config->get_status(vdev);
>>>>>>>>>>>>> -    if (!(status & VIRTIO_CONFIG_S_FEATURES_OK)) {
>>>>>>>>>>>>> -        NL_SET_ERR_MSG_MOD(extack, "Features negotiation not
>>>>>>>>>>>>> completed");
>>>>>>>>>>>>> -        err = -EAGAIN;
>>>>>>>>>>>>> -        goto out;
>>>>>>>>>>>>> -    }
>>>>>>>>>>>>> -
>>>>>>>>>>>>>        hdr = genlmsg_put(msg, portid, seq, &vdpa_nl_family,
>>>>>>>>>>>>> flags,
>>>>>>>>>>>>>                  VDPA_CMD_DEV_CONFIG_GET);
>>>>>>>>>>>>>        if (!hdr) {
>>>>>>>>>>>>> --
>>>>>>>>>>>>> 2.31.1
>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>> Virtualization mailing list
>>>>>>>>>>>> Virtualization@lists.linux-foundation.org
>>>>>>>>>>>> https://lists.linuxfoundation.org/mailman/listinfo/virtualization
>>>>>>>>>>>
>>>>>>>>>>>


^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [PATCH V3 4/6] vDPA: !FEATURES_OK should not block querying device config space
  2022-08-03  1:26                           ` Si-Wei Liu
@ 2022-08-03  2:30                             ` Zhu, Lingshan
  2022-08-03 23:09                               ` Si-Wei Liu
  0 siblings, 1 reply; 113+ messages in thread
From: Zhu, Lingshan @ 2022-08-03  2:30 UTC (permalink / raw)
  To: Si-Wei Liu, Jason Wang
  Cc: Parav Pandit, mst, Eli Cohen, netdev, xieyongji, gautam.dawar,
	virtualization



On 8/3/2022 9:26 AM, Si-Wei Liu wrote:
>
>
> On 8/1/2022 11:33 PM, Jason Wang wrote:
>> On Tue, Aug 2, 2022 at 6:58 AM Si-Wei Liu <si-wei.liu@oracle.com> wrote:
>>>
>>>
>>> On 8/1/2022 3:53 PM, Si-Wei Liu wrote:
>>>>
>>>> On 7/31/2022 9:44 PM, Jason Wang wrote:
>>>>> On 2022/7/30 04:55, Si-Wei Liu wrote:
>>>>>>
>>>>>> On 7/28/2022 7:04 PM, Zhu, Lingshan wrote:
>>>>>>>
>>>>>>> On 7/29/2022 5:48 AM, Si-Wei Liu wrote:
>>>>>>>>
>>>>>>>> On 7/27/2022 7:43 PM, Zhu, Lingshan wrote:
>>>>>>>>>
>>>>>>>>> On 7/28/2022 8:56 AM, Si-Wei Liu wrote:
>>>>>>>>>>
>>>>>>>>>> On 7/27/2022 4:47 AM, Zhu, Lingshan wrote:
>>>>>>>>>>>
>>>>>>>>>>> On 7/27/2022 5:43 PM, Si-Wei Liu wrote:
>>>>>>>>>>>> Sorry to chime in late in the game. For some reason I couldn't
>>>>>>>>>>>> get to most emails for this discussion (I only subscribed to
>>>>>>>>>>>> the virtualization list), while I was taking off amongst the
>>>>>>>>>>>> past few weeks.
>>>>>>>>>>>>
>>>>>>>>>>>> It looks to me this patch is incomplete. Noted down the way in
>>>>>>>>>>>> vdpa_dev_net_config_fill(), we have the following:
>>>>>>>>>>>>           features = vdev->config->get_driver_features(vdev);
>>>>>>>>>>>>           if (nla_put_u64_64bit(msg,
>>>>>>>>>>>> VDPA_ATTR_DEV_NEGOTIATED_FEATURES, features,
>>>>>>>>>>>> VDPA_ATTR_PAD))
>>>>>>>>>>>>                   return -EMSGSIZE;
>>>>>>>>>>>>
>>>>>>>>>>>> Making call to .get_driver_features() doesn't make sense when
>>>>>>>>>>>> feature negotiation isn't complete. Neither should present
>>>>>>>>>>>> negotiated_features to userspace before negotiation is done.
>>>>>>>>>>>>
>>>>>>>>>>>> Similarly, max_vqp through vdpa_dev_net_mq_config_fill()
>>>>>>>>>>>> probably should not show before negotiation is done - it
>>>>>>>>>>>> depends on driver features negotiated.
>>>>>>>>>>> I have another patch in this series introduces device_features
>>>>>>>>>>> and will report device_features to the userspace even features
>>>>>>>>>>> negotiation not done. Because the spec says we should allow
>>>>>>>>>>> driver access the config space before FEATURES_OK.
>>>>>>>>>> The config space can be accessed by guest before features_ok
>>>>>>>>>> doesn't necessarily mean the value is valid. You may want to
>>>>>>>>>> double check with Michael for what he quoted earlier:
>>>>>>>>> that's why I proposed to fix these issues, e.g., if no _F_MAC,
>>>>>>>>> vDPA kernel should not return a mac to the userspace, there is
>>>>>>>>> not a default value for mac.
>>>>>>>> Then please show us the code, as I can only comment based on your
>>>>>>>> latest (v4) patch and it was not there.. To be honest, I don't
>>>>>>>> understand the motivation and the use cases you have, is it for
>>>>>>>> debugging/monitoring or there's really a use case for live
>>>>>>>> migration? For the former, you can do a direct dump on all config
>>>>>>>> space fields regardless of endianness and feature negotiation
>>>>>>>> without having to worry about validity (meaningful to present to
>>>>>>>> admin user). To me these are conflict asks that is impossible to
>>>>>>>> mix in exact one command.
>>>>>>> This bug just has been revealed two days, and you will see the
>>>>>>> patch soon.
>>>>>>>
>>>>>>> There are something to clarify:
>>>>>>> 1) we need to read the device features, or how can you pick a
>>>>>>> proper LM destination
>>>>>
>>>>> So it's probably not very efficient to use this, the manager layer
>>>>> should have the knowledge about the compatibility before doing
>>>>> migration other than try-and-fail.
>>>>>
>>>>> And it's the task of the management to gather the nodes whose devices
>>>>> could be live migrated to each other as something like "cluster"
>>>>> which we've already used in the case of cpuflags.
>>>>>
>>>>> 1) during node bootstrap, the capability of each node and devices was
>>>>> reported to management layer
>>>>> 2) management layer decide the cluster and make sure the migration
>>>>> can only done among the nodes insides the cluster
>>>>> 3) before migration, the vDPA needs to be provisioned on the destination
>>>>>
>>>>>
>>>>>>> 2) vdpa dev config show can show both device features and driver
>>>>>>> features, there just need a patch for iproute2
>>>>>>> 3) To process information like MQ, we don't just dump the config
>>>>>>> space, MST has explained before
>>>>>> So, it's for live migration... Then why not export those config
>>>>>> parameters specified for vdpa creation (as well as device feature
>>>>>> bits) to the output of "vdpa dev show" command? That's where device
>>>>>> side config lives and is static across vdpa's life cycle. "vdpa dev
>>>>>> config show" is mostly for dynamic driver side config, and the
>>>>>> validity is subject to feature negotiation. I suppose this should
>>>>>> suit your need of LM, e.g.
>>>>>
>>>>> I think so.
>>>>>
>>>>>
>>>>>> $ vdpa dev add name vdpa1 mgmtdev pci/0000:41:04.2 max_vqp 7 mtu 2000
>>>>>> $ vdpa dev show vdpa1
>>>>>> vdpa1: type network mgmtdev pci/0000:41:04.2 vendor_id 5555 max_vqs
>>>>>> 15 max_vq_size 256
>>>>>>    max_vqp 7 mtu 2000
>>>>>>    dev_features CSUM GUEST_CSUM MTU HOST_TSO4 HOST_TSO6 STATUS
>>>>>> CTRL_VQ MQ CTRL_MAC_ADDR VERSION_1 RING_PACKED
>>>>>
>>>>> Note that the mgmt should know this destination have those
>>>>> capability/features before the provisioning.
>>>> Yes, mgmt software should have to check the above from source.
>>> On destination mgmt software can run below to check vdpa mgmtdev's
>>> capability/features:
>>>
>>> $ vdpa mgmtdev show pci/0000:41:04.3
>>> pci/0000:41:04.3:
>>>     supported_classes net
>>>     max_supported_vqs 257
>>>     dev_features CSUM GUEST_CSUM MTU HOST_TSO4 HOST_TSO6 STATUS CTRL_VQ
>>> MQ CTRL_MAC_ADDR VERSION_1 RING_PACKED
>> Right and this is probably better to be done at node bootstrapping for
>> the management to know about the cluster.
> Exactly. That's what mgmt software is supposed to do typically.
I think this could apply to both mgmt devices and vDPA devices:
1) mgmt device: check whether the mgmt device is capable of creating a
vDPA device with a certain set of feature bits; this is for LM
2) vDPA device: report the device features; this is for normal operation
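
For case 1), the check could look like the sketch below. It assumes the
feature bits arrive as 64-bit masks; the helper names are hypothetical
(they are not existing kernel or iproute2 functions), and the bit numbers
are the ones the virtio spec assigns to the named features:

```c
#include <stdbool.h>
#include <stdint.h>

/* Feature bit numbers as assigned by the virtio spec. */
#define VIRTIO_NET_F_MTU	3
#define VIRTIO_NET_F_CTRL_VQ	17
#define VIRTIO_NET_F_MQ		22
#define VIRTIO_F_VERSION_1	32

#define FEATURE_BIT(nr)	(1ULL << (nr))

/* Hypothetical helper: features that are wanted but not offered. */
uint64_t missing_features(uint64_t mgmtdev_features, uint64_t wanted)
{
	return wanted & ~mgmtdev_features;
}

/*
 * Hypothetical helper: true if a vDPA device with the wanted feature bits
 * could be created on this mgmt device, i.e. wanted is a subset of what
 * the mgmt device offers.
 */
bool can_provision(uint64_t mgmtdev_features, uint64_t wanted)
{
	return missing_features(mgmtdev_features, wanted) == 0;
}
```

When the check fails, missing_features() tells the mgmt software which
bits rule this destination out.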

Thanks,
Zhu Lingshan
>
> Thanks,
> -Siwei
>
>>
>> Thanks
>>
>>>>>
>>>>>> For it to work, you'd want to pass "struct vdpa_dev_set_config" to
>>>>>> _vdpa_register_device() during registration, and get it saved there
>>>>>> in "struct vdpa_device". Then in vdpa_dev_fill() show each field
>>>>>> conditionally subject to "struct vdpa_dev_set_config.mask".
>>>>>>
>>>>>> Thanks,
>>>>>> -Siwei
>>>>>
>>>>> Thanks
>>>>>
>>>>>
>>>>>>> Thanks
>>>>>>> Zhu Lingshan
>>>>>>>>>>> Nope:
>>>>>>>>>>>
>>>>>>>>>>> 2.5.1  Driver Requirements: Device Configuration Space
>>>>>>>>>>>
>>>>>>>>>>> ...
>>>>>>>>>>>
>>>>>>>>>>> For optional configuration space fields, the driver MUST check
>>>>>>>>>>> that the corresponding feature is offered
>>>>>>>>>>> before accessing that part of the configuration space.
>>>>>>>>>> and how many driver bugs taking wrong assumption of the validity
>>>>>>>>>> of config space field without features_ok. I am not sure what
>>>>>>>>>> use case you want to expose config resister values for before
>>>>>>>>>> features_ok, if it's mostly for live migration I guess it's
>>>>>>>>>> probably heading a wrong direction.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Last but not the least, this "vdpa dev config" command was not
>>>>>>>>>>>> designed to display the real config space register values in
>>>>>>>>>>>> the first place. Quoting the vdpa-dev(8) man page:
>>>>>>>>>>>>
>>>>>>>>>>>>> vdpa dev config show - Show configuration of specific device
>>>>>>>>>>>>> or all devices.
>>>>>>>>>>>>> DEV - specifies the vdpa device to show its configuration. If
>>>>>>>>>>>>> this argument is omitted all devices configuration is listed.
>>>>>>>>>>>> It doesn't say anything about configuration space or register
>>>>>>>>>>>> values in config space. As long as it can convey the config
>>>>>>>>>>>> attribute when instantiating vDPA device instance, and more
>>>>>>>>>>>> importantly, the config can be easily imported from or
>>>>>>>>>>>> exported to userspace tools when trying to reconstruct vdpa
>>>>>>>>>>>> instance intact on destination host for live migration, IMHO
>>>>>>>>>>>> in my personal interpretation it doesn't matter what the
>>>>>>>>>>>> config space may present. It may be worth while adding a new
>>>>>>>>>>>> debug command to expose the real register value, but that's
>>>>>>>>>>>> another story.
>>>>>>>>>>> I am not sure getting your points. vDPA now reports device
>>>>>>>>>>> feature bits(device_features) and negotiated feature
>>>>>>>>>>> bits(driver_features), and yes, the drivers features can be a
>>>>>>>>>>> subset of the device features; and the vDPA device features can
>>>>>>>>>>> be a subset of the management device features.
>>>>>>>>>> What I said is after unblocking the conditional check, you'd
>>>>>>>>>> have to handle the case for each of the vdpa attribute when
>>>>>>>>>> feature negotiation is not yet done: basically the register
>>>>>>>>>> values you got from config space via the
>>>>>>>>>> vdpa_get_config_unlocked() call is not considered to be valid
>>>>>>>>>> before features_ok (per-spec). Although in some case you may get
>>>>>>>>>> sane value, such behavior is generally undefined. If you desire
>>>>>>>>>> to show just the device_features alone without any config space
>>>>>>>>>> field, which the device had advertised *before feature
>>>>>>>>>> negotiation is complete*, that'll be fine. But looks to me this
>>>>>>>>>> is not how patch has been implemented. Probably need some more
>>>>>>>>>> work?
>>>>>>>>> They are driver_features(negotiated) and the
>>>>>>>>> device_features(which comes with the device), and the config
>>>>>>>>> space fields that depend on them. In this series, we report both
>>>>>>>>> to the userspace.
>>>>>>>> I fail to understand what you want to present from your
>>>>>>>> description. May be worth showing some example outputs that at
>>>>>>>> least include the following cases: 1) when device offers features
>>>>>>>> but not yet acknowledge by guest 2) when guest acknowledged
>>>>>>>> features and device is yet to accept 3) after guest feature
>>>>>>>> negotiation is completed (agreed upon between guest and device).
>>>>>>> Only two feature sets: 1) what the device has. (2) what is 
>>>>>>> negotiated
>>>>>>>> Thanks,
>>>>>>>> -Siwei
>>>>>>>>>> Regards,
>>>>>>>>>> -Siwei
>>>>>>>>>>
>>>>>>>>>>>> Having said, please consider to drop the Fixes tag, as appears
>>>>>>>>>>>> to me you're proposing a new feature rather than fixing a real
>>>>>>>>>>>> issue.
>>>>>>>>>>> it's a new feature to report the device feature bits than only
>>>>>>>>>>> negotiated features, however this patch is a must, or it will
>>>>>>>>>>> block the device feature bits reporting. but I agree, the fix
>>>>>>>>>>> tag is not a must.
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> -Siwei
>>>>>>>>>>>>
>>>>>>>>>>>> On 7/1/2022 3:12 PM, Parav Pandit via Virtualization wrote:
>>>>>>>>>>>>>> From: Zhu Lingshan<lingshan.zhu@intel.com>
>>>>>>>>>>>>>> Sent: Friday, July 1, 2022 9:28 AM
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Users may want to query the config space of a vDPA device,
>>>>>>>>>>>>>> to choose a
>>>>>>>>>>>>>> appropriate one for a certain guest. This means the users
>>>>>>>>>>>>>> need to read the
>>>>>>>>>>>>>> config space before FEATURES_OK, and the existence of config
>>>>>>>>>>>>>> space
>>>>>>>>>>>>>> contents does not depend on FEATURES_OK.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The spec says:
>>>>>>>>>>>>>> The device MUST allow reading of any device-specific
>>>>>>>>>>>>>> configuration field
>>>>>>>>>>>>>> before FEATURES_OK is set by the driver. This includes
>>>>>>>>>>>>>> fields which are
>>>>>>>>>>>>>> conditional on feature bits, as long as those feature bits
>>>>>>>>>>>>>> are offered by the
>>>>>>>>>>>>>> device.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Fixes: 30ef7a8ac8a07 (vdpa: Read device configuration only
>>>>>>>>>>>>>> if FEATURES_OK)
>>>>>>>>>>>>> Fix is fine, but fixes tag needs correction described below.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Above commit id is 13 letters should be 12.
>>>>>>>>>>>>> And
>>>>>>>>>>>>> It should be in format
>>>>>>>>>>>>> Fixes: 30ef7a8ac8a0 ("vdpa: Read device configuration only if
>>>>>>>>>>>>> FEATURES_OK")
>>>>>>>>>>>>>
>>>>>>>>>>>>> Please use checkpatch.pl script before posting the patches to
>>>>>>>>>>>>> catch these errors.
>>>>>>>>>>>>> There is a bot that looks at the fixes tag and identifies the
>>>>>>>>>>>>> right kernel version to apply this fix.
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Signed-off-by: Zhu Lingshan<lingshan.zhu@intel.com>
>>>>>>>>>>>>>> ---
>>>>>>>>>>>>>>    drivers/vdpa/vdpa.c | 8 --------
>>>>>>>>>>>>>>    1 file changed, 8 deletions(-)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c index
>>>>>>>>>>>>>> 9b0e39b2f022..d76b22b2f7ae 100644
>>>>>>>>>>>>>> --- a/drivers/vdpa/vdpa.c
>>>>>>>>>>>>>> +++ b/drivers/vdpa/vdpa.c
>>>>>>>>>>>>>> @@ -851,17 +851,9 @@ vdpa_dev_config_fill(struct vdpa_device
>>>>>>>>>>>>>> *vdev,
>>>>>>>>>>>>>> struct sk_buff *msg, u32 portid,  {
>>>>>>>>>>>>>>        u32 device_id;
>>>>>>>>>>>>>>        void *hdr;
>>>>>>>>>>>>>> -    u8 status;
>>>>>>>>>>>>>>        int err;
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> down_read(&vdev->cf_lock);
>>>>>>>>>>>>>> -    status = vdev->config->get_status(vdev);
>>>>>>>>>>>>>> -    if (!(status & VIRTIO_CONFIG_S_FEATURES_OK)) {
>>>>>>>>>>>>>> -        NL_SET_ERR_MSG_MOD(extack, "Features negotiation 
>>>>>>>>>>>>>> not
>>>>>>>>>>>>>> completed");
>>>>>>>>>>>>>> -        err = -EAGAIN;
>>>>>>>>>>>>>> -        goto out;
>>>>>>>>>>>>>> -    }
>>>>>>>>>>>>>> -
>>>>>>>>>>>>>>        hdr = genlmsg_put(msg, portid, seq, &vdpa_nl_family,
>>>>>>>>>>>>>> flags,
>>>>>>>>>>>>>> VDPA_CMD_DEV_CONFIG_GET);
>>>>>>>>>>>>>>        if (!hdr) {
>>>>>>>>>>>>>> -- 
>>>>>>>>>>>>>> 2.31.1
>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>> Virtualization mailing list
>>>>>>>>>>>>> Virtualization@lists.linux-foundation.org
>>>>>>>>>>>>> https://urldefense.com/v3/__https://lists.linuxfoundation.org/mailman/listinfo/virtualization__;!!ACWV5N9M2RV99hQ!NzOv5Ew_Z2CP-zHyD7RsUoStLZ54KpB21QyuZ8L63YVPLEGDEwvcOSDlIGxQPHY-DMkOa9sKKZdBSaNknMU$ 
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>


^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [PATCH V3 4/6] vDPA: !FEATURES_OK should not block querying device config space
  2022-08-03  2:30                             ` Zhu, Lingshan
@ 2022-08-03 23:09                               ` Si-Wei Liu
  2022-08-04  1:41                                 ` Zhu, Lingshan
  2022-08-04  1:41                                 ` Zhu, Lingshan
  0 siblings, 2 replies; 113+ messages in thread
From: Si-Wei Liu @ 2022-08-03 23:09 UTC (permalink / raw)
  To: Zhu, Lingshan, Jason Wang
  Cc: Parav Pandit, mst, Eli Cohen, netdev, xieyongji, gautam.dawar,
	virtualization



On 8/2/2022 7:30 PM, Zhu, Lingshan wrote:
>
>
> On 8/3/2022 9:26 AM, Si-Wei Liu wrote:
>>
>>
>> On 8/1/2022 11:33 PM, Jason Wang wrote:
>>> On Tue, Aug 2, 2022 at 6:58 AM Si-Wei Liu <si-wei.liu@oracle.com> 
>>> wrote:
>>>>
>>>>
>>>> On 8/1/2022 3:53 PM, Si-Wei Liu wrote:
>>>>>
>>>>> On 7/31/2022 9:44 PM, Jason Wang wrote:
>>>>>> On 2022/7/30 04:55, Si-Wei Liu wrote:
>>>>>>>
>>>>>>> On 7/28/2022 7:04 PM, Zhu, Lingshan wrote:
>>>>>>>>
>>>>>>>> On 7/29/2022 5:48 AM, Si-Wei Liu wrote:
>>>>>>>>>
>>>>>>>>> On 7/27/2022 7:43 PM, Zhu, Lingshan wrote:
>>>>>>>>>>
>>>>>>>>>> On 7/28/2022 8:56 AM, Si-Wei Liu wrote:
>>>>>>>>>>>
>>>>>>>>>>> On 7/27/2022 4:47 AM, Zhu, Lingshan wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> On 7/27/2022 5:43 PM, Si-Wei Liu wrote:
>>>>>>>>>>>>> Sorry to chime in late in the game. For some reason I 
>>>>>>>>>>>>> couldn't
>>>>>>>>>>>>> get to most emails for this discussion (I only subscribed to
>>>>>>>>>>>>> the virtualization list), while I was taking off amongst the
>>>>>>>>>>>>> past few weeks.
>>>>>>>>>>>>>
>>>>>>>>>>>>> It looks to me this patch is incomplete. Noted down the 
>>>>>>>>>>>>> way in
>>>>>>>>>>>>> vdpa_dev_net_config_fill(), we have the following:
>>>>>>>>>>>>>           features = vdev->config->get_driver_features(vdev);
>>>>>>>>>>>>>           if (nla_put_u64_64bit(msg,
>>>>>>>>>>>>> VDPA_ATTR_DEV_NEGOTIATED_FEATURES, features,
>>>>>>>>>>>>> VDPA_ATTR_PAD))
>>>>>>>>>>>>>                   return -EMSGSIZE;
>>>>>>>>>>>>>
>>>>>>>>>>>>> Making call to .get_driver_features() doesn't make sense when
>>>>>>>>>>>>> feature negotiation isn't complete. Neither should present
>>>>>>>>>>>>> negotiated_features to userspace before negotiation is done.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Similarly, max_vqp through vdpa_dev_net_mq_config_fill()
>>>>>>>>>>>>> probably should not show before negotiation is done - it
>>>>>>>>>>>>> depends on driver features negotiated.
>>>>>>>>>>>> I have another patch in this series introduces device_features
>>>>>>>>>>>> and will report device_features to the userspace even features
>>>>>>>>>>>> negotiation not done. Because the spec says we should allow
>>>>>>>>>>>> driver access the config space before FEATURES_OK.
>>>>>>>>>>> The config space can be accessed by guest before features_ok
>>>>>>>>>>> doesn't necessarily mean the value is valid. You may want to
>>>>>>>>>>> double check with Michael for what he quoted earlier:
>>>>>>>>>> that's why I proposed to fix these issues, e.g., if no _F_MAC,
>>>>>>>>>> vDPA kernel should not return a mac to the userspace, there is
>>>>>>>>>> not a default value for mac.
>>>>>>>>> Then please show us the code, as I can only comment based on your
>>>>>>>>> latest (v4) patch and it was not there.. To be honest, I don't
>>>>>>>>> understand the motivation and the use cases you have, is it for
>>>>>>>>> debugging/monitoring or there's really a use case for live
>>>>>>>>> migration? For the former, you can do a direct dump on all config
>>>>>>>>> space fields regardless of endianess and feature negotiation
>>>>>>>>> without having to worry about validity (meaningful to present to
>>>>>>>>> admin user). To me these are conflict asks that is impossible to
>>>>>>>>> mix in exact one command.
>>>>>>>> This bug just has been revealed two days, and you will see the
>>>>>>>> patch soon.
>>>>>>>>
>>>>>>>> There are something to clarify:
>>>>>>>> 1) we need to read the device features, or how can you pick a
>>>>>>>> proper LM destination
>>>>>>
>>>>>> So it's probably not very efficient to use this, the manager layer
>>>>>> should have the knowledge about the compatibility before doing
>>>>>> migration other than try-and-fail.
>>>>>>
>>>>>> And it's the task of the management to gather the nodes whose 
>>>>>> devices
>>>>>> could be live migrated to each other as something like "cluster"
>>>>>> which we've already used in the case of cpuflags.
>>>>>>
>>>>>> 1) during node bootstrap, the capability of each node and devices 
>>>>>> was
>>>>>> reported to management layer
>>>>>> 2) management layer decide the cluster and make sure the migration
>>>>>> can only done among the nodes insides the cluster
>>>>>> 3) before migration, the vDPA needs to be provisioned on the 
>>>>>> destination
>>>>>>
>>>>>>
>>>>>>>> 2) vdpa dev config show can show both device features and driver
>>>>>>>> features, there just need a patch for iproute2
>>>>>>>> 3) To process information like MQ, we don't just dump the config
>>>>>>>> space, MST has explained before
>>>>>>> So, it's for live migration... Then why not export those config
>>>>>>> parameters specified for vdpa creation (as well as device feature
>>>>>>> bits) to the output of "vdpa dev show" command? That's where device
>>>>>>> side config lives and is static across vdpa's life cycle. "vdpa dev
>>>>>>> config show" is mostly for dynamic driver side config, and the
>>>>>>> validity is subject to feature negotiation. I suppose this should
>>>>>>> suit your need of LM, e.g.
>>>>>>
>>>>>> I think so.
>>>>>>
>>>>>>
>>>>>>> $ vdpa dev add name vdpa1 mgmtdev pci/0000:41:04.2 max_vqp 7 mtu 
>>>>>>> 2000
>>>>>>> $ vdpa dev show vdpa1
>>>>>>> vdpa1: type network mgmtdev pci/0000:41:04.2 vendor_id 5555 max_vqs
>>>>>>> 15 max_vq_size 256
>>>>>>>    max_vqp 7 mtu 2000
>>>>>>>    dev_features CSUM GUEST_CSUM MTU HOST_TSO4 HOST_TSO6 STATUS
>>>>>>> CTRL_VQ MQ CTRL_MAC_ADDR VERSION_1 RING_PACKED
>>>>>>
>>>>>> Note that the mgmt should know this destination have those
>>>>>> capability/features before the provisioning.
>>>>> Yes, mgmt software should have to check the above from source.
>>>> On destination mgmt software can run below to check vdpa mgmtdev's
>>>> capability/features:
>>>>
>>>> $ vdpa mgmtdev show pci/0000:41:04.3
>>>> pci/0000:41:04.3:
>>>>     supported_classes net
>>>>     max_supported_vqs 257
>>>>     dev_features CSUM GUEST_CSUM MTU HOST_TSO4 HOST_TSO6 STATUS 
>>>> CTRL_VQ
>>>> MQ CTRL_MAC_ADDR VERSION_1 RING_PACKED
>>> Right and this is probably better to be done at node bootstrapping for
>>> the management to know about the cluster.
>> Exactly. That's what mgmt software is supposed to do typically.
> I think this could apply to both mgmt devices and vDPA devices:
> 1)mgmt device, see whether the mgmt device is capable to create a vDPA 
> device with a certain feature bits, this is for LM
> 2)vDPA device, report the device features, it is for normal operation
Can you elaborate on the use case "for normal operation"? Then it has
nothing to do with LM for sure, correct?

Note that for the LM case, just as Jason indicated, it's not even
*required* for the mgmt software to gather the device features through
"vdpa dev show" on the *live* source host right before live migration
starts. Depending on how it is implemented, the mgmt software can
collect device capability at bootstrap time, or save the vdpa device
capability/config in a persistent store ahead of time, say before any
VM is launched. Then, with all such info collected for each cluster
node, the mgmt software can work out the live migration compatibility
between nodes in its own way. I'm not sure in which case you would need
to check the device features, but in case you need it, it'd better live
in "vdpa dev show" than in "vdpa dev config show".
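
To illustrate the kind of compatibility check the mgmt software could
run on the info it has collected, here is a minimal userspace sketch. It
assumes the dev_features and max_vqs values were already gathered from
the (proposed) "vdpa dev show" / "vdpa mgmtdev show" output. The struct
lm_node_caps and the helper lm_dest_can_host() are hypothetical names
made up for this example, not part of any existing tool; only the
feature bit numbers come from the virtio spec.

```c
#include <stdbool.h>
#include <stdint.h>

/* Feature bit numbers as defined by the virtio specification. */
#define VIRTIO_NET_F_CSUM           0
#define VIRTIO_NET_F_GUEST_CSUM     1
#define VIRTIO_NET_F_MTU            3
#define VIRTIO_NET_F_HOST_TSO4      11
#define VIRTIO_NET_F_HOST_TSO6      12
#define VIRTIO_NET_F_STATUS         16
#define VIRTIO_NET_F_CTRL_VQ        17
#define VIRTIO_NET_F_MQ             22
#define VIRTIO_NET_F_CTRL_MAC_ADDR  23
#define VIRTIO_F_VERSION_1          32
#define VIRTIO_F_RING_PACKED        34

#define FEATURE_BIT(n) (1ULL << (n))

/*
 * Hypothetical per-node record kept by the mgmt software, e.g. filled in
 * at node bootstrap and saved in a persistent store.
 */
struct lm_node_caps {
	uint64_t dev_features;	/* device feature bits reported for the node */
	uint32_t max_vqs;	/* max number of virtqueues supported */
};

/*
 * Hypothetical helper: the destination can host the source device only if
 * every feature bit offered on the source is also available on the
 * destination, and the destination supports at least as many virtqueues.
 */
static bool lm_dest_can_host(const struct lm_node_caps *src,
			     const struct lm_node_caps *dst)
{
	bool features_ok = (src->dev_features & ~dst->dev_features) == 0;

	return features_ok && src->max_vqs <= dst->max_vqs;
}
```

With the numbers from the example above, a source device with max_vqs 15
fits on a mgmtdev with max_supported_vqs 257 as long as the source
feature bits are a subset of what the mgmtdev offers. The point is that
this needs only static, device side info, nothing from the negotiated
driver state.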

Thanks,
-Siwei

>
> Thanks,
> Zhu Lingshan
>>
>> Thanks,
>> -Siwei
>>
>>>
>>> Thanks
>>>
>>>>>>
>>>>>>> For it to work, you'd want to pass "struct vdpa_dev_set_config" to
>>>>>>> _vdpa_register_device() during registration, and get it saved there
>>>>>>> in "struct vdpa_device". Then in vdpa_dev_fill() show each field
>>>>>>> conditionally subject to "struct vdpa_dev_set_config.mask".
>>>>>>>
>>>>>>> Thanks,
>>>>>>> -Siwei
>>>>>>
>>>>>> Thanks
>>>>>>
>>>>>>
>>>>>>>> Thanks
>>>>>>>> Zhu Lingshan
>>>>>>>>>>>> Nope:
>>>>>>>>>>>>
>>>>>>>>>>>> 2.5.1  Driver Requirements: Device Configuration Space
>>>>>>>>>>>>
>>>>>>>>>>>> ...
>>>>>>>>>>>>
>>>>>>>>>>>> For optional configuration space fields, the driver MUST check
>>>>>>>>>>>> that the corresponding feature is offered
>>>>>>>>>>>> before accessing that part of the configuration space.
>>>>>>>>>>> and how many driver bugs taking wrong assumption of the 
>>>>>>>>>>> validity
>>>>>>>>>>> of config space field without features_ok. I am not sure what
>>>>>>>>>>> use case you want to expose config resister values for before
>>>>>>>>>>> features_ok, if it's mostly for live migration I guess it's
>>>>>>>>>>> probably heading a wrong direction.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Last but not the least, this "vdpa dev config" command was 
>>>>>>>>>>>>> not
>>>>>>>>>>>>> designed to display the real config space register values in
>>>>>>>>>>>>> the first place. Quoting the vdpa-dev(8) man page:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> vdpa dev config show - Show configuration of specific device
>>>>>>>>>>>>>> or all devices.
>>>>>>>>>>>>>> DEV - specifies the vdpa device to show its 
>>>>>>>>>>>>>> configuration. If
>>>>>>>>>>>>>> this argument is omitted all devices configuration is 
>>>>>>>>>>>>>> listed.
>>>>>>>>>>>>> It doesn't say anything about configuration space or register
>>>>>>>>>>>>> values in config space. As long as it can convey the config
>>>>>>>>>>>>> attribute when instantiating vDPA device instance, and more
>>>>>>>>>>>>> importantly, the config can be easily imported from or
>>>>>>>>>>>>> exported to userspace tools when trying to reconstruct vdpa
>>>>>>>>>>>>> instance intact on destination host for live migration, IMHO
>>>>>>>>>>>>> in my personal interpretation it doesn't matter what the
>>>>>>>>>>>>> config space may present. It may be worth while adding a new
>>>>>>>>>>>>> debug command to expose the real register value, but that's
>>>>>>>>>>>>> another story.
>>>>>>>>>>>> I am not sure getting your points. vDPA now reports device
>>>>>>>>>>>> feature bits(device_features) and negotiated feature
>>>>>>>>>>>> bits(driver_features), and yes, the drivers features can be a
>>>>>>>>>>>> subset of the device features; and the vDPA device features 
>>>>>>>>>>>> can
>>>>>>>>>>>> be a subset of the management device features.
>>>>>>>>>>> What I said is after unblocking the conditional check, you'd
>>>>>>>>>>> have to handle the case for each of the vdpa attribute when
>>>>>>>>>>> feature negotiation is not yet done: basically the register
>>>>>>>>>>> values you got from config space via the
>>>>>>>>>>> vdpa_get_config_unlocked() call is not considered to be valid
>>>>>>>>>>> before features_ok (per-spec). Although in some case you may 
>>>>>>>>>>> get
>>>>>>>>>>> sane value, such behavior is generally undefined. If you desire
>>>>>>>>>>> to show just the device_features alone without any config space
>>>>>>>>>>> field, which the device had advertised *before feature
>>>>>>>>>>> negotiation is complete*, that'll be fine. But looks to me this
>>>>>>>>>>> is not how patch has been implemented. Probably need some more
>>>>>>>>>>> work?
>>>>>>>>>> They are driver_features(negotiated) and the
>>>>>>>>>> device_features(which comes with the device), and the config
>>>>>>>>>> space fields that depend on them. In this series, we report both
>>>>>>>>>> to the userspace.
>>>>>>>>> I fail to understand what you want to present from your
>>>>>>>>> description. May be worth showing some example outputs that at
>>>>>>>>> least include the following cases: 1) when device offers features
>>>>>>>>> but not yet acknowledge by guest 2) when guest acknowledged
>>>>>>>>> features and device is yet to accept 3) after guest feature
>>>>>>>>> negotiation is completed (agreed upon between guest and device).
>>>>>>>> Only two feature sets: 1) what the device has. (2) what is 
>>>>>>>> negotiated
>>>>>>>>> Thanks,
>>>>>>>>> -Siwei
>>>>>>>>>>> Regards,
>>>>>>>>>>> -Siwei
>>>>>>>>>>>
>>>>>>>>>>>>> Having said, please consider to drop the Fixes tag, as 
>>>>>>>>>>>>> appears
>>>>>>>>>>>>> to me you're proposing a new feature rather than fixing a 
>>>>>>>>>>>>> real
>>>>>>>>>>>>> issue.
>>>>>>>>>>>> it's a new feature to report the device feature bits than only
>>>>>>>>>>>> negotiated features, however this patch is a must, or it will
>>>>>>>>>>>> block the device feature bits reporting. but I agree, the fix
>>>>>>>>>>>> tag is not a must.
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> -Siwei
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 7/1/2022 3:12 PM, Parav Pandit via Virtualization wrote:
>>>>>>>>>>>>>>> From: Zhu Lingshan<lingshan.zhu@intel.com>
>>>>>>>>>>>>>>> Sent: Friday, July 1, 2022 9:28 AM
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Users may want to query the config space of a vDPA device,
>>>>>>>>>>>>>>> to choose a
>>>>>>>>>>>>>>> appropriate one for a certain guest. This means the users
>>>>>>>>>>>>>>> need to read the
>>>>>>>>>>>>>>> config space before FEATURES_OK, and the existence of 
>>>>>>>>>>>>>>> config
>>>>>>>>>>>>>>> space
>>>>>>>>>>>>>>> contents does not depend on FEATURES_OK.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The spec says:
>>>>>>>>>>>>>>> The device MUST allow reading of any device-specific
>>>>>>>>>>>>>>> configuration field
>>>>>>>>>>>>>>> before FEATURES_OK is set by the driver. This includes
>>>>>>>>>>>>>>> fields which are
>>>>>>>>>>>>>>> conditional on feature bits, as long as those feature bits
>>>>>>>>>>>>>>> are offered by the
>>>>>>>>>>>>>>> device.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Fixes: 30ef7a8ac8a07 (vdpa: Read device configuration only
>>>>>>>>>>>>>>> if FEATURES_OK)
>>>>>>>>>>>>>> Fix is fine, but fixes tag needs correction described below.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Above commit id is 13 letters should be 12.
>>>>>>>>>>>>>> And
>>>>>>>>>>>>>> It should be in format
>>>>>>>>>>>>>> Fixes: 30ef7a8ac8a0 ("vdpa: Read device configuration 
>>>>>>>>>>>>>> only if
>>>>>>>>>>>>>> FEATURES_OK")
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Please use checkpatch.pl script before posting the 
>>>>>>>>>>>>>> patches to
>>>>>>>>>>>>>> catch these errors.
>>>>>>>>>>>>>> There is a bot that looks at the fixes tag and identifies 
>>>>>>>>>>>>>> the
>>>>>>>>>>>>>> right kernel version to apply this fix.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Signed-off-by: Zhu Lingshan<lingshan.zhu@intel.com>
>>>>>>>>>>>>>>> ---
>>>>>>>>>>>>>>>    drivers/vdpa/vdpa.c | 8 --------
>>>>>>>>>>>>>>>    1 file changed, 8 deletions(-)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c 
>>>>>>>>>>>>>>> index
>>>>>>>>>>>>>>> 9b0e39b2f022..d76b22b2f7ae 100644
>>>>>>>>>>>>>>> --- a/drivers/vdpa/vdpa.c
>>>>>>>>>>>>>>> +++ b/drivers/vdpa/vdpa.c
>>>>>>>>>>>>>>> @@ -851,17 +851,9 @@ vdpa_dev_config_fill(struct 
>>>>>>>>>>>>>>> vdpa_device
>>>>>>>>>>>>>>> *vdev,
>>>>>>>>>>>>>>> struct sk_buff *msg, u32 portid,  {
>>>>>>>>>>>>>>>        u32 device_id;
>>>>>>>>>>>>>>>        void *hdr;
>>>>>>>>>>>>>>> -    u8 status;
>>>>>>>>>>>>>>>        int err;
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> down_read(&vdev->cf_lock);
>>>>>>>>>>>>>>> -    status = vdev->config->get_status(vdev);
>>>>>>>>>>>>>>> -    if (!(status & VIRTIO_CONFIG_S_FEATURES_OK)) {
>>>>>>>>>>>>>>> -        NL_SET_ERR_MSG_MOD(extack, "Features 
>>>>>>>>>>>>>>> negotiation not
>>>>>>>>>>>>>>> completed");
>>>>>>>>>>>>>>> -        err = -EAGAIN;
>>>>>>>>>>>>>>> -        goto out;
>>>>>>>>>>>>>>> -    }
>>>>>>>>>>>>>>> -
>>>>>>>>>>>>>>>        hdr = genlmsg_put(msg, portid, seq, &vdpa_nl_family,
>>>>>>>>>>>>>>> flags,
>>>>>>>>>>>>>>> VDPA_CMD_DEV_CONFIG_GET);
>>>>>>>>>>>>>>>        if (!hdr) {
>>>>>>>>>>>>>>> -- 
>>>>>>>>>>>>>>> 2.31.1
>>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>>> Virtualization mailing list
>>>>>>>>>>>>>> Virtualization@lists.linux-foundation.org
>>>>>>>>>>>>>> https://urldefense.com/v3/__https://lists.linuxfoundation.org/mailman/listinfo/virtualization__;!!ACWV5N9M2RV99hQ!NzOv5Ew_Z2CP-zHyD7RsUoStLZ54KpB21QyuZ8L63YVPLEGDEwvcOSDlIGxQPHY-DMkOa9sKKZdBSaNknMU$ 
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>
>


^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [PATCH V3 4/6] vDPA: !FEATURES_OK should not block querying device config space
  2022-08-03 23:09                               ` Si-Wei Liu
@ 2022-08-04  1:41                                 ` Zhu, Lingshan
  2022-08-04  1:41                                 ` Zhu, Lingshan
  1 sibling, 0 replies; 113+ messages in thread
From: Zhu, Lingshan @ 2022-08-04  1:41 UTC (permalink / raw)
  To: Si-Wei Liu, Jason Wang
  Cc: Parav Pandit, mst, Eli Cohen, netdev, xieyongji, gautam.dawar,
	virtualization



On 8/4/2022 7:09 AM, Si-Wei Liu wrote:
>
>
> On 8/2/2022 7:30 PM, Zhu, Lingshan wrote:
>>
>>
>> On 8/3/2022 9:26 AM, Si-Wei Liu wrote:
>>>
>>>
>>> On 8/1/2022 11:33 PM, Jason Wang wrote:
>>>> On Tue, Aug 2, 2022 at 6:58 AM Si-Wei Liu <si-wei.liu@oracle.com> 
>>>> wrote:
>>>>>
>>>>>
>>>>> On 8/1/2022 3:53 PM, Si-Wei Liu wrote:
>>>>>>
>>>>>> On 7/31/2022 9:44 PM, Jason Wang wrote:
>>>>>>> On 2022/7/30 04:55, Si-Wei Liu wrote:
>>>>>>>>
>>>>>>>> On 7/28/2022 7:04 PM, Zhu, Lingshan wrote:
>>>>>>>>>
>>>>>>>>> On 7/29/2022 5:48 AM, Si-Wei Liu wrote:
>>>>>>>>>>
>>>>>>>>>> On 7/27/2022 7:43 PM, Zhu, Lingshan wrote:
>>>>>>>>>>>
>>>>>>>>>>> On 7/28/2022 8:56 AM, Si-Wei Liu wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> On 7/27/2022 4:47 AM, Zhu, Lingshan wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 7/27/2022 5:43 PM, Si-Wei Liu wrote:
>>>>>>>>>>>>>> Sorry to chime in late in the game. For some reason I 
>>>>>>>>>>>>>> couldn't
>>>>>>>>>>>>>> get to most emails for this discussion (I only subscribed to
>>>>>>>>>>>>>> the virtualization list), while I was taking off amongst the
>>>>>>>>>>>>>> past few weeks.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> It looks to me this patch is incomplete. Noted down the 
>>>>>>>>>>>>>> way in
>>>>>>>>>>>>>> vdpa_dev_net_config_fill(), we have the following:
>>>>>>>>>>>>>>           features = 
>>>>>>>>>>>>>> vdev->config->get_driver_features(vdev);
>>>>>>>>>>>>>>           if (nla_put_u64_64bit(msg,
>>>>>>>>>>>>>> VDPA_ATTR_DEV_NEGOTIATED_FEATURES, features,
>>>>>>>>>>>>>> VDPA_ATTR_PAD))
>>>>>>>>>>>>>>                   return -EMSGSIZE;
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Making call to .get_driver_features() doesn't make sense 
>>>>>>>>>>>>>> when
>>>>>>>>>>>>>> feature negotiation isn't complete. Neither should present
>>>>>>>>>>>>>> negotiated_features to userspace before negotiation is done.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Similarly, max_vqp through vdpa_dev_net_mq_config_fill()
>>>>>>>>>>>>>> probably should not show before negotiation is done - it
>>>>>>>>>>>>>> depends on driver features negotiated.
>>>>>>>>>>>>> I have another patch in this series introduces 
>>>>>>>>>>>>> device_features
>>>>>>>>>>>>> and will report device_features to the userspace even 
>>>>>>>>>>>>> features
>>>>>>>>>>>>> negotiation not done. Because the spec says we should allow
>>>>>>>>>>>>> driver access the config space before FEATURES_OK.
>>>>>>>>>>>> The config space can be accessed by guest before features_ok
>>>>>>>>>>>> doesn't necessarily mean the value is valid. You may want to
>>>>>>>>>>>> double check with Michael for what he quoted earlier:
>>>>>>>>>>> that's why I proposed to fix these issues, e.g., if no _F_MAC,
>>>>>>>>>>> vDPA kernel should not return a mac to the userspace, there is
>>>>>>>>>>> not a default value for mac.
>>>>>>>>>> Then please show us the code, as I can only comment based on 
>>>>>>>>>> your
>>>>>>>>>> latest (v4) patch and it was not there.. To be honest, I don't
>>>>>>>>>> understand the motivation and the use cases you have, is it for
>>>>>>>>>> debugging/monitoring or there's really a use case for live
>>>>>>>>>> migration? For the former, you can do a direct dump on all 
>>>>>>>>>> config
>>>>>>>>>> space fields regardless of endianess and feature negotiation
>>>>>>>>>> without having to worry about validity (meaningful to present to
>>>>>>>>>> admin user). To me these are conflict asks that is impossible to
>>>>>>>>>> mix in exact one command.
>>>>>>>>> This bug just has been revealed two days, and you will see the
>>>>>>>>> patch soon.
>>>>>>>>>
>>>>>>>>> There are something to clarify:
>>>>>>>>> 1) we need to read the device features, or how can you pick a
>>>>>>>>> proper LM destination
>>>>>>>
>>>>>>> So it's probably not very efficient to use this, the manager layer
>>>>>>> should have the knowledge about the compatibility before doing
>>>>>>> migration other than try-and-fail.
>>>>>>>
>>>>>>> And it's the task of the management to gather the nodes whose 
>>>>>>> devices
>>>>>>> could be live migrated to each other as something like "cluster"
>>>>>>> which we've already used in the case of cpuflags.
>>>>>>>
>>>>>>> 1) during node bootstrap, the capability of each node and 
>>>>>>> devices was
>>>>>>> reported to management layer
>>>>>>> 2) management layer decide the cluster and make sure the migration
>>>>>>> can only done among the nodes insides the cluster
>>>>>>> 3) before migration, the vDPA needs to be provisioned on the 
>>>>>>> destination
>>>>>>>
>>>>>>>
>>>>>>>>> 2) vdpa dev config show can show both device features and driver
>>>>>>>>> features, there just need a patch for iproute2
>>>>>>>>> 3) To process information like MQ, we don't just dump the config
>>>>>>>>> space, MST has explained before
>>>>>>>> So, it's for live migration... Then why not export those config
>>>>>>>> parameters specified for vdpa creation (as well as device feature
>>>>>>>> bits) to the output of "vdpa dev show" command? That's where 
>>>>>>>> device
>>>>>>>> side config lives and is static across vdpa's life cycle. "vdpa 
>>>>>>>> dev
>>>>>>>> config show" is mostly for dynamic driver side config, and the
>>>>>>>> validity is subject to feature negotiation. I suppose this should
>>>>>>>> suit your need of LM, e.g.
>>>>>>>
>>>>>>> I think so.
>>>>>>>
>>>>>>>
>>>>>>>> $ vdpa dev add name vdpa1 mgmtdev pci/0000:41:04.2 max_vqp 7 
>>>>>>>> mtu 2000
>>>>>>>> $ vdpa dev show vdpa1
>>>>>>>> vdpa1: type network mgmtdev pci/0000:41:04.2 vendor_id 5555 
>>>>>>>> max_vqs
>>>>>>>> 15 max_vq_size 256
>>>>>>>>    max_vqp 7 mtu 2000
>>>>>>>>    dev_features CSUM GUEST_CSUM MTU HOST_TSO4 HOST_TSO6 STATUS
>>>>>>>> CTRL_VQ MQ CTRL_MAC_ADDR VERSION_1 RING_PACKED
>>>>>>>
>>>>>>> Note that the mgmt should know this destination have those
>>>>>>> capability/features before the provisioning.
>>>>>> Yes, mgmt software should have to check the above from source.
>>>>> On destination mgmt software can run below to check vdpa mgmtdev's
>>>>> capability/features:
>>>>>
>>>>> $ vdpa mgmtdev show pci/0000:41:04.3
>>>>> pci/0000:41:04.3:
>>>>>     supported_classes net
>>>>>     max_supported_vqs 257
>>>>>     dev_features CSUM GUEST_CSUM MTU HOST_TSO4 HOST_TSO6 STATUS 
>>>>> CTRL_VQ
>>>>> MQ CTRL_MAC_ADDR VERSION_1 RING_PACKED
>>>> Right and this is probably better to be done at node bootstrapping for
>>>> the management to know about the cluster.
>>> Exactly. That's what mgmt software is supposed to do typically.
>> I think this could apply to both mgmt devices and vDPA devices:
>> 1)mgmt device, see whether the mgmt device is capable to create a 
>> vDPA device with a certain feature bits, this is for LM
>> 2)vDPA device, report the device features, it is for normal operation
> Can you elaborate the use case "for normal operations"? Then it has 
> nothing to do with LM for sure, correct?
For example, when you just want to check the features to pick a proper device.
>
> Note that for the LM case, just as Jason indicated, it's not even
> *required* for the mgmt software to gather the device features through
> "vdpa dev show" on the source host *live*, right before live migration
> is started. Depending on how it is implemented, the mgmt software can
> collect device capability at bootstrap time, or may well save the vdpa
> device capability/config in a persistent store ahead of time, say
> before any VM is launched. Then, with all such info collected for each
> cluster node, the mgmt software can infer and sort out the live
> migration compatibility between nodes in its own way. I'm not sure in
> which case you would need to check the device features, but in case
> you need it, it'd better live in "vdpa dev show" than in "vdpa dev
> config show".
it is not only for LM
>
> Thanks,
> -Siwei
>
>>
>> Thanks,
>> Zhu Lingshan
>>>
>>> Thanks,
>>> -Siwei
>>>
>>>>
>>>> Thanks
>>>>
>>>>>>>
>>>>>>>> For it to work, you'd want to pass "struct vdpa_dev_set_config" to
>>>>>>>> _vdpa_register_device() during registration, and get it saved 
>>>>>>>> there
>>>>>>>> in "struct vdpa_device". Then in vdpa_dev_fill() show each field
>>>>>>>> conditionally subject to "struct vdpa_dev_set_config.mask".
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> -Siwei
>>>>>>>
>>>>>>> Thanks
>>>>>>>
>>>>>>>
>>>>>>>>> Thanks
>>>>>>>>> Zhu Lingshan
>>>>>>>>>>>>> Nope:
>>>>>>>>>>>>>
>>>>>>>>>>>>> 2.5.1  Driver Requirements: Device Configuration Space
>>>>>>>>>>>>>
>>>>>>>>>>>>> ...
>>>>>>>>>>>>>
>>>>>>>>>>>>> For optional configuration space fields, the driver MUST 
>>>>>>>>>>>>> check
>>>>>>>>>>>>> that the corresponding feature is offered
>>>>>>>>>>>>> before accessing that part of the configuration space.
>>>>>>>>>>>> and how many driver bugs come from wrongly assuming the validity
>>>>>>>>>>>> of a config space field without features_ok? I am not sure what
>>>>>>>>>>>> use case you want to expose config register values for before
>>>>>>>>>>>> features_ok; if it's mostly for live migration, I guess it's
>>>>>>>>>>>> probably heading in the wrong direction.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Last but not the least, this "vdpa dev config" command 
>>>>>>>>>>>>>> was not
>>>>>>>>>>>>>> designed to display the real config space register values in
>>>>>>>>>>>>>> the first place. Quoting the vdpa-dev(8) man page:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> vdpa dev config show - Show configuration of specific 
>>>>>>>>>>>>>>> device
>>>>>>>>>>>>>>> or all devices.
>>>>>>>>>>>>>>> DEV - specifies the vdpa device to show its 
>>>>>>>>>>>>>>> configuration. If
>>>>>>>>>>>>>>> this argument is omitted all devices configuration is 
>>>>>>>>>>>>>>> listed.
>>>>>>>>>>>>>> It doesn't say anything about configuration space or 
>>>>>>>>>>>>>> register
>>>>>>>>>>>>>> values in config space. As long as it can convey the config
>>>>>>>>>>>>>> attribute when instantiating vDPA device instance, and more
>>>>>>>>>>>>>> importantly, the config can be easily imported from or
>>>>>>>>>>>>>> exported to userspace tools when trying to reconstruct vdpa
>>>>>>>>>>>>>> instance intact on destination host for live migration, IMHO
>>>>>>>>>>>>>> in my personal interpretation it doesn't matter what the
>>>>>>>>>>>>>> config space may present. It may be worth while adding a new
>>>>>>>>>>>>>> debug command to expose the real register value, but that's
>>>>>>>>>>>>>> another story.
>>>>>>>>>>>>> I am not sure I get your point. vDPA now reports device
>>>>>>>>>>>>> feature bits (device_features) and negotiated feature
>>>>>>>>>>>>> bits (driver_features), and yes, the driver features can be a
>>>>>>>>>>>>> subset of the device features; and the vDPA device features
>>>>>>>>>>>>> can be a subset of the management device features.
>>>>>>>>>>>> What I said is after unblocking the conditional check, you'd
>>>>>>>>>>>> have to handle the case for each of the vdpa attribute when
>>>>>>>>>>>> feature negotiation is not yet done: basically the register
>>>>>>>>>>>> values you got from config space via the
>>>>>>>>>>>> vdpa_get_config_unlocked() call is not considered to be valid
>>>>>>>>>>>> before features_ok (per-spec). Although in some case you 
>>>>>>>>>>>> may get
>>>>>>>>>>>> sane value, such behavior is generally undefined. If you 
>>>>>>>>>>>> desire
>>>>>>>>>>>> to show just the device_features alone without any config 
>>>>>>>>>>>> space
>>>>>>>>>>>> field, which the device had advertised *before feature
>>>>>>>>>>>> negotiation is complete*, that'll be fine. But looks to me 
>>>>>>>>>>>> this
>>>>>>>>>>>> is not how patch has been implemented. Probably need some more
>>>>>>>>>>>> work?
>>>>>>>>>>> They are driver_features(negotiated) and the
>>>>>>>>>>> device_features(which comes with the device), and the config
>>>>>>>>>>> space fields that depend on them. In this series, we report 
>>>>>>>>>>> both
>>>>>>>>>>> to the userspace.
>>>>>>>>>> I fail to understand what you want to present from your
>>>>>>>>>> description. It may be worth showing some example outputs that
>>>>>>>>>> at least include the following cases: 1) when the device offers
>>>>>>>>>> features not yet acknowledged by the guest, 2) when the guest
>>>>>>>>>> has acknowledged features and the device is yet to accept them,
>>>>>>>>>> 3) after feature negotiation is completed (agreed upon between
>>>>>>>>>> guest and device).
>>>>>>>>> Only two feature sets: 1) what the device has; 2) what is
>>>>>>>>> negotiated.
>>>>>>>>>> Thanks,
>>>>>>>>>> -Siwei
>>>>>>>>>>>> Regards,
>>>>>>>>>>>> -Siwei
>>>>>>>>>>>>
>>>>>>>>>>>>>> Having said that, please consider dropping the Fixes tag, as
>>>>>>>>>>>>>> it appears to me you're proposing a new feature rather than
>>>>>>>>>>>>>> fixing a real issue.
>>>>>>>>>>>>> It's a new feature to report the device feature bits rather
>>>>>>>>>>>>> than only the negotiated features; however, this patch is a
>>>>>>>>>>>>> must, or it will block the device feature bits reporting. But
>>>>>>>>>>>>> I agree, the Fixes tag is not a must.
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>> -Siwei
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 7/1/2022 3:12 PM, Parav Pandit via Virtualization wrote:
>>>>>>>>>>>>>>>> From: Zhu Lingshan<lingshan.zhu@intel.com>
>>>>>>>>>>>>>>>> Sent: Friday, July 1, 2022 9:28 AM
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Users may want to query the config space of a vDPA device,
>>>>>>>>>>>>>>>> to choose an appropriate one for a certain guest. This means
>>>>>>>>>>>>>>>> the users need to read the config space before FEATURES_OK,
>>>>>>>>>>>>>>>> and the existence of config space contents does not depend
>>>>>>>>>>>>>>>> on FEATURES_OK.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> The spec says:
>>>>>>>>>>>>>>>> The device MUST allow reading of any device-specific
>>>>>>>>>>>>>>>> configuration field before FEATURES_OK is set by the driver.
>>>>>>>>>>>>>>>> This includes fields which are conditional on feature bits,
>>>>>>>>>>>>>>>> as long as those feature bits are offered by the device.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Fixes: 30ef7a8ac8a07 (vdpa: Read device configuration only
>>>>>>>>>>>>>>>> if FEATURES_OK)
>>>>>>>>>>>>>>> The fix is fine, but the Fixes tag needs the correction
>>>>>>>>>>>>>>> described below.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The above commit id is 13 characters; it should be 12.
>>>>>>>>>>>>>>> And it should be in the format:
>>>>>>>>>>>>>>> Fixes: 30ef7a8ac8a0 ("vdpa: Read device configuration only if FEATURES_OK")
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Please use the checkpatch.pl script before posting the patches
>>>>>>>>>>>>>>> to catch these errors.
>>>>>>>>>>>>>>> There is a bot that looks at the Fixes tag and identifies the
>>>>>>>>>>>>>>> right kernel version to apply this fix.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Signed-off-by: Zhu Lingshan<lingshan.zhu@intel.com>
>>>>>>>>>>>>>>>> ---
>>>>>>>>>>>>>>>>    drivers/vdpa/vdpa.c | 8 --------
>>>>>>>>>>>>>>>>    1 file changed, 8 deletions(-)
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c
>>>>>>>>>>>>>>>> index 9b0e39b2f022..d76b22b2f7ae 100644
>>>>>>>>>>>>>>>> --- a/drivers/vdpa/vdpa.c
>>>>>>>>>>>>>>>> +++ b/drivers/vdpa/vdpa.c
>>>>>>>>>>>>>>>> @@ -851,17 +851,9 @@ vdpa_dev_config_fill(struct vdpa_device *vdev, struct sk_buff *msg, u32 portid,
>>>>>>>>>>>>>>>>  {
>>>>>>>>>>>>>>>>      u32 device_id;
>>>>>>>>>>>>>>>>      void *hdr;
>>>>>>>>>>>>>>>> -    u8 status;
>>>>>>>>>>>>>>>>      int err;
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>      down_read(&vdev->cf_lock);
>>>>>>>>>>>>>>>> -    status = vdev->config->get_status(vdev);
>>>>>>>>>>>>>>>> -    if (!(status & VIRTIO_CONFIG_S_FEATURES_OK)) {
>>>>>>>>>>>>>>>> -        NL_SET_ERR_MSG_MOD(extack, "Features negotiation not completed");
>>>>>>>>>>>>>>>> -        err = -EAGAIN;
>>>>>>>>>>>>>>>> -        goto out;
>>>>>>>>>>>>>>>> -    }
>>>>>>>>>>>>>>>> -
>>>>>>>>>>>>>>>>      hdr = genlmsg_put(msg, portid, seq, &vdpa_nl_family, flags,
>>>>>>>>>>>>>>>>                        VDPA_CMD_DEV_CONFIG_GET);
>>>>>>>>>>>>>>>>      if (!hdr) {
>>>>>>>>>>>>>>>> -- 
>>>>>>>>>>>>>>>> 2.31.1
>>>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>>>> Virtualization mailing list
>>>>>>>>>>>>>>> Virtualization@lists.linux-foundation.org
>>>>>>>>>>>>>>> https://lists.linuxfoundation.org/mailman/listinfo/virtualization
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>
>>
>


^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [PATCH V3 4/6] vDPA: !FEATURES_OK should not block querying device config space
  2022-08-03 23:09                               ` Si-Wei Liu
  2022-08-04  1:41                                 ` Zhu, Lingshan
@ 2022-08-04  1:41                                 ` Zhu, Lingshan
  1 sibling, 0 replies; 113+ messages in thread
From: Zhu, Lingshan @ 2022-08-04  1:41 UTC (permalink / raw)
  To: Si-Wei Liu, Jason Wang
  Cc: Parav Pandit, mst, Eli Cohen, netdev, xieyongji, gautam.dawar,
	virtualization



On 8/4/2022 7:09 AM, Si-Wei Liu wrote:
>
>
> On 8/2/2022 7:30 PM, Zhu, Lingshan wrote:
>>
>>
>> On 8/3/2022 9:26 AM, Si-Wei Liu wrote:
>>>
>>>
>>> On 8/1/2022 11:33 PM, Jason Wang wrote:
>>>> On Tue, Aug 2, 2022 at 6:58 AM Si-Wei Liu <si-wei.liu@oracle.com> wrote:
>>>>>
>>>>>
>>>>> On 8/1/2022 3:53 PM, Si-Wei Liu wrote:
>>>>>>
>>>>>> On 7/31/2022 9:44 PM, Jason Wang wrote:
>>>>>>> On 2022/7/30 04:55, Si-Wei Liu wrote:
>>>>>>>>
>>>>>>>> On 7/28/2022 7:04 PM, Zhu, Lingshan wrote:
>>>>>>>>>
>>>>>>>>> On 7/29/2022 5:48 AM, Si-Wei Liu wrote:
>>>>>>>>>>
>>>>>>>>>> On 7/27/2022 7:43 PM, Zhu, Lingshan wrote:
>>>>>>>>>>>
>>>>>>>>>>> On 7/28/2022 8:56 AM, Si-Wei Liu wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> On 7/27/2022 4:47 AM, Zhu, Lingshan wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 7/27/2022 5:43 PM, Si-Wei Liu wrote:
>>>>>>>>>>>>>> Sorry to chime in late in the game. For some reason I couldn't
>>>>>>>>>>>>>> get most emails for this discussion (I only subscribed to
>>>>>>>>>>>>>> the virtualization list), and I was off for the past few
>>>>>>>>>>>>>> weeks.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> It looks to me like this patch is incomplete. Note that further
>>>>>>>>>>>>>> down in vdpa_dev_net_config_fill(), we have the following:
>>>>>>>>>>>>>>           features = vdev->config->get_driver_features(vdev);
>>>>>>>>>>>>>>           if (nla_put_u64_64bit(msg, VDPA_ATTR_DEV_NEGOTIATED_FEATURES,
>>>>>>>>>>>>>>                                 features, VDPA_ATTR_PAD))
>>>>>>>>>>>>>>                   return -EMSGSIZE;
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Making a call to .get_driver_features() doesn't make sense when
>>>>>>>>>>>>>> feature negotiation isn't complete. Neither should
>>>>>>>>>>>>>> negotiated_features be presented to userspace before
>>>>>>>>>>>>>> negotiation is done.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Similarly, max_vqp through vdpa_dev_net_mq_config_fill()
>>>>>>>>>>>>>> probably should not show before negotiation is done - it
>>>>>>>>>>>>>> depends on driver features negotiated.
>>>>>>>>>>>>> I have another patch in this series that introduces
>>>>>>>>>>>>> device_features and will report device_features to the
>>>>>>>>>>>>> userspace even if features negotiation is not done, because
>>>>>>>>>>>>> the spec says we should allow the driver to access the config
>>>>>>>>>>>>> space before FEATURES_OK.
>>>>>>>>>>>> That the config space can be accessed by the guest before
>>>>>>>>>>>> features_ok doesn't necessarily mean the value is valid. You
>>>>>>>>>>>> may want to double check with Michael for what he quoted earlier:
>>>>>>>>>>> that's why I proposed to fix these issues, e.g., if no _F_MAC,
>>>>>>>>>>> vDPA kernel should not return a mac to the userspace, there is
>>>>>>>>>>> not a default value for mac.
>>>>>>>>>> Then please show us the code, as I can only comment based on
>>>>>>>>>> your latest (v4) patch and it was not there. To be honest, I
>>>>>>>>>> don't understand the motivation and the use cases you have: is
>>>>>>>>>> it for debugging/monitoring, or is there really a use case for
>>>>>>>>>> live migration? For the former, you can do a direct dump of all
>>>>>>>>>> config space fields regardless of endianness and feature
>>>>>>>>>> negotiation, without having to worry about validity (i.e.
>>>>>>>>>> whether they are meaningful to present to an admin user). To me
>>>>>>>>>> these are conflicting asks that are impossible to mix in exactly
>>>>>>>>>> one command.
>>>>>>>>> This bug was revealed just two days ago, and you will see the
>>>>>>>>> patch soon.
>>>>>>>>>
>>>>>>>>> There is something to clarify:
>>>>>>>>> 1) we need to read the device features, or how can you pick a
>>>>>>>>> proper LM destination?
>>>>>>>
>>>>>>> So it's probably not very efficient to use this; the management
>>>>>>> layer should know about the compatibility before doing the
>>>>>>> migration, rather than try-and-fail.
>>>>>>>
>>>>>>> And it's the task of the management to gather the nodes whose
>>>>>>> devices could be live migrated to each other into something like a
>>>>>>> "cluster", which we've already used in the case of cpuflags.
>>>>>>>
>>>>>>> 1) during node bootstrap, the capability of each node and its
>>>>>>> devices is reported to the management layer
>>>>>>> 2) the management layer decides the cluster and makes sure
>>>>>>> migration can only be done among the nodes inside the cluster
>>>>>>> 3) before migration, the vDPA device needs to be provisioned on
>>>>>>> the destination
>>>>>>>
>>>>>>>
>>>>>>>>> 2) vdpa dev config show can show both device features and driver
>>>>>>>>> features; it just needs a patch for iproute2
>>>>>>>>> 3) To process information like MQ, we don't just dump the config
>>>>>>>>> space, as MST has explained before
>>>>>>>> So, it's for live migration... Then why not export those config
>>>>>>>> parameters specified for vdpa creation (as well as device feature
>>>>>>>> bits) to the output of "vdpa dev show" command? That's where 
>>>>>>>> device
>>>>>>>> side config lives and is static across vdpa's life cycle. "vdpa 
>>>>>>>> dev
>>>>>>>> config show" is mostly for dynamic driver side config, and the
>>>>>>>> validity is subject to feature negotiation. I suppose this should
>>>>>>>> suit your need of LM, e.g.
>>>>>>>
>>>>>>> I think so.
>>>>>>>
>>>>>>>
>>>>>>>> $ vdpa dev add name vdpa1 mgmtdev pci/0000:41:04.2 max_vqp 7 mtu 2000
>>>>>>>> $ vdpa dev show vdpa1
>>>>>>>> vdpa1: type network mgmtdev pci/0000:41:04.2 vendor_id 5555 max_vqs 15 max_vq_size 256
>>>>>>>>    max_vqp 7 mtu 2000
>>>>>>>>    dev_features CSUM GUEST_CSUM MTU HOST_TSO4 HOST_TSO6 STATUS CTRL_VQ MQ CTRL_MAC_ADDR VERSION_1 RING_PACKED
>>>>>>>
>>>>>>> Note that the mgmt should know this destination has those
>>>>>>> capabilities/features before the provisioning.
>>>>>> Yes, mgmt software should have to check the above from source.
>>>>> On destination mgmt software can run below to check vdpa mgmtdev's
>>>>> capability/features:
>>>>>
>>>>> $ vdpa mgmtdev show pci/0000:41:04.3
>>>>> pci/0000:41:04.3:
>>>>>     supported_classes net
>>>>>     max_supported_vqs 257
>>>>>     dev_features CSUM GUEST_CSUM MTU HOST_TSO4 HOST_TSO6 STATUS CTRL_VQ MQ CTRL_MAC_ADDR VERSION_1 RING_PACKED
>>>> Right and this is probably better to be done at node bootstrapping for
>>>> the management to know about the cluster.
>>> Exactly. That's what mgmt software is supposed to do typically.
>> I think this could apply to both mgmt devices and vDPA devices:
>> 1) mgmt device: check whether the mgmt device is capable of creating
>> a vDPA device with certain feature bits; this is for LM
>> 2) vDPA device: report the device features; this is for normal operation
> Can you elaborate the use case "for normal operations"? Then it has 
> nothing to do with LM for sure, correct?
For example, when you just want to check the features to pick a proper device.
>
> Note that for the LM case, just as Jason indicated, it's not even
> *required* for the mgmt software to gather the device features through
> "vdpa dev show" on the source host *live*, right before live migration
> is started. Depending on how it is implemented, the mgmt software can
> collect device capability at bootstrap time, or may well save the vdpa
> device capability/config in a persistent store ahead of time, say
> before any VM is launched. Then, with all such info collected for each
> cluster node, the mgmt software can infer and sort out the live
> migration compatibility between nodes in its own way. I'm not sure in
> which case you would need to check the device features, but in case
> you need it, it'd better live in "vdpa dev show" than in "vdpa dev
> config show".
it is not only for LM
>
> Thanks,
> -Siwei
>
>>
>> Thanks,
>> Zhu Lingshan
>>>
>>> Thanks,
>>> -Siwei
>>>
>>>>
>>>> Thanks
>>>>
>>>>>>>
>>>>>>>> For it to work, you'd want to pass "struct vdpa_dev_set_config" to
>>>>>>>> _vdpa_register_device() during registration, and get it saved 
>>>>>>>> there
>>>>>>>> in "struct vdpa_device". Then in vdpa_dev_fill() show each field
>>>>>>>> conditionally subject to "struct vdpa_dev_set_config.mask".
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> -Siwei
>>>>>>>
>>>>>>> Thanks
>>>>>>>
>>>>>>>
>>>>>>>>> Thanks
>>>>>>>>> Zhu Lingshan
>>>>>>>>>>>>> Nope:
>>>>>>>>>>>>>
>>>>>>>>>>>>> 2.5.1  Driver Requirements: Device Configuration Space
>>>>>>>>>>>>>
>>>>>>>>>>>>> ...
>>>>>>>>>>>>>
>>>>>>>>>>>>> For optional configuration space fields, the driver MUST 
>>>>>>>>>>>>> check
>>>>>>>>>>>>> that the corresponding feature is offered
>>>>>>>>>>>>> before accessing that part of the configuration space.
>>>>>>>>>>>> and how many driver bugs come from wrongly assuming the validity
>>>>>>>>>>>> of a config space field without features_ok? I am not sure what
>>>>>>>>>>>> use case you want to expose config register values for before
>>>>>>>>>>>> features_ok; if it's mostly for live migration, I guess it's
>>>>>>>>>>>> probably heading in the wrong direction.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Last but not the least, this "vdpa dev config" command 
>>>>>>>>>>>>>> was not
>>>>>>>>>>>>>> designed to display the real config space register values in
>>>>>>>>>>>>>> the first place. Quoting the vdpa-dev(8) man page:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> vdpa dev config show - Show configuration of specific 
>>>>>>>>>>>>>>> device
>>>>>>>>>>>>>>> or all devices.
>>>>>>>>>>>>>>> DEV - specifies the vdpa device to show its 
>>>>>>>>>>>>>>> configuration. If
>>>>>>>>>>>>>>> this argument is omitted all devices configuration is 
>>>>>>>>>>>>>>> listed.
>>>>>>>>>>>>>> It doesn't say anything about configuration space or 
>>>>>>>>>>>>>> register
>>>>>>>>>>>>>> values in config space. As long as it can convey the config
>>>>>>>>>>>>>> attribute when instantiating vDPA device instance, and more
>>>>>>>>>>>>>> importantly, the config can be easily imported from or
>>>>>>>>>>>>>> exported to userspace tools when trying to reconstruct vdpa
>>>>>>>>>>>>>> instance intact on destination host for live migration, IMHO
>>>>>>>>>>>>>> in my personal interpretation it doesn't matter what the
>>>>>>>>>>>>>> config space may present. It may be worth while adding a new
>>>>>>>>>>>>>> debug command to expose the real register value, but that's
>>>>>>>>>>>>>> another story.
>>>>>>>>>>>>> I am not sure I get your point. vDPA now reports device
>>>>>>>>>>>>> feature bits (device_features) and negotiated feature
>>>>>>>>>>>>> bits (driver_features), and yes, the driver features can be a
>>>>>>>>>>>>> subset of the device features; and the vDPA device features
>>>>>>>>>>>>> can be a subset of the management device features.
>>>>>>>>>>>> What I said is after unblocking the conditional check, you'd
>>>>>>>>>>>> have to handle the case for each of the vdpa attribute when
>>>>>>>>>>>> feature negotiation is not yet done: basically the register
>>>>>>>>>>>> values you got from config space via the
>>>>>>>>>>>> vdpa_get_config_unlocked() call is not considered to be valid
>>>>>>>>>>>> before features_ok (per-spec). Although in some case you 
>>>>>>>>>>>> may get
>>>>>>>>>>>> sane value, such behavior is generally undefined. If you 
>>>>>>>>>>>> desire
>>>>>>>>>>>> to show just the device_features alone without any config 
>>>>>>>>>>>> space
>>>>>>>>>>>> field, which the device had advertised *before feature
>>>>>>>>>>>> negotiation is complete*, that'll be fine. But looks to me 
>>>>>>>>>>>> this
>>>>>>>>>>>> is not how patch has been implemented. Probably need some more
>>>>>>>>>>>> work?
>>>>>>>>>>> They are driver_features(negotiated) and the
>>>>>>>>>>> device_features(which comes with the device), and the config
>>>>>>>>>>> space fields that depend on them. In this series, we report 
>>>>>>>>>>> both
>>>>>>>>>>> to the userspace.
>>>>>>>>>> I fail to understand what you want to present from your
>>>>>>>>>> description. It may be worth showing some example outputs that
>>>>>>>>>> at least include the following cases: 1) when the device offers
>>>>>>>>>> features not yet acknowledged by the guest, 2) when the guest
>>>>>>>>>> has acknowledged features and the device is yet to accept them,
>>>>>>>>>> 3) after feature negotiation is completed (agreed upon between
>>>>>>>>>> guest and device).
>>>>>>>>> Only two feature sets: 1) what the device has; 2) what is
>>>>>>>>> negotiated.
>>>>>>>>>> Thanks,
>>>>>>>>>> -Siwei
>>>>>>>>>>>> Regards,
>>>>>>>>>>>> -Siwei
>>>>>>>>>>>>
>>>>>>>>>>>>>> Having said that, please consider dropping the Fixes tag, as
>>>>>>>>>>>>>> it appears to me you're proposing a new feature rather than
>>>>>>>>>>>>>> fixing a real issue.
>>>>>>>>>>>>> It's a new feature to report the device feature bits rather
>>>>>>>>>>>>> than only the negotiated features; however, this patch is a
>>>>>>>>>>>>> must, or it will block the device feature bits reporting. But
>>>>>>>>>>>>> I agree, the Fixes tag is not a must.
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>> -Siwei
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 7/1/2022 3:12 PM, Parav Pandit via Virtualization wrote:
>>>>>>>>>>>>>>>> From: Zhu Lingshan<lingshan.zhu@intel.com>
>>>>>>>>>>>>>>>> Sent: Friday, July 1, 2022 9:28 AM
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Users may want to query the config space of a vDPA device,
>>>>>>>>>>>>>>>> to choose an appropriate one for a certain guest. This means
>>>>>>>>>>>>>>>> the users need to read the config space before FEATURES_OK,
>>>>>>>>>>>>>>>> and the existence of config space contents does not depend
>>>>>>>>>>>>>>>> on FEATURES_OK.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> The spec says:
>>>>>>>>>>>>>>>> The device MUST allow reading of any device-specific
>>>>>>>>>>>>>>>> configuration field before FEATURES_OK is set by the driver.
>>>>>>>>>>>>>>>> This includes fields which are conditional on feature bits,
>>>>>>>>>>>>>>>> as long as those feature bits are offered by the device.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Fixes: 30ef7a8ac8a07 (vdpa: Read device configuration only
>>>>>>>>>>>>>>>> if FEATURES_OK)
>>>>>>>>>>>>>>> The fix is fine, but the Fixes tag needs the correction
>>>>>>>>>>>>>>> described below.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The above commit id is 13 characters; it should be 12.
>>>>>>>>>>>>>>> And it should be in the format:
>>>>>>>>>>>>>>> Fixes: 30ef7a8ac8a0 ("vdpa: Read device configuration only if FEATURES_OK")
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Please use the checkpatch.pl script before posting the patches
>>>>>>>>>>>>>>> to catch these errors.
>>>>>>>>>>>>>>> There is a bot that looks at the Fixes tag and identifies the
>>>>>>>>>>>>>>> right kernel version to apply this fix.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Signed-off-by: Zhu Lingshan<lingshan.zhu@intel.com>
>>>>>>>>>>>>>>>> ---
>>>>>>>>>>>>>>>>    drivers/vdpa/vdpa.c | 8 --------
>>>>>>>>>>>>>>>>    1 file changed, 8 deletions(-)
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c
>>>>>>>>>>>>>>>> index 9b0e39b2f022..d76b22b2f7ae 100644
>>>>>>>>>>>>>>>> --- a/drivers/vdpa/vdpa.c
>>>>>>>>>>>>>>>> +++ b/drivers/vdpa/vdpa.c
>>>>>>>>>>>>>>>> @@ -851,17 +851,9 @@ vdpa_dev_config_fill(struct vdpa_device *vdev, struct sk_buff *msg, u32 portid,
>>>>>>>>>>>>>>>>  {
>>>>>>>>>>>>>>>>      u32 device_id;
>>>>>>>>>>>>>>>>      void *hdr;
>>>>>>>>>>>>>>>> -    u8 status;
>>>>>>>>>>>>>>>>      int err;
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>      down_read(&vdev->cf_lock);
>>>>>>>>>>>>>>>> -    status = vdev->config->get_status(vdev);
>>>>>>>>>>>>>>>> -    if (!(status & VIRTIO_CONFIG_S_FEATURES_OK)) {
>>>>>>>>>>>>>>>> -        NL_SET_ERR_MSG_MOD(extack, "Features negotiation not completed");
>>>>>>>>>>>>>>>> -        err = -EAGAIN;
>>>>>>>>>>>>>>>> -        goto out;
>>>>>>>>>>>>>>>> -    }
>>>>>>>>>>>>>>>> -
>>>>>>>>>>>>>>>>      hdr = genlmsg_put(msg, portid, seq, &vdpa_nl_family, flags,
>>>>>>>>>>>>>>>>                        VDPA_CMD_DEV_CONFIG_GET);
>>>>>>>>>>>>>>>>      if (!hdr) {
>>>>>>>>>>>>>>>> -- 
>>>>>>>>>>>>>>>> 2.31.1
>>>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>>>> Virtualization mailing list
>>>>>>>>>>>>>>> Virtualization@lists.linux-foundation.org
>>>>>>>>>>>>>>> https://lists.linuxfoundation.org/mailman/listinfo/virtualization
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>
>>
>
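
For readers following the quoted discussion, here is a minimal user-space
sketch of the reporting rule argued for by the reviewers: negotiated
features and feature-dependent fields are only reported once FEATURES_OK
is set, and the queue-pair count falls back to 1 when VIRTIO_NET_F_MQ was
not negotiated (the behaviour named in the title of patch 5/6). This is
not the code from the series. The struct and the two helper functions are
hypothetical names invented for illustration; only the two constants
mirror the virtio uapi headers, and they are redefined here so the sketch
builds standalone.

```c
#include <stdbool.h>
#include <stdint.h>

/* Same values as the virtio uapi headers; redefined to stay standalone. */
#define VIRTIO_CONFIG_S_FEATURES_OK 8
#define VIRTIO_NET_F_MQ             22

/* Hypothetical snapshot of device state; NOT a kernel structure. */
struct dev_snapshot {
	uint8_t  status;              /* virtio device status byte */
	uint64_t device_features;     /* what the device offers */
	uint64_t driver_features;     /* what was negotiated */
	uint16_t max_virtqueue_pairs; /* config field, already in CPU order */
};

/* Negotiated features are only meaningful after FEATURES_OK. */
bool report_negotiated_features(const struct dev_snapshot *d, uint64_t *out)
{
	if (!(d->status & VIRTIO_CONFIG_S_FEATURES_OK))
		return false;
	*out = d->driver_features;
	return true;
}

/*
 * max_vqp depends on the negotiated MQ bit: skip it before FEATURES_OK,
 * and answer 1 when MQ was not negotiated.
 */
bool report_max_vqp(const struct dev_snapshot *d, uint16_t *out)
{
	if (!(d->status & VIRTIO_CONFIG_S_FEATURES_OK))
		return false;
	if (d->driver_features & (1ULL << VIRTIO_NET_F_MQ))
		*out = d->max_virtqueue_pairs;
	else
		*out = 1;
	return true;
}
```

device_features is deliberately left ungated in this sketch, since both
sides of the thread agree that the offered feature bits can be shown
before negotiation completes.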


^ permalink raw reply	[flat|nested] 113+ messages in thread

end of thread, other threads:[~2022-08-04  1:41 UTC | newest]

Thread overview: 113+ messages
2022-07-01 13:28 [PATCH V3 0/6] ifcvf/vDPA: support query device config space through netlink Zhu Lingshan
2022-07-01 13:28 ` [PATCH V3 1/6] vDPA/ifcvf: get_config_size should return a value no greater than dev implementation Zhu Lingshan
2022-07-04  4:39   ` Jason Wang
2022-07-08  6:44     ` Zhu, Lingshan
2022-07-13  5:44       ` Michael S. Tsirkin
2022-07-13  7:52         ` Zhu, Lingshan
2022-07-13  5:31   ` Michael S. Tsirkin
2022-07-13  7:48     ` Zhu, Lingshan
2022-07-01 13:28 ` [PATCH V3 2/6] vDPA/ifcvf: support userspace to query features and MQ of a management device Zhu Lingshan
2022-07-04  4:43   ` Jason Wang
2022-07-08  6:54     ` Zhu, Lingshan
2022-07-01 13:28 ` [PATCH V3 3/6] vDPA: allow userspace to query features of a vDPA device Zhu Lingshan
2022-07-01 22:02   ` Parav Pandit
2022-07-04  4:46     ` Jason Wang
2022-07-04 12:53       ` Parav Pandit
2022-07-05  7:59         ` Zhu, Lingshan
2022-07-05 11:56           ` Parav Pandit
2022-07-05 16:56             ` Zhu, Lingshan
2022-07-05 17:01               ` Parav Pandit
2022-07-06  2:25                 ` Zhu, Lingshan
2022-07-06  2:28                   ` Parav Pandit
2022-07-23 11:27                   ` Zhu, Lingshan
2022-07-24 15:23                     ` Parav Pandit
2022-07-27  8:15             ` Si-Wei Liu
2022-07-27 11:38               ` Zhu, Lingshan
2022-07-08  6:16     ` Zhu, Lingshan
2022-07-08 16:13       ` Parav Pandit
2022-07-11  2:18         ` Zhu, Lingshan
2022-07-01 13:28 ` [PATCH V3 4/6] vDPA: !FEATURES_OK should not block querying device config space Zhu Lingshan
2022-07-01 22:12   ` Parav Pandit
2022-07-08  6:22     ` Zhu, Lingshan
2022-07-13  5:23     ` Michael S. Tsirkin
2022-07-13  7:46       ` Zhu, Lingshan
     [not found]     ` <00889067-50ac-d2cd-675f-748f171e5c83@oracle.com>
     [not found]       ` <63242254-ba84-6810-dad8-34f900b97f2f@intel.com>
     [not found]         ` <8002554a-a77c-7b25-8f99-8d68248a741d@oracle.com>
2022-07-28  2:06           ` Jason Wang
2022-07-28  7:08             ` Si-Wei Liu
2022-07-28  7:36               ` Jason Wang
2022-07-28  7:44                 ` Zhu, Lingshan
     [not found]                 ` <2dfff5f3-3100-4a63-6da3-3e3d21ffb364@oracle.com>
2022-07-28 11:28                   ` spec clarification (was Re: [PATCH V3 4/6] vDPA: !FEATURES_OK should not block querying device config space) Michael S. Tsirkin
2022-07-28 11:35               ` [PATCH V3 4/6] vDPA: !FEATURES_OK should not block querying device config space Michael S. Tsirkin
2022-07-28 22:12                 ` Si-Wei Liu
     [not found]           ` <00e2e07e-1a2e-7af8-a060-cc9034e0d33f@intel.com>
     [not found]             ` <b58dba25-3258-d600-ea06-879094639852@oracle.com>
     [not found]               ` <c143e2da-208e-b046-9b8f-1780f75ed3e6@intel.com>
2022-07-29 20:55                 ` Si-Wei Liu
2022-08-01  4:44                   ` Jason Wang
2022-08-01 22:53                     ` Si-Wei Liu
2022-08-01 22:58                       ` Si-Wei Liu
2022-08-02  6:33                         ` Jason Wang
2022-08-03  1:26                           ` Si-Wei Liu
2022-08-03  2:30                             ` Zhu, Lingshan
2022-08-03 23:09                               ` Si-Wei Liu
2022-08-04  1:41                                 ` Zhu, Lingshan
2022-08-04  1:41                                 ` Zhu, Lingshan
2022-07-01 13:28 ` [PATCH V3 5/6] vDPA: answer num of queue pairs = 1 to userspace when VIRTIO_NET_F_MQ == 0 Zhu Lingshan
2022-07-01 22:07   ` Parav Pandit
2022-07-08  6:21     ` Zhu, Lingshan
2022-07-08 16:23       ` Parav Pandit
2022-07-11  2:29         ` Zhu, Lingshan
2022-07-12 16:48           ` Parav Pandit
2022-07-13  3:03             ` Zhu, Lingshan
2022-07-13  3:06               ` Parav Pandit
2022-07-13  3:45                 ` Zhu, Lingshan
2022-07-26 15:56                   ` Parav Pandit
2022-07-26 19:52                     ` Michael S. Tsirkin
2022-07-26 20:49                       ` Parav Pandit
2022-07-27  2:14                     ` Zhu, Lingshan
2022-07-27  2:17                       ` Parav Pandit
2022-07-27  2:53                         ` Zhu, Lingshan
2022-07-27  3:47                           ` Parav Pandit
2022-07-27  4:24                             ` Zhu, Lingshan
2022-07-27  6:01                             ` Michael S. Tsirkin
2022-07-27  6:25                               ` Zhu, Lingshan
2022-07-27  6:56                                 ` Jason Wang
2022-07-27  9:05                                   ` Michael S. Tsirkin
2022-07-27  6:54                               ` Jason Wang
2022-07-27  9:02                                 ` Michael S. Tsirkin
2022-07-27  9:50                                   ` Jason Wang
2022-07-27 15:45                                     ` Michael S. Tsirkin
2022-07-28  1:21                                       ` Jason Wang
2022-07-28  3:46                                         ` Zhu, Lingshan
2022-07-28  5:53                                           ` Jason Wang
2022-07-28  6:02                                             ` Zhu, Lingshan
2022-07-28  6:41                                             ` Michael S. Tsirkin
2022-08-01  4:50                                               ` Jason Wang
2022-07-27  7:50                               ` Si-Wei Liu
2022-07-27  9:01                                 ` Michael S. Tsirkin
2022-07-27 10:09                                   ` Si-Wei Liu
2022-07-27 11:54                                     ` Zhu, Lingshan
2022-07-28  1:41                                       ` Si-Wei Liu
2022-07-28  2:44                                         ` Zhu, Lingshan
2022-07-28 21:54                                           ` Si-Wei Liu
2022-07-29  2:07                                             ` Zhu, Lingshan
2022-07-27 15:48                                     ` Michael S. Tsirkin
2022-07-13  5:26     ` Michael S. Tsirkin
2022-07-13  7:47       ` Zhu, Lingshan
2022-07-26 15:54       ` Parav Pandit
2022-07-26 19:48         ` Michael S. Tsirkin
2022-07-26 20:53           ` Parav Pandit
2022-07-27  1:56             ` Zhu, Lingshan
2022-07-27  2:11             ` Zhu, Lingshan
2022-07-01 13:28 ` [PATCH V3 6/6] vDPA: fix 'cast to restricted le16' warnings in vdpa.c Zhu Lingshan
2022-07-01 22:18   ` Parav Pandit
2022-07-08  6:25     ` Zhu, Lingshan
2022-07-08 16:08       ` Parav Pandit
2022-07-29  8:53   ` Michael S. Tsirkin
2022-07-29  9:07     ` Zhu, Lingshan
2022-07-29  9:17       ` Michael S. Tsirkin
2022-07-29  9:20         ` Zhu, Lingshan
2022-07-29  9:23           ` Michael S. Tsirkin
2022-07-29  9:35             ` Zhu, Lingshan
2022-07-29  9:39               ` Michael S. Tsirkin
2022-07-29 10:01                 ` Zhu, Lingshan
2022-07-29 10:16                   ` Michael S. Tsirkin
2022-07-29 10:18                     ` Zhu, Lingshan
2022-08-01  4:33                 ` Jason Wang
2022-08-01  6:25                   ` Michael S. Tsirkin
