All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH V5 0/6] ifcvf/vDPA: support query device config space through netlink
@ 2022-08-12 10:44 Zhu Lingshan
  2022-08-12 10:44 ` [PATCH V5 1/6] vDPA/ifcvf: get_config_size should return a value no greater than dev implementation Zhu Lingshan
                   ` (6 more replies)
  0 siblings, 7 replies; 30+ messages in thread
From: Zhu Lingshan @ 2022-08-12 10:44 UTC (permalink / raw)
  To: jasowang, mst
  Cc: virtualization, netdev, kvm, parav, xieyongji, gautam.dawar,
	Zhu Lingshan

This series allows userspace to query device config space of vDPA
devices and the management devices through netlink,
to get multi-queue, feature bits and etc.

This series has introduced a new netlink attr
VDPA_ATTR_VDPA_DEV_SUPPORTED_FEATURES, this should be used to query
features of vDPA  devices than the management device.

Please help review.

Thanks!
Zhu Lingshan

Changes rom V4:
(1) Read MAC, MTU, MQ conditionally (Michael)
(2) If VIRTIO_NET_F_MAC not set, don't report MAC to userspace
(3) If VIRTIO_NET_F_MTU not set, report 1500 to userspace
(4) Add comments to the new attr
VDPA_ATTR_VDPA_DEV_SUPPORTED_FEATURES(Michael)
(5) Add comments for reporting the device status as LE(Michael)

Changes from V3:
(1)drop the fixes tags(Parva)
(2)better commit log for patch 1/6(Michael)
(3)assign num_queues to max_supported_vqs than max_vq_pairs(Jason)
(4)initialize virtio pci capabilities in the probe() function.

Changes from V2:
Add fixes tags(Parva)

Changes from V1:
(1) Use __virito16_to_cpu(true, xxx) for the le16 casting(Jason)
(2) Add a comment in ifcvf_get_config_size(), to explain
why we should return the minimum value of
sizeof(struct virtio_net_config) and the onboard
cap size(Jason)
(3) Introduced a new attr VDPA_ATTR_VDPA_DEV_SUPPORTED_FEATURES
(4) Show the changes of iproute2 output before and after 5/6 patch(Jason)
(5) Fix cast warning in vdpa_fill_stats_rec() 

Zhu Lingshan (6):
  vDPA/ifcvf: get_config_size should return a value no greater than dev
    implementation
  vDPA/ifcvf: support userspace to query features and MQ of a management
    device
  vDPA: allow userspace to query features of a vDPA device
  vDPA: !FEATURES_OK should not block querying device config space
  vDPA: Conditionally read fields in virtio-net dev config space
  fix 'cast to restricted le16' warnings in vdpa.c

 drivers/vdpa/ifcvf/ifcvf_base.c |  13 ++-
 drivers/vdpa/ifcvf/ifcvf_base.h |   2 +
 drivers/vdpa/ifcvf/ifcvf_main.c | 142 +++++++++++++++++---------------
 drivers/vdpa/vdpa.c             |  82 ++++++++++++------
 include/uapi/linux/vdpa.h       |   3 +
 5 files changed, 149 insertions(+), 93 deletions(-)

-- 
2.31.1


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH V5 1/6] vDPA/ifcvf: get_config_size should return a value no greater than dev implementation
  2022-08-12 10:44 [PATCH V5 0/6] ifcvf/vDPA: support query device config space through netlink Zhu Lingshan
@ 2022-08-12 10:44 ` Zhu Lingshan
  2022-08-12 10:44 ` [PATCH V5 2/6] vDPA/ifcvf: support userspace to query features and MQ of a management device Zhu Lingshan
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 30+ messages in thread
From: Zhu Lingshan @ 2022-08-12 10:44 UTC (permalink / raw)
  To: jasowang, mst
  Cc: virtualization, netdev, kvm, parav, xieyongji, gautam.dawar,
	Zhu Lingshan

Drivers must not access a BAR outside the capability length,
and for a virtio device, ifcvf driver should not report any non-standard
capability contents to the upper layers.

Function ifcvf_get_config_size() is introduced here to return a safe value
of the device config capability size.

Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
---
 drivers/vdpa/ifcvf/ifcvf_base.c | 13 +++++++++++--
 drivers/vdpa/ifcvf/ifcvf_base.h |  2 ++
 2 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/drivers/vdpa/ifcvf/ifcvf_base.c b/drivers/vdpa/ifcvf/ifcvf_base.c
index 48c4dadb0c7c..85611be5ccb4 100644
--- a/drivers/vdpa/ifcvf/ifcvf_base.c
+++ b/drivers/vdpa/ifcvf/ifcvf_base.c
@@ -128,6 +128,7 @@ int ifcvf_init_hw(struct ifcvf_hw *hw, struct pci_dev *pdev)
 			break;
 		case VIRTIO_PCI_CAP_DEVICE_CFG:
 			hw->dev_cfg = get_cap_addr(hw, &cap);
+			hw->cap_dev_config_size = le32_to_cpu(cap.length);
 			IFCVF_DBG(pdev, "hw->dev_cfg = %p\n", hw->dev_cfg);
 			break;
 		}
@@ -233,15 +234,23 @@ int ifcvf_verify_min_features(struct ifcvf_hw *hw, u64 features)
 u32 ifcvf_get_config_size(struct ifcvf_hw *hw)
 {
 	struct ifcvf_adapter *adapter;
+	u32 net_config_size = sizeof(struct virtio_net_config);
+	u32 blk_config_size = sizeof(struct virtio_blk_config);
+	u32 cap_size = hw->cap_dev_config_size;
 	u32 config_size;
 
 	adapter = vf_to_adapter(hw);
+	/* If the onboard device config space size is greater than
+	 * the size of struct virtio_net/blk_config, only the spec
+	 * implementing contents size is returned, this is very
+	 * unlikely, defensive programming.
+	 */
 	switch (hw->dev_type) {
 	case VIRTIO_ID_NET:
-		config_size = sizeof(struct virtio_net_config);
+		config_size = min(cap_size, net_config_size);
 		break;
 	case VIRTIO_ID_BLOCK:
-		config_size = sizeof(struct virtio_blk_config);
+		config_size = min(cap_size, blk_config_size);
 		break;
 	default:
 		config_size = 0;
diff --git a/drivers/vdpa/ifcvf/ifcvf_base.h b/drivers/vdpa/ifcvf/ifcvf_base.h
index 115b61f4924b..f5563f665cc6 100644
--- a/drivers/vdpa/ifcvf/ifcvf_base.h
+++ b/drivers/vdpa/ifcvf/ifcvf_base.h
@@ -87,6 +87,8 @@ struct ifcvf_hw {
 	int config_irq;
 	int vqs_reused_irq;
 	u16 nr_vring;
+	/* VIRTIO_PCI_CAP_DEVICE_CFG size */
+	u32 cap_dev_config_size;
 };
 
 struct ifcvf_adapter {
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH V5 2/6] vDPA/ifcvf: support userspace to query features and MQ of a management device
  2022-08-12 10:44 [PATCH V5 0/6] ifcvf/vDPA: support query device config space through netlink Zhu Lingshan
  2022-08-12 10:44 ` [PATCH V5 1/6] vDPA/ifcvf: get_config_size should return a value no greater than dev implementation Zhu Lingshan
@ 2022-08-12 10:44 ` Zhu Lingshan
  2022-08-12 10:44 ` [PATCH V5 3/6] vDPA: allow userspace to query features of a vDPA device Zhu Lingshan
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 30+ messages in thread
From: Zhu Lingshan @ 2022-08-12 10:44 UTC (permalink / raw)
  To: jasowang, mst
  Cc: virtualization, netdev, kvm, parav, xieyongji, gautam.dawar,
	Zhu Lingshan

Adapting to current netlink interfaces, this commit allows userspace
to query feature bits and MQ capability of a management device.

Currently both the vDPA device and the management device are the VF itself,
thus this ifcvf should initialize the virtio capabilities in probe() before
setting up the struct vdpa_mgmt_dev.

Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
---
 drivers/vdpa/ifcvf/ifcvf_main.c | 142 +++++++++++++++++---------------
 1 file changed, 76 insertions(+), 66 deletions(-)

diff --git a/drivers/vdpa/ifcvf/ifcvf_main.c b/drivers/vdpa/ifcvf/ifcvf_main.c
index 0a5670729412..3fd0267873f8 100644
--- a/drivers/vdpa/ifcvf/ifcvf_main.c
+++ b/drivers/vdpa/ifcvf/ifcvf_main.c
@@ -752,59 +752,36 @@ static int ifcvf_vdpa_dev_add(struct vdpa_mgmt_dev *mdev, const char *name,
 {
 	struct ifcvf_vdpa_mgmt_dev *ifcvf_mgmt_dev;
 	struct ifcvf_adapter *adapter;
+	struct vdpa_device *vdpa_dev;
 	struct pci_dev *pdev;
 	struct ifcvf_hw *vf;
-	struct device *dev;
-	int ret, i;
+	int ret;
 
 	ifcvf_mgmt_dev = container_of(mdev, struct ifcvf_vdpa_mgmt_dev, mdev);
-	if (ifcvf_mgmt_dev->adapter)
+	if (!ifcvf_mgmt_dev->adapter)
 		return -EOPNOTSUPP;
 
-	pdev = ifcvf_mgmt_dev->pdev;
-	dev = &pdev->dev;
-	adapter = vdpa_alloc_device(struct ifcvf_adapter, vdpa,
-				    dev, &ifc_vdpa_ops, 1, 1, name, false);
-	if (IS_ERR(adapter)) {
-		IFCVF_ERR(pdev, "Failed to allocate vDPA structure");
-		return PTR_ERR(adapter);
-	}
-
-	ifcvf_mgmt_dev->adapter = adapter;
-
+	adapter = ifcvf_mgmt_dev->adapter;
 	vf = &adapter->vf;
-	vf->dev_type = get_dev_type(pdev);
-	vf->base = pcim_iomap_table(pdev);
+	pdev = adapter->pdev;
+	vdpa_dev = &adapter->vdpa;
 
-	adapter->pdev = pdev;
-	adapter->vdpa.dma_dev = &pdev->dev;
-
-	ret = ifcvf_init_hw(vf, pdev);
-	if (ret) {
-		IFCVF_ERR(pdev, "Failed to init IFCVF hw\n");
-		goto err;
-	}
-
-	for (i = 0; i < vf->nr_vring; i++)
-		vf->vring[i].irq = -EINVAL;
-
-	vf->hw_features = ifcvf_get_hw_features(vf);
-	vf->config_size = ifcvf_get_config_size(vf);
+	if (name)
+		ret = dev_set_name(&vdpa_dev->dev, "%s", name);
+	else
+		ret = dev_set_name(&vdpa_dev->dev, "vdpa%u", vdpa_dev->index);
 
-	adapter->vdpa.mdev = &ifcvf_mgmt_dev->mdev;
 	ret = _vdpa_register_device(&adapter->vdpa, vf->nr_vring);
 	if (ret) {
+		put_device(&adapter->vdpa.dev);
 		IFCVF_ERR(pdev, "Failed to register to vDPA bus");
-		goto err;
+		return ret;
 	}
 
 	return 0;
-
-err:
-	put_device(&adapter->vdpa.dev);
-	return ret;
 }
 
+
 static void ifcvf_vdpa_dev_del(struct vdpa_mgmt_dev *mdev, struct vdpa_device *dev)
 {
 	struct ifcvf_vdpa_mgmt_dev *ifcvf_mgmt_dev;
@@ -823,61 +800,94 @@ static int ifcvf_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 {
 	struct ifcvf_vdpa_mgmt_dev *ifcvf_mgmt_dev;
 	struct device *dev = &pdev->dev;
+	struct ifcvf_adapter *adapter;
+	struct ifcvf_hw *vf;
 	u32 dev_type;
-	int ret;
-
-	ifcvf_mgmt_dev = kzalloc(sizeof(struct ifcvf_vdpa_mgmt_dev), GFP_KERNEL);
-	if (!ifcvf_mgmt_dev) {
-		IFCVF_ERR(pdev, "Failed to alloc memory for the vDPA management device\n");
-		return -ENOMEM;
-	}
-
-	dev_type = get_dev_type(pdev);
-	switch (dev_type) {
-	case VIRTIO_ID_NET:
-		ifcvf_mgmt_dev->mdev.id_table = id_table_net;
-		break;
-	case VIRTIO_ID_BLOCK:
-		ifcvf_mgmt_dev->mdev.id_table = id_table_blk;
-		break;
-	default:
-		IFCVF_ERR(pdev, "VIRTIO ID %u not supported\n", dev_type);
-		ret = -EOPNOTSUPP;
-		goto err;
-	}
-
-	ifcvf_mgmt_dev->mdev.ops = &ifcvf_vdpa_mgmt_dev_ops;
-	ifcvf_mgmt_dev->mdev.device = dev;
-	ifcvf_mgmt_dev->pdev = pdev;
+	int ret, i;
 
 	ret = pcim_enable_device(pdev);
 	if (ret) {
 		IFCVF_ERR(pdev, "Failed to enable device\n");
-		goto err;
+		return ret;
 	}
-
 	ret = pcim_iomap_regions(pdev, BIT(0) | BIT(2) | BIT(4),
 				 IFCVF_DRIVER_NAME);
 	if (ret) {
 		IFCVF_ERR(pdev, "Failed to request MMIO region\n");
-		goto err;
+		return ret;
 	}
 
 	ret = dma_set_mask_and_coherent(dev, DMA_BIT_MASK(64));
 	if (ret) {
 		IFCVF_ERR(pdev, "No usable DMA configuration\n");
-		goto err;
+		return ret;
 	}
 
 	ret = devm_add_action_or_reset(dev, ifcvf_free_irq_vectors, pdev);
 	if (ret) {
 		IFCVF_ERR(pdev,
 			  "Failed for adding devres for freeing irq vectors\n");
-		goto err;
+		return ret;
 	}
 
 	pci_set_master(pdev);
 
+	adapter = vdpa_alloc_device(struct ifcvf_adapter, vdpa,
+				    dev, &ifc_vdpa_ops, 1, 1, NULL, false);
+	if (IS_ERR(adapter)) {
+		IFCVF_ERR(pdev, "Failed to allocate vDPA structure");
+		return PTR_ERR(adapter);
+	}
+
+	vf = &adapter->vf;
+	vf->dev_type = get_dev_type(pdev);
+	vf->base = pcim_iomap_table(pdev);
+
+	adapter->pdev = pdev;
+	adapter->vdpa.dma_dev = &pdev->dev;
+
+	ret = ifcvf_init_hw(vf, pdev);
+	if (ret) {
+		IFCVF_ERR(pdev, "Failed to init IFCVF hw\n");
+		return ret;
+	}
+
+	for (i = 0; i < vf->nr_vring; i++)
+		vf->vring[i].irq = -EINVAL;
+
+	vf->hw_features = ifcvf_get_hw_features(vf);
+	vf->config_size = ifcvf_get_config_size(vf);
+
+	ifcvf_mgmt_dev = kzalloc(sizeof(struct ifcvf_vdpa_mgmt_dev), GFP_KERNEL);
+	if (!ifcvf_mgmt_dev) {
+		IFCVF_ERR(pdev, "Failed to alloc memory for the vDPA management device\n");
+		return -ENOMEM;
+	}
+
+	ifcvf_mgmt_dev->mdev.ops = &ifcvf_vdpa_mgmt_dev_ops;
+	ifcvf_mgmt_dev->mdev.device = dev;
+	ifcvf_mgmt_dev->adapter = adapter;
+
+	dev_type = get_dev_type(pdev);
+	switch (dev_type) {
+	case VIRTIO_ID_NET:
+		ifcvf_mgmt_dev->mdev.id_table = id_table_net;
+		break;
+	case VIRTIO_ID_BLOCK:
+		ifcvf_mgmt_dev->mdev.id_table = id_table_blk;
+		break;
+	default:
+		IFCVF_ERR(pdev, "VIRTIO ID %u not supported\n", dev_type);
+		ret = -EOPNOTSUPP;
+		goto err;
+	}
+
+	ifcvf_mgmt_dev->mdev.max_supported_vqs = vf->nr_vring;
+	ifcvf_mgmt_dev->mdev.supported_features = vf->hw_features;
+
+	adapter->vdpa.mdev = &ifcvf_mgmt_dev->mdev;
+
+
 	ret = vdpa_mgmtdev_register(&ifcvf_mgmt_dev->mdev);
 	if (ret) {
 		IFCVF_ERR(pdev,
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH V5 3/6] vDPA: allow userspace to query features of a vDPA device
  2022-08-12 10:44 [PATCH V5 0/6] ifcvf/vDPA: support query device config space through netlink Zhu Lingshan
  2022-08-12 10:44 ` [PATCH V5 1/6] vDPA/ifcvf: get_config_size should return a value no greater than dev implementation Zhu Lingshan
  2022-08-12 10:44 ` [PATCH V5 2/6] vDPA/ifcvf: support userspace to query features and MQ of a management device Zhu Lingshan
@ 2022-08-12 10:44 ` Zhu Lingshan
  2022-08-12 10:44 ` [PATCH V5 4/6] vDPA: !FEATURES_OK should not block querying device config space Zhu Lingshan
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 30+ messages in thread
From: Zhu Lingshan @ 2022-08-12 10:44 UTC (permalink / raw)
  To: jasowang, mst
  Cc: virtualization, netdev, kvm, parav, xieyongji, gautam.dawar,
	Zhu Lingshan

This commit adds a new vDPA netlink attribution
VDPA_ATTR_VDPA_DEV_SUPPORTED_FEATURES. Userspace can query
features of vDPA devices through this new attr.

Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
---
 drivers/vdpa/vdpa.c       | 17 +++++++++++++----
 include/uapi/linux/vdpa.h |  3 +++
 2 files changed, 16 insertions(+), 4 deletions(-)

diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c
index ebf2f363fbe7..6eb3d972d802 100644
--- a/drivers/vdpa/vdpa.c
+++ b/drivers/vdpa/vdpa.c
@@ -491,6 +491,8 @@ static int vdpa_mgmtdev_fill(const struct vdpa_mgmt_dev *mdev, struct sk_buff *m
 		err = -EMSGSIZE;
 		goto msg_err;
 	}
+
+	/* report features of a vDPA management device through VDPA_ATTR_DEV_SUPPORTED_FEATURES */
 	if (nla_put_u64_64bit(msg, VDPA_ATTR_DEV_SUPPORTED_FEATURES,
 			      mdev->supported_features, VDPA_ATTR_PAD)) {
 		err = -EMSGSIZE;
@@ -815,7 +817,7 @@ static int vdpa_dev_net_mq_config_fill(struct vdpa_device *vdev,
 static int vdpa_dev_net_config_fill(struct vdpa_device *vdev, struct sk_buff *msg)
 {
 	struct virtio_net_config config = {};
-	u64 features;
+	u64 features_device, features_driver;
 	u16 val_u16;
 
 	vdpa_get_config_unlocked(vdev, 0, &config, sizeof(config));
@@ -832,12 +834,19 @@ static int vdpa_dev_net_config_fill(struct vdpa_device *vdev, struct sk_buff *ms
 	if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MTU, val_u16))
 		return -EMSGSIZE;
 
-	features = vdev->config->get_driver_features(vdev);
-	if (nla_put_u64_64bit(msg, VDPA_ATTR_DEV_NEGOTIATED_FEATURES, features,
+	features_driver = vdev->config->get_driver_features(vdev);
+	if (nla_put_u64_64bit(msg, VDPA_ATTR_DEV_NEGOTIATED_FEATURES, features_driver,
+			      VDPA_ATTR_PAD))
+		return -EMSGSIZE;
+
+	features_device = vdev->config->get_device_features(vdev);
+
+	/* report features of a vDPA device through VDPA_ATTR_VDPA_DEV_SUPPORTED_FEATURES */
+	if (nla_put_u64_64bit(msg, VDPA_ATTR_VDPA_DEV_SUPPORTED_FEATURES, features_device,
 			      VDPA_ATTR_PAD))
 		return -EMSGSIZE;
 
-	return vdpa_dev_net_mq_config_fill(vdev, msg, features, &config);
+	return vdpa_dev_net_mq_config_fill(vdev, msg, features_driver, &config);
 }
 
 static int
diff --git a/include/uapi/linux/vdpa.h b/include/uapi/linux/vdpa.h
index 25c55cab3d7c..d171b92ef522 100644
--- a/include/uapi/linux/vdpa.h
+++ b/include/uapi/linux/vdpa.h
@@ -46,7 +46,10 @@ enum vdpa_attr {
 
 	VDPA_ATTR_DEV_NEGOTIATED_FEATURES,	/* u64 */
 	VDPA_ATTR_DEV_MGMTDEV_MAX_VQS,		/* u32 */
+	/* features of a vDPA management device */
 	VDPA_ATTR_DEV_SUPPORTED_FEATURES,	/* u64 */
+	/* features of a vDPA device, e.g., /dev/vhost-vdpa0 */
+	VDPA_ATTR_VDPA_DEV_SUPPORTED_FEATURES,	/* u64 */
 
 	VDPA_ATTR_DEV_QUEUE_INDEX,              /* u32 */
 	VDPA_ATTR_DEV_VENDOR_ATTR_NAME,		/* string */
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH V5 4/6] vDPA: !FEATURES_OK should not block querying device config space
  2022-08-12 10:44 [PATCH V5 0/6] ifcvf/vDPA: support query device config space through netlink Zhu Lingshan
                   ` (2 preceding siblings ...)
  2022-08-12 10:44 ` [PATCH V5 3/6] vDPA: allow userspace to query features of a vDPA device Zhu Lingshan
@ 2022-08-12 10:44 ` Zhu Lingshan
  2022-08-16  7:41     ` Si-Wei Liu
  2022-08-12 10:44 ` [PATCH V5 5/6] vDPA: Conditionally read fields in virtio-net dev " Zhu Lingshan
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 30+ messages in thread
From: Zhu Lingshan @ 2022-08-12 10:44 UTC (permalink / raw)
  To: jasowang, mst
  Cc: virtualization, netdev, kvm, parav, xieyongji, gautam.dawar,
	Zhu Lingshan

Users may want to query the config space of a vDPA device,
to choose a appropriate one for a certain guest. This means the
users need to read the config space before FEATURES_OK, and
the existence of config space contents does not depend on
FEATURES_OK.

The spec says:
The device MUST allow reading of any device-specific configuration
field before FEATURES_OK is set by the driver. This includes
fields which are conditional on feature bits, as long as those
feature bits are offered by the device.

Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
---
 drivers/vdpa/vdpa.c | 8 --------
 1 file changed, 8 deletions(-)

diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c
index 6eb3d972d802..bf312d9c59ab 100644
--- a/drivers/vdpa/vdpa.c
+++ b/drivers/vdpa/vdpa.c
@@ -855,17 +855,9 @@ vdpa_dev_config_fill(struct vdpa_device *vdev, struct sk_buff *msg, u32 portid,
 {
 	u32 device_id;
 	void *hdr;
-	u8 status;
 	int err;
 
 	down_read(&vdev->cf_lock);
-	status = vdev->config->get_status(vdev);
-	if (!(status & VIRTIO_CONFIG_S_FEATURES_OK)) {
-		NL_SET_ERR_MSG_MOD(extack, "Features negotiation not completed");
-		err = -EAGAIN;
-		goto out;
-	}
-
 	hdr = genlmsg_put(msg, portid, seq, &vdpa_nl_family, flags,
 			  VDPA_CMD_DEV_CONFIG_GET);
 	if (!hdr) {
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH V5 5/6] vDPA: Conditionally read fields in virtio-net dev config space
  2022-08-12 10:44 [PATCH V5 0/6] ifcvf/vDPA: support query device config space through netlink Zhu Lingshan
                   ` (3 preceding siblings ...)
  2022-08-12 10:44 ` [PATCH V5 4/6] vDPA: !FEATURES_OK should not block querying device config space Zhu Lingshan
@ 2022-08-12 10:44 ` Zhu Lingshan
  2022-08-12 10:45 ` [PATCH V5 6/6] fix 'cast to restricted le16' warnings in vdpa.c Zhu Lingshan
  2022-08-12 11:14   ` Michael S. Tsirkin
  6 siblings, 0 replies; 30+ messages in thread
From: Zhu Lingshan @ 2022-08-12 10:44 UTC (permalink / raw)
  To: jasowang, mst
  Cc: virtualization, netdev, kvm, parav, xieyongji, gautam.dawar,
	Zhu Lingshan

Some fields of virtio-net device config space are
conditional on the feature bits, the spec says:

"The mac address field always exists
(though is only valid if VIRTIO_NET_F_MAC is set)"

"max_virtqueue_pairs only exists if VIRTIO_NET_F_MQ
or VIRTIO_NET_F_RSS is set"

"mtu only exists if VIRTIO_NET_F_MTU is set"

so we should read MTU, MAC and MQ in the device config
space only when these feature bits are offered.

For MQ, if both VIRTIO_NET_F_MQ and VIRTIO_NET_F_RSS are
not set, the virtio device should have
one queue pair as default value, so when userspace querying queue pair numbers,
it should return mq=1 than zero.

For MTU, if VIRTIO_NET_F_MTU is not set, we should not read
MTU from the device config sapce.
RFC894 <A Standard for the Transmission of IP Datagrams over Ethernet Networks>
says:"The minimum length of the data field of a packet sent over an
Ethernet is 1500 octets, thus the maximum length of an IP datagram
sent over an Ethernet is 1500 octets.  Implementations are encouraged
to support full-length packets"

virtio spec says:"The virtio network device is a virtual ethernet card",
so the default MTU value should be 1500 for virtio-net.

For MAC, the spec says:"If the VIRTIO_NET_F_MAC feature bit is set,
the configuration space mac entry indicates the “physical” address
of the network card, otherwise the driver would typically
generate a random local MAC address." So there is no
default MAC address if VIRTIO_NET_F_MAC not set.

This commits introduces functions vdpa_dev_net_mtu_config_fill()
and vdpa_dev_net_mac_config_fill() to fill MTU and MAC.
It also fixes vdpa_dev_net_mq_config_fill() to report correct
MQ when _F_MQ is not present.

These functions should check devices features than driver
features, and struct vdpa_device is not needed as a parameter

The test & userspace tool output:

Feature bit VIRTIO_NET_F_MTU, VIRTIO_NET_F_RSS, VIRTIO_NET_F_MQ
and VIRTIO_NET_F_MAC can be mask out by hardcode.

However, it is challenging to "disable" the related fields
in the HW device config space, so let's just assume the values
are meaningless if the feature bits are not set.

Before this change, when feature bits for RSS, MQ, MTU and MAC
are not set, iproute2 output:
$vdpa vdpa0: mac 00:e8:ca:11:be:05 link up link_announce false mtu 1500
  negotiated_features

without this commit, function vdpa_dev_net_config_fill()
reads all config space fields unconditionally, so let's
assume the MAC and MTU are meaningless, and it checks
MQ with driver_features, so we don't see max_vq_pairs.

After applying this commit, when feature bits for
MQ, RSS, MAC and MTU are not set 0,iproute2 output:
$vdpa dev config show vdpa0
vdpa0: link up link_announce false max_vq_pairs 1 mtu 1500
  negotiated_features

As explained above:
Here is no MAC, because VIRTIO_NET_F_MAC is not set,
and there is no default value for MAC. It shows
max_vq_paris = 1 because even without MQ feature,
a functional virtio-net must have one queue pair.
mtu = 1500 is the default value as ethernet
required.

Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
---
 drivers/vdpa/vdpa.c | 51 ++++++++++++++++++++++++++++++++-------------
 1 file changed, 37 insertions(+), 14 deletions(-)

diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c
index bf312d9c59ab..f6172c8dd262 100644
--- a/drivers/vdpa/vdpa.c
+++ b/drivers/vdpa/vdpa.c
@@ -801,19 +801,44 @@ static int vdpa_nl_cmd_dev_get_dumpit(struct sk_buff *msg, struct netlink_callba
 	return msg->len;
 }
 
-static int vdpa_dev_net_mq_config_fill(struct vdpa_device *vdev,
-				       struct sk_buff *msg, u64 features,
+static int vdpa_dev_net_mq_config_fill(struct sk_buff *msg, u64 features,
 				       const struct virtio_net_config *config)
 {
 	u16 val_u16;
 
-	if ((features & BIT_ULL(VIRTIO_NET_F_MQ)) == 0)
-		return 0;
+	if ((features & BIT_ULL(VIRTIO_NET_F_MQ)) == 0 &&
+	    (features & BIT_ULL(VIRTIO_NET_F_RSS)) == 0)
+		val_u16 = 1;
+	else
+		val_u16 = __virtio16_to_cpu(true, config->max_virtqueue_pairs);
 
-	val_u16 = le16_to_cpu(config->max_virtqueue_pairs);
 	return nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MAX_VQP, val_u16);
 }
 
+static int vdpa_dev_net_mtu_config_fill(struct sk_buff *msg, u64 features,
+					const struct virtio_net_config *config)
+{
+	u16 val_u16;
+
+	if ((features & BIT_ULL(VIRTIO_NET_F_MTU)) == 0)
+		val_u16 = 1500;
+	else
+		val_u16 = __virtio16_to_cpu(true, config->mtu);
+
+	return nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MTU, val_u16);
+}
+
+static int vdpa_dev_net_mac_config_fill(struct sk_buff *msg, u64 features,
+					const struct virtio_net_config *config)
+{
+	if ((features & BIT_ULL(VIRTIO_NET_F_MAC)) == 0)
+		return 0;
+	else
+		return  nla_put(msg, VDPA_ATTR_DEV_NET_CFG_MACADDR,
+				sizeof(config->mac), config->mac);
+}
+
+
 static int vdpa_dev_net_config_fill(struct vdpa_device *vdev, struct sk_buff *msg)
 {
 	struct virtio_net_config config = {};
@@ -822,18 +847,10 @@ static int vdpa_dev_net_config_fill(struct vdpa_device *vdev, struct sk_buff *ms
 
 	vdpa_get_config_unlocked(vdev, 0, &config, sizeof(config));
 
-	if (nla_put(msg, VDPA_ATTR_DEV_NET_CFG_MACADDR, sizeof(config.mac),
-		    config.mac))
-		return -EMSGSIZE;
-
 	val_u16 = le16_to_cpu(config.status);
 	if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_STATUS, val_u16))
 		return -EMSGSIZE;
 
-	val_u16 = le16_to_cpu(config.mtu);
-	if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MTU, val_u16))
-		return -EMSGSIZE;
-
 	features_driver = vdev->config->get_driver_features(vdev);
 	if (nla_put_u64_64bit(msg, VDPA_ATTR_DEV_NEGOTIATED_FEATURES, features_driver,
 			      VDPA_ATTR_PAD))
@@ -846,7 +863,13 @@ static int vdpa_dev_net_config_fill(struct vdpa_device *vdev, struct sk_buff *ms
 			      VDPA_ATTR_PAD))
 		return -EMSGSIZE;
 
-	return vdpa_dev_net_mq_config_fill(vdev, msg, features_driver, &config);
+	if (vdpa_dev_net_mac_config_fill(msg, features_device, &config))
+		return -EMSGSIZE;
+
+	if (vdpa_dev_net_mtu_config_fill(msg, features_device, &config))
+		return -EMSGSIZE;
+
+	return vdpa_dev_net_mq_config_fill(msg, features_device, &config);
 }
 
 static int
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH V5 6/6] fix 'cast to restricted le16' warnings in vdpa.c
  2022-08-12 10:44 [PATCH V5 0/6] ifcvf/vDPA: support query device config space through netlink Zhu Lingshan
                   ` (4 preceding siblings ...)
  2022-08-12 10:44 ` [PATCH V5 5/6] vDPA: Conditionally read fields in virtio-net dev " Zhu Lingshan
@ 2022-08-12 10:45 ` Zhu Lingshan
  2022-08-12 11:14   ` Michael S. Tsirkin
  6 siblings, 0 replies; 30+ messages in thread
From: Zhu Lingshan @ 2022-08-12 10:45 UTC (permalink / raw)
  To: jasowang, mst
  Cc: virtualization, netdev, kvm, parav, xieyongji, gautam.dawar,
	Zhu Lingshan

This commit fixes spars warnings: cast to restricted __le16
in function vdpa_dev_net_config_fill() and
vdpa_fill_stats_rec()

Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
---
 drivers/vdpa/vdpa.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c
index f6172c8dd262..0e5fa97dfcc3 100644
--- a/drivers/vdpa/vdpa.c
+++ b/drivers/vdpa/vdpa.c
@@ -847,7 +847,11 @@ static int vdpa_dev_net_config_fill(struct vdpa_device *vdev, struct sk_buff *ms
 
 	vdpa_get_config_unlocked(vdev, 0, &config, sizeof(config));
 
-	val_u16 = le16_to_cpu(config.status);
+	/*
+	 * Assume little endian for now, userspace can tweak this for
+	 * legacy guest support.
+	 */
+	val_u16 = __virtio16_to_cpu(true, config.status);
 	if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_STATUS, val_u16))
 		return -EMSGSIZE;
 
@@ -937,7 +941,11 @@ static int vdpa_fill_stats_rec(struct vdpa_device *vdev, struct sk_buff *msg,
 	}
 	vdpa_get_config_unlocked(vdev, 0, &config, sizeof(config));
 
-	max_vqp = le16_to_cpu(config.max_virtqueue_pairs);
+	/*
+	 * Assume little endian for now, userspace can tweak this for
+	 * legacy guest support.
+	 */
+	max_vqp = __virtio16_to_cpu(true, config.max_virtqueue_pairs);
 	if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MAX_VQP, max_vqp))
 		return -EMSGSIZE;
 
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [PATCH V5 0/6] ifcvf/vDPA: support query device config space through netlink
  2022-08-12 10:44 [PATCH V5 0/6] ifcvf/vDPA: support query device config space through netlink Zhu Lingshan
@ 2022-08-12 11:14   ` Michael S. Tsirkin
  2022-08-12 10:44 ` [PATCH V5 2/6] vDPA/ifcvf: support userspace to query features and MQ of a management device Zhu Lingshan
                     ` (5 subsequent siblings)
  6 siblings, 0 replies; 30+ messages in thread
From: Michael S. Tsirkin @ 2022-08-12 11:14 UTC (permalink / raw)
  To: Zhu Lingshan
  Cc: jasowang, virtualization, netdev, kvm, parav, xieyongji, gautam.dawar

On Fri, Aug 12, 2022 at 06:44:54PM +0800, Zhu Lingshan wrote:
> This series allows userspace to query device config space of vDPA
> devices and the management devices through netlink,
> to get multi-queue, feature bits and etc.
> 
> This series has introduced a new netlink attr
> VDPA_ATTR_VDPA_DEV_SUPPORTED_FEATURES, this should be used to query
> features of vDPA  devices than the management device.
> 
> Please help review.

I can't merge this for this merge window.
Am I right when I say that the new thing here is patch 5/6 + new
comments?
If yes I can queue it out of the window, on top.

> Thanks!
> Zhu Lingshan
> 
> Changes rom V4:
> (1) Read MAC, MTU, MQ conditionally (Michael)
> (2) If VIRTIO_NET_F_MAC not set, don't report MAC to userspace
> (3) If VIRTIO_NET_F_MTU not set, report 1500 to userspace
> (4) Add comments to the new attr
> VDPA_ATTR_VDPA_DEV_SUPPORTED_FEATURES(Michael)
> (5) Add comments for reporting the device status as LE(Michael)
> 
> Changes from V3:
> (1)drop the fixes tags(Parva)
> (2)better commit log for patch 1/6(Michael)
> (3)assign num_queues to max_supported_vqs than max_vq_pairs(Jason)
> (4)initialize virtio pci capabilities in the probe() function.
> 
> Changes from V2:
> Add fixes tags(Parva)
> 
> Changes from V1:
> (1) Use __virito16_to_cpu(true, xxx) for the le16 casting(Jason)
> (2) Add a comment in ifcvf_get_config_size(), to explain
> why we should return the minimum value of
> sizeof(struct virtio_net_config) and the onboard
> cap size(Jason)
> (3) Introduced a new attr VDPA_ATTR_VDPA_DEV_SUPPORTED_FEATURES
> (4) Show the changes of iproute2 output before and after 5/6 patch(Jason)
> (5) Fix cast warning in vdpa_fill_stats_rec() 
> 
> Zhu Lingshan (6):
>   vDPA/ifcvf: get_config_size should return a value no greater than dev
>     implementation
>   vDPA/ifcvf: support userspace to query features and MQ of a management
>     device
>   vDPA: allow userspace to query features of a vDPA device
>   vDPA: !FEATURES_OK should not block querying device config space
>   vDPA: Conditionally read fields in virtio-net dev config space
>   fix 'cast to restricted le16' warnings in vdpa.c
> 
>  drivers/vdpa/ifcvf/ifcvf_base.c |  13 ++-
>  drivers/vdpa/ifcvf/ifcvf_base.h |   2 +
>  drivers/vdpa/ifcvf/ifcvf_main.c | 142 +++++++++++++++++---------------
>  drivers/vdpa/vdpa.c             |  82 ++++++++++++------
>  include/uapi/linux/vdpa.h       |   3 +
>  5 files changed, 149 insertions(+), 93 deletions(-)
> 
> -- 
> 2.31.1


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH V5 0/6] ifcvf/vDPA: support query device config space through netlink
@ 2022-08-12 11:14   ` Michael S. Tsirkin
  0 siblings, 0 replies; 30+ messages in thread
From: Michael S. Tsirkin @ 2022-08-12 11:14 UTC (permalink / raw)
  To: Zhu Lingshan; +Cc: kvm, netdev, virtualization, xieyongji, gautam.dawar

On Fri, Aug 12, 2022 at 06:44:54PM +0800, Zhu Lingshan wrote:
> This series allows userspace to query device config space of vDPA
> devices and the management devices through netlink,
> to get multi-queue, feature bits and etc.
> 
> This series has introduced a new netlink attr
> VDPA_ATTR_VDPA_DEV_SUPPORTED_FEATURES, this should be used to query
> features of vDPA  devices than the management device.
> 
> Please help review.

I can't merge this for this merge window.
Am I right when I say that the new thing here is patch 5/6 + new
comments?
If yes I can queue it out of the window, on top.

> Thanks!
> Zhu Lingshan
> 
> Changes rom V4:
> (1) Read MAC, MTU, MQ conditionally (Michael)
> (2) If VIRTIO_NET_F_MAC not set, don't report MAC to userspace
> (3) If VIRTIO_NET_F_MTU not set, report 1500 to userspace
> (4) Add comments to the new attr
> VDPA_ATTR_VDPA_DEV_SUPPORTED_FEATURES(Michael)
> (5) Add comments for reporting the device status as LE(Michael)
> 
> Changes from V3:
> (1)drop the fixes tags(Parva)
> (2)better commit log for patch 1/6(Michael)
> (3)assign num_queues to max_supported_vqs than max_vq_pairs(Jason)
> (4)initialize virtio pci capabilities in the probe() function.
> 
> Changes from V2:
> Add fixes tags(Parva)
> 
> Changes from V1:
> (1) Use __virito16_to_cpu(true, xxx) for the le16 casting(Jason)
> (2) Add a comment in ifcvf_get_config_size(), to explain
> why we should return the minimum value of
> sizeof(struct virtio_net_config) and the onboard
> cap size(Jason)
> (3) Introduced a new attr VDPA_ATTR_VDPA_DEV_SUPPORTED_FEATURES
> (4) Show the changes of iproute2 output before and after 5/6 patch(Jason)
> (5) Fix cast warning in vdpa_fill_stats_rec() 
> 
> Zhu Lingshan (6):
>   vDPA/ifcvf: get_config_size should return a value no greater than dev
>     implementation
>   vDPA/ifcvf: support userspace to query features and MQ of a management
>     device
>   vDPA: allow userspace to query features of a vDPA device
>   vDPA: !FEATURES_OK should not block querying device config space
>   vDPA: Conditionally read fields in virtio-net dev config space
>   fix 'cast to restricted le16' warnings in vdpa.c
> 
>  drivers/vdpa/ifcvf/ifcvf_base.c |  13 ++-
>  drivers/vdpa/ifcvf/ifcvf_base.h |   2 +
>  drivers/vdpa/ifcvf/ifcvf_main.c | 142 +++++++++++++++++---------------
>  drivers/vdpa/vdpa.c             |  82 ++++++++++++------
>  include/uapi/linux/vdpa.h       |   3 +
>  5 files changed, 149 insertions(+), 93 deletions(-)
> 
> -- 
> 2.31.1

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH V5 0/6] ifcvf/vDPA: support query device config space through netlink
  2022-08-12 11:14   ` Michael S. Tsirkin
@ 2022-08-12 11:17     ` Michael S. Tsirkin
  -1 siblings, 0 replies; 30+ messages in thread
From: Michael S. Tsirkin @ 2022-08-12 11:17 UTC (permalink / raw)
  To: Zhu Lingshan
  Cc: jasowang, virtualization, netdev, kvm, parav, xieyongji, gautam.dawar

On Fri, Aug 12, 2022 at 07:14:39AM -0400, Michael S. Tsirkin wrote:
> On Fri, Aug 12, 2022 at 06:44:54PM +0800, Zhu Lingshan wrote:
> > This series allows userspace to query device config space of vDPA
> > devices and the management devices through netlink,
> > to get multi-queue, feature bits and etc.
> > 
> > This series has introduced a new netlink attr
> > VDPA_ATTR_VDPA_DEV_SUPPORTED_FEATURES, this should be used to query
> > features of vDPA  devices than the management device.
> > 
> > Please help review.
> 
> I can't merge this for this merge window.
> Am I right when I say that the new thing here is patch 5/6 + new
> comments?
> If yes I can queue it out of the window, on top.

So at this point, can you please send patches on top of the vhost
tree? I think these are just patches 3 and 5 but please confirm.


> > Thanks!
> > Zhu Lingshan
> > 
> > Changes rom V4:
> > (1) Read MAC, MTU, MQ conditionally (Michael)
> > (2) If VIRTIO_NET_F_MAC not set, don't report MAC to userspace
> > (3) If VIRTIO_NET_F_MTU not set, report 1500 to userspace
> > (4) Add comments to the new attr
> > VDPA_ATTR_VDPA_DEV_SUPPORTED_FEATURES(Michael)
> > (5) Add comments for reporting the device status as LE(Michael)
> > 
> > Changes from V3:
> > (1)drop the fixes tags(Parva)
> > (2)better commit log for patch 1/6(Michael)
> > (3)assign num_queues to max_supported_vqs than max_vq_pairs(Jason)
> > (4)initialize virtio pci capabilities in the probe() function.
> > 
> > Changes from V2:
> > Add fixes tags(Parva)
> > 
> > Changes from V1:
> > (1) Use __virito16_to_cpu(true, xxx) for the le16 casting(Jason)
> > (2) Add a comment in ifcvf_get_config_size(), to explain
> > why we should return the minimum value of
> > sizeof(struct virtio_net_config) and the onboard
> > cap size(Jason)
> > (3) Introduced a new attr VDPA_ATTR_VDPA_DEV_SUPPORTED_FEATURES
> > (4) Show the changes of iproute2 output before and after 5/6 patch(Jason)
> > (5) Fix cast warning in vdpa_fill_stats_rec() 
> > 
> > Zhu Lingshan (6):
> >   vDPA/ifcvf: get_config_size should return a value no greater than dev
> >     implementation
> >   vDPA/ifcvf: support userspace to query features and MQ of a management
> >     device
> >   vDPA: allow userspace to query features of a vDPA device
> >   vDPA: !FEATURES_OK should not block querying device config space
> >   vDPA: Conditionally read fields in virtio-net dev config space
> >   fix 'cast to restricted le16' warnings in vdpa.c
> > 
> >  drivers/vdpa/ifcvf/ifcvf_base.c |  13 ++-
> >  drivers/vdpa/ifcvf/ifcvf_base.h |   2 +
> >  drivers/vdpa/ifcvf/ifcvf_main.c | 142 +++++++++++++++++---------------
> >  drivers/vdpa/vdpa.c             |  82 ++++++++++++------
> >  include/uapi/linux/vdpa.h       |   3 +
> >  5 files changed, 149 insertions(+), 93 deletions(-)
> > 
> > -- 
> > 2.31.1


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH V5 0/6] ifcvf/vDPA: support query device config space through netlink
@ 2022-08-12 11:17     ` Michael S. Tsirkin
  0 siblings, 0 replies; 30+ messages in thread
From: Michael S. Tsirkin @ 2022-08-12 11:17 UTC (permalink / raw)
  To: Zhu Lingshan; +Cc: kvm, netdev, virtualization, xieyongji, gautam.dawar

On Fri, Aug 12, 2022 at 07:14:39AM -0400, Michael S. Tsirkin wrote:
> On Fri, Aug 12, 2022 at 06:44:54PM +0800, Zhu Lingshan wrote:
> > This series allows userspace to query device config space of vDPA
> > devices and the management devices through netlink,
> > to get multi-queue, feature bits and etc.
> > 
> > This series has introduced a new netlink attr
> > VDPA_ATTR_VDPA_DEV_SUPPORTED_FEATURES, this should be used to query
> > features of vDPA  devices than the management device.
> > 
> > Please help review.
> 
> I can't merge this for this merge window.
> Am I right when I say that the new thing here is patch 5/6 + new
> comments?
> If yes I can queue it out of the window, on top.

So at this point, can you please send patches on top of the vhost
tree? I think these are just patches 3 and 5 but please confirm.


> > Thanks!
> > Zhu Lingshan
> > 
> > Changes rom V4:
> > (1) Read MAC, MTU, MQ conditionally (Michael)
> > (2) If VIRTIO_NET_F_MAC not set, don't report MAC to userspace
> > (3) If VIRTIO_NET_F_MTU not set, report 1500 to userspace
> > (4) Add comments to the new attr
> > VDPA_ATTR_VDPA_DEV_SUPPORTED_FEATURES(Michael)
> > (5) Add comments for reporting the device status as LE(Michael)
> > 
> > Changes from V3:
> > (1)drop the fixes tags(Parva)
> > (2)better commit log for patch 1/6(Michael)
> > (3)assign num_queues to max_supported_vqs than max_vq_pairs(Jason)
> > (4)initialize virtio pci capabilities in the probe() function.
> > 
> > Changes from V2:
> > Add fixes tags(Parva)
> > 
> > Changes from V1:
> > (1) Use __virito16_to_cpu(true, xxx) for the le16 casting(Jason)
> > (2) Add a comment in ifcvf_get_config_size(), to explain
> > why we should return the minimum value of
> > sizeof(struct virtio_net_config) and the onboard
> > cap size(Jason)
> > (3) Introduced a new attr VDPA_ATTR_VDPA_DEV_SUPPORTED_FEATURES
> > (4) Show the changes of iproute2 output before and after 5/6 patch(Jason)
> > (5) Fix cast warning in vdpa_fill_stats_rec() 
> > 
> > Zhu Lingshan (6):
> >   vDPA/ifcvf: get_config_size should return a value no greater than dev
> >     implementation
> >   vDPA/ifcvf: support userspace to query features and MQ of a management
> >     device
> >   vDPA: allow userspace to query features of a vDPA device
> >   vDPA: !FEATURES_OK should not block querying device config space
> >   vDPA: Conditionally read fields in virtio-net dev config space
> >   fix 'cast to restricted le16' warnings in vdpa.c
> > 
> >  drivers/vdpa/ifcvf/ifcvf_base.c |  13 ++-
> >  drivers/vdpa/ifcvf/ifcvf_base.h |   2 +
> >  drivers/vdpa/ifcvf/ifcvf_main.c | 142 +++++++++++++++++---------------
> >  drivers/vdpa/vdpa.c             |  82 ++++++++++++------
> >  include/uapi/linux/vdpa.h       |   3 +
> >  5 files changed, 149 insertions(+), 93 deletions(-)
> > 
> > -- 
> > 2.31.1

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH V5 0/6] ifcvf/vDPA: support query device config space through netlink
  2022-08-12 11:17     ` Michael S. Tsirkin
  (?)
@ 2022-08-12 11:41     ` Zhu, Lingshan
  2022-08-15  9:36       ` Zhu, Lingshan
  -1 siblings, 1 reply; 30+ messages in thread
From: Zhu, Lingshan @ 2022-08-12 11:41 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: jasowang, virtualization, netdev, kvm, parav, xieyongji, gautam.dawar



On 8/12/2022 7:17 PM, Michael S. Tsirkin wrote:
> On Fri, Aug 12, 2022 at 07:14:39AM -0400, Michael S. Tsirkin wrote:
>> On Fri, Aug 12, 2022 at 06:44:54PM +0800, Zhu Lingshan wrote:
>>> This series allows userspace to query device config space of vDPA
>>> devices and the management devices through netlink,
>>> to get multi-queue, feature bits and etc.
>>>
>>> This series has introduced a new netlink attr
>>> VDPA_ATTR_VDPA_DEV_SUPPORTED_FEATURES, this should be used to query
>>> features of vDPA  devices than the management device.
>>>
>>> Please help review.
>> I can't merge this for this merge window.
>> Am I right when I say that the new thing here is patch 5/6 + new
>> comments?
>> If yes I can queue it out of the window, on top.
> So at this point, can you please send patches on top of the vhost
> tree? I think these are just patches 3 and 5 but please confirm.
I will rebase them on vhost tree and resend them soon, main changes are 
in patch 5,
we have made MTU, MAC, MQ conditional there. And there are some new 
comments as
you suggested.


Thanks,
Zhu Lingshan
>
>
>>> Thanks!
>>> Zhu Lingshan
>>>
>>> Changes rom V4:
>>> (1) Read MAC, MTU, MQ conditionally (Michael)
>>> (2) If VIRTIO_NET_F_MAC not set, don't report MAC to userspace
>>> (3) If VIRTIO_NET_F_MTU not set, report 1500 to userspace
>>> (4) Add comments to the new attr
>>> VDPA_ATTR_VDPA_DEV_SUPPORTED_FEATURES(Michael)
>>> (5) Add comments for reporting the device status as LE(Michael)
>>>
>>> Changes from V3:
>>> (1)drop the fixes tags(Parva)
>>> (2)better commit log for patch 1/6(Michael)
>>> (3)assign num_queues to max_supported_vqs than max_vq_pairs(Jason)
>>> (4)initialize virtio pci capabilities in the probe() function.
>>>
>>> Changes from V2:
>>> Add fixes tags(Parva)
>>>
>>> Changes from V1:
>>> (1) Use __virito16_to_cpu(true, xxx) for the le16 casting(Jason)
>>> (2) Add a comment in ifcvf_get_config_size(), to explain
>>> why we should return the minimum value of
>>> sizeof(struct virtio_net_config) and the onboard
>>> cap size(Jason)
>>> (3) Introduced a new attr VDPA_ATTR_VDPA_DEV_SUPPORTED_FEATURES
>>> (4) Show the changes of iproute2 output before and after 5/6 patch(Jason)
>>> (5) Fix cast warning in vdpa_fill_stats_rec()
>>>
>>> Zhu Lingshan (6):
>>>    vDPA/ifcvf: get_config_size should return a value no greater than dev
>>>      implementation
>>>    vDPA/ifcvf: support userspace to query features and MQ of a management
>>>      device
>>>    vDPA: allow userspace to query features of a vDPA device
>>>    vDPA: !FEATURES_OK should not block querying device config space
>>>    vDPA: Conditionally read fields in virtio-net dev config space
>>>    fix 'cast to restricted le16' warnings in vdpa.c
>>>
>>>   drivers/vdpa/ifcvf/ifcvf_base.c |  13 ++-
>>>   drivers/vdpa/ifcvf/ifcvf_base.h |   2 +
>>>   drivers/vdpa/ifcvf/ifcvf_main.c | 142 +++++++++++++++++---------------
>>>   drivers/vdpa/vdpa.c             |  82 ++++++++++++------
>>>   include/uapi/linux/vdpa.h       |   3 +
>>>   5 files changed, 149 insertions(+), 93 deletions(-)
>>>
>>> -- 
>>> 2.31.1


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH V5 0/6] ifcvf/vDPA: support query device config space through netlink
  2022-08-12 11:41     ` Zhu, Lingshan
@ 2022-08-15  9:36       ` Zhu, Lingshan
  0 siblings, 0 replies; 30+ messages in thread
From: Zhu, Lingshan @ 2022-08-15  9:36 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: jasowang, virtualization, netdev, kvm, parav, xieyongji, gautam.dawar



On 8/12/2022 7:41 PM, Zhu, Lingshan wrote:
>
>
> On 8/12/2022 7:17 PM, Michael S. Tsirkin wrote:
>> On Fri, Aug 12, 2022 at 07:14:39AM -0400, Michael S. Tsirkin wrote:
>>> On Fri, Aug 12, 2022 at 06:44:54PM +0800, Zhu Lingshan wrote:
>>>> This series allows userspace to query device config space of vDPA
>>>> devices and the management devices through netlink,
>>>> to get multi-queue, feature bits and etc.
>>>>
>>>> This series has introduced a new netlink attr
>>>> VDPA_ATTR_VDPA_DEV_SUPPORTED_FEATURES, this should be used to query
>>>> features of vDPA  devices than the management device.
>>>>
>>>> Please help review.
>>> I can't merge this for this merge window.
>>> Am I right when I say that the new thing here is patch 5/6 + new
>>> comments?
>>> If yes I can queue it out of the window, on top.
>> So at this point, can you please send patches on top of the vhost
>> tree? I think these are just patches 3 and 5 but please confirm.
> I will rebase them on vhost tree and resend them soon, main changes 
> are in patch 5,
> we have made MTU, MAC, MQ conditional there. And there are some new 
> comments as
> you suggested.
Hi Michael,

I have rebased patch 3/6 and 5/6, they can apply on both vhost tree
and Linus tree, the new series including these two patches are sent out.

Thanks,
Zhu Lingshan
>
>
> Thanks,
> Zhu Lingshan
>>
>>
>>>> Thanks!
>>>> Zhu Lingshan
>>>>
>>>> Changes rom V4:
>>>> (1) Read MAC, MTU, MQ conditionally (Michael)
>>>> (2) If VIRTIO_NET_F_MAC not set, don't report MAC to userspace
>>>> (3) If VIRTIO_NET_F_MTU not set, report 1500 to userspace
>>>> (4) Add comments to the new attr
>>>> VDPA_ATTR_VDPA_DEV_SUPPORTED_FEATURES(Michael)
>>>> (5) Add comments for reporting the device status as LE(Michael)
>>>>
>>>> Changes from V3:
>>>> (1)drop the fixes tags(Parva)
>>>> (2)better commit log for patch 1/6(Michael)
>>>> (3)assign num_queues to max_supported_vqs than max_vq_pairs(Jason)
>>>> (4)initialize virtio pci capabilities in the probe() function.
>>>>
>>>> Changes from V2:
>>>> Add fixes tags(Parva)
>>>>
>>>> Changes from V1:
>>>> (1) Use __virito16_to_cpu(true, xxx) for the le16 casting(Jason)
>>>> (2) Add a comment in ifcvf_get_config_size(), to explain
>>>> why we should return the minimum value of
>>>> sizeof(struct virtio_net_config) and the onboard
>>>> cap size(Jason)
>>>> (3) Introduced a new attr VDPA_ATTR_VDPA_DEV_SUPPORTED_FEATURES
>>>> (4) Show the changes of iproute2 output before and after 5/6 
>>>> patch(Jason)
>>>> (5) Fix cast warning in vdpa_fill_stats_rec()
>>>>
>>>> Zhu Lingshan (6):
>>>>    vDPA/ifcvf: get_config_size should return a value no greater 
>>>> than dev
>>>>      implementation
>>>>    vDPA/ifcvf: support userspace to query features and MQ of a 
>>>> management
>>>>      device
>>>>    vDPA: allow userspace to query features of a vDPA device
>>>>    vDPA: !FEATURES_OK should not block querying device config space
>>>>    vDPA: Conditionally read fields in virtio-net dev config space
>>>>    fix 'cast to restricted le16' warnings in vdpa.c
>>>>
>>>>   drivers/vdpa/ifcvf/ifcvf_base.c |  13 ++-
>>>>   drivers/vdpa/ifcvf/ifcvf_base.h |   2 +
>>>>   drivers/vdpa/ifcvf/ifcvf_main.c | 142 
>>>> +++++++++++++++++---------------
>>>>   drivers/vdpa/vdpa.c             |  82 ++++++++++++------
>>>>   include/uapi/linux/vdpa.h       |   3 +
>>>>   5 files changed, 149 insertions(+), 93 deletions(-)
>>>>
>>>> -- 
>>>> 2.31.1
>


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH V5 4/6] vDPA: !FEATURES_OK should not block querying device config space
  2022-08-12 10:44 ` [PATCH V5 4/6] vDPA: !FEATURES_OK should not block querying device config space Zhu Lingshan
@ 2022-08-16  7:41     ` Si-Wei Liu
  0 siblings, 0 replies; 30+ messages in thread
From: Si-Wei Liu @ 2022-08-16  7:41 UTC (permalink / raw)
  To: mst, Zhu Lingshan; +Cc: kvm, netdev, virtualization, xieyongji, gautam.dawar

Hi Michael,

I just noticed this patch got pulled to linux-next prematurely without 
getting consensus on code review, am not sure why. Hope it was just an 
oversight.

Unfortunately this introduced functionality regression to at least two 
cases so far as I see:

1. (bogus) VDPA_ATTR_DEV_NEGOTIATED_FEATURES are inadvertently exposed 
and displayed in "vdpa dev config show" before feature negotiation is 
done. Noted the corresponding features name shown in vdpa tool is called 
"negotiated_features" rather than "driver_features". I see in no way the 
intended change of the patch should break this user level expectation 
regardless of any spec requirement. Do you agree on this point?

2. There was also another implicit assumption that is broken by this 
patch. There could be a vdpa tool query of config via 
vdpa_dev_net_config_fill()->vdpa_get_config_unlocked() that races with 
the first vdpa_set_features() call from VMM e.g. QEMU. Since the 
S_FEATURES_OK blocking condition is removed, if the vdpa tool query 
occurs earlier than the first set_driver_features() call from VMM, the 
following code will treat the guest as legacy and then trigger an 
erroneous vdpa_set_features_unlocked(... , 0) call to the vdpa driver:

  374         /*
  375          * Config accesses aren't supposed to trigger before 
features are set.
  376          * If it does happen we assume a legacy guest.
  377          */
  378         if (!vdev->features_valid)
  379                 vdpa_set_features_unlocked(vdev, 0);
  380         ops->get_config(vdev, offset, buf, len);

Depending on vendor driver's implementation, L380 may either return 
invalid config data (or invalid endianness if on BE) or only config 
fields that are valid in legacy layout. What's more severe is that, vdpa 
tool query in theory shouldn't affect feature negotiation at all by 
making confusing calls to the device, but now it is possible with the 
patch. Fixing this would require more delicate work on the other paths 
involving the cf_lock reader/write semaphore.

Not sure what you plan to do next, post the fixes for both issues and 
get the community review? Or simply revert the patch in question? Let us 
know.

Thanks,
-Siwei


On 8/12/2022 3:44 AM, Zhu Lingshan wrote:
> Users may want to query the config space of a vDPA device,
> to choose a appropriate one for a certain guest. This means the
> users need to read the config space before FEATURES_OK, and
> the existence of config space contents does not depend on
> FEATURES_OK.
>
> The spec says:
> The device MUST allow reading of any device-specific configuration
> field before FEATURES_OK is set by the driver. This includes
> fields which are conditional on feature bits, as long as those
> feature bits are offered by the device.
>
> Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
> ---
>   drivers/vdpa/vdpa.c | 8 --------
>   1 file changed, 8 deletions(-)
>
> diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c
> index 6eb3d972d802..bf312d9c59ab 100644
> --- a/drivers/vdpa/vdpa.c
> +++ b/drivers/vdpa/vdpa.c
> @@ -855,17 +855,9 @@ vdpa_dev_config_fill(struct vdpa_device *vdev, struct sk_buff *msg, u32 portid,
>   {
>   	u32 device_id;
>   	void *hdr;
> -	u8 status;
>   	int err;
>   
>   	down_read(&vdev->cf_lock);
> -	status = vdev->config->get_status(vdev);
> -	if (!(status & VIRTIO_CONFIG_S_FEATURES_OK)) {
> -		NL_SET_ERR_MSG_MOD(extack, "Features negotiation not completed");
> -		err = -EAGAIN;
> -		goto out;
> -	}
> -
>   	hdr = genlmsg_put(msg, portid, seq, &vdpa_nl_family, flags,
>   			  VDPA_CMD_DEV_CONFIG_GET);
>   	if (!hdr) {

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH V5 4/6] vDPA: !FEATURES_OK should not block querying device config space
@ 2022-08-16  7:41     ` Si-Wei Liu
  0 siblings, 0 replies; 30+ messages in thread
From: Si-Wei Liu @ 2022-08-16  7:41 UTC (permalink / raw)
  To: mst, Zhu Lingshan
  Cc: virtualization, netdev, kvm, parav, xieyongji, gautam.dawar, jasowang

Hi Michael,

I just noticed this patch got pulled to linux-next prematurely without 
getting consensus on code review, am not sure why. Hope it was just an 
oversight.

Unfortunately this introduced functionality regression to at least two 
cases so far as I see:

1. (bogus) VDPA_ATTR_DEV_NEGOTIATED_FEATURES are inadvertently exposed 
and displayed in "vdpa dev config show" before feature negotiation is 
done. Noted the corresponding features name shown in vdpa tool is called 
"negotiated_features" rather than "driver_features". I see in no way the 
intended change of the patch should break this user level expectation 
regardless of any spec requirement. Do you agree on this point?

2. There was also another implicit assumption that is broken by this 
patch. There could be a vdpa tool query of config via 
vdpa_dev_net_config_fill()->vdpa_get_config_unlocked() that races with 
the first vdpa_set_features() call from VMM e.g. QEMU. Since the 
S_FEATURES_OK blocking condition is removed, if the vdpa tool query 
occurs earlier than the first set_driver_features() call from VMM, the 
following code will treat the guest as legacy and then trigger an 
erroneous vdpa_set_features_unlocked(... , 0) call to the vdpa driver:

  374         /*
  375          * Config accesses aren't supposed to trigger before 
features are set.
  376          * If it does happen we assume a legacy guest.
  377          */
  378         if (!vdev->features_valid)
  379                 vdpa_set_features_unlocked(vdev, 0);
  380         ops->get_config(vdev, offset, buf, len);

Depending on vendor driver's implementation, L380 may either return 
invalid config data (or invalid endianness if on BE) or only config 
fields that are valid in legacy layout. What's more severe is that, vdpa 
tool query in theory shouldn't affect feature negotiation at all by 
making confusing calls to the device, but now it is possible with the 
patch. Fixing this would require more delicate work on the other paths 
involving the cf_lock reader/write semaphore.

Not sure what you plan to do next, post the fixes for both issues and 
get the community review? Or simply revert the patch in question? Let us 
know.

Thanks,
-Siwei


On 8/12/2022 3:44 AM, Zhu Lingshan wrote:
> Users may want to query the config space of a vDPA device,
> to choose a appropriate one for a certain guest. This means the
> users need to read the config space before FEATURES_OK, and
> the existence of config space contents does not depend on
> FEATURES_OK.
>
> The spec says:
> The device MUST allow reading of any device-specific configuration
> field before FEATURES_OK is set by the driver. This includes
> fields which are conditional on feature bits, as long as those
> feature bits are offered by the device.
>
> Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
> ---
>   drivers/vdpa/vdpa.c | 8 --------
>   1 file changed, 8 deletions(-)
>
> diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c
> index 6eb3d972d802..bf312d9c59ab 100644
> --- a/drivers/vdpa/vdpa.c
> +++ b/drivers/vdpa/vdpa.c
> @@ -855,17 +855,9 @@ vdpa_dev_config_fill(struct vdpa_device *vdev, struct sk_buff *msg, u32 portid,
>   {
>   	u32 device_id;
>   	void *hdr;
> -	u8 status;
>   	int err;
>   
>   	down_read(&vdev->cf_lock);
> -	status = vdev->config->get_status(vdev);
> -	if (!(status & VIRTIO_CONFIG_S_FEATURES_OK)) {
> -		NL_SET_ERR_MSG_MOD(extack, "Features negotiation not completed");
> -		err = -EAGAIN;
> -		goto out;
> -	}
> -
>   	hdr = genlmsg_put(msg, portid, seq, &vdpa_nl_family, flags,
>   			  VDPA_CMD_DEV_CONFIG_GET);
>   	if (!hdr) {


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH V5 4/6] vDPA: !FEATURES_OK should not block querying device config space
  2022-08-16  7:41     ` Si-Wei Liu
@ 2022-08-16  8:23       ` Michael S. Tsirkin
  -1 siblings, 0 replies; 30+ messages in thread
From: Michael S. Tsirkin @ 2022-08-16  8:23 UTC (permalink / raw)
  To: Si-Wei Liu
  Cc: kvm, netdev, virtualization, xieyongji, gautam.dawar, Zhu Lingshan

On Tue, Aug 16, 2022 at 12:41:21AM -0700, Si-Wei Liu wrote:
> Hi Michael,
> 
> I just noticed this patch got pulled to linux-next prematurely without
> getting consensus on code review, am not sure why. Hope it was just an
> oversight.
> 
> Unfortunately this introduced functionality regression to at least two cases
> so far as I see:
> 
> 1. (bogus) VDPA_ATTR_DEV_NEGOTIATED_FEATURES are inadvertently exposed and
> displayed in "vdpa dev config show" before feature negotiation is done.
> Noted the corresponding features name shown in vdpa tool is called
> "negotiated_features" rather than "driver_features". I see in no way the
> intended change of the patch should break this user level expectation
> regardless of any spec requirement. Do you agree on this point?
> 
> 2. There was also another implicit assumption that is broken by this patch.
> There could be a vdpa tool query of config via
> vdpa_dev_net_config_fill()->vdpa_get_config_unlocked() that races with the
> first vdpa_set_features() call from VMM e.g. QEMU. Since the S_FEATURES_OK
> blocking condition is removed, if the vdpa tool query occurs earlier than
> the first set_driver_features() call from VMM, the following code will treat
> the guest as legacy and then trigger an erroneous
> vdpa_set_features_unlocked(... , 0) call to the vdpa driver:
> 
>  374         /*
>  375          * Config accesses aren't supposed to trigger before features
> are set.
>  376          * If it does happen we assume a legacy guest.
>  377          */
>  378         if (!vdev->features_valid)
>  379                 vdpa_set_features_unlocked(vdev, 0);
>  380         ops->get_config(vdev, offset, buf, len);
> Depending on vendor driver's implementation, L380 may either return invalid
> config data (or invalid endianness if on BE) or only config fields that are
> valid in legacy layout. What's more severe is that, vdpa tool query in
> theory shouldn't affect feature negotiation at all by making confusing calls
> to the device, but now it is possible with the patch. Fixing this would
> require more delicate work on the other paths involving the cf_lock
> reader/write semaphore.
> 
> Not sure what you plan to do next, post the fixes for both issues and get
> the community review? Or simply revert the patch in question? Let us know.
> 
> Thanks,
> -Siwei

If we can get fixes that's good. If not I can apply a revert.
I'm on vacation next week, you guys will have the time
to figure out the best plan of action.

-- 
MST

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH V5 4/6] vDPA: !FEATURES_OK should not block querying device config space
@ 2022-08-16  8:23       ` Michael S. Tsirkin
  0 siblings, 0 replies; 30+ messages in thread
From: Michael S. Tsirkin @ 2022-08-16  8:23 UTC (permalink / raw)
  To: Si-Wei Liu
  Cc: Zhu Lingshan, virtualization, netdev, kvm, parav, xieyongji,
	gautam.dawar, jasowang

On Tue, Aug 16, 2022 at 12:41:21AM -0700, Si-Wei Liu wrote:
> Hi Michael,
> 
> I just noticed this patch got pulled to linux-next prematurely without
> getting consensus on code review, am not sure why. Hope it was just an
> oversight.
> 
> Unfortunately this introduced functionality regression to at least two cases
> so far as I see:
> 
> 1. (bogus) VDPA_ATTR_DEV_NEGOTIATED_FEATURES are inadvertently exposed and
> displayed in "vdpa dev config show" before feature negotiation is done.
> Noted the corresponding features name shown in vdpa tool is called
> "negotiated_features" rather than "driver_features". I see in no way the
> intended change of the patch should break this user level expectation
> regardless of any spec requirement. Do you agree on this point?
> 
> 2. There was also another implicit assumption that is broken by this patch.
> There could be a vdpa tool query of config via
> vdpa_dev_net_config_fill()->vdpa_get_config_unlocked() that races with the
> first vdpa_set_features() call from VMM e.g. QEMU. Since the S_FEATURES_OK
> blocking condition is removed, if the vdpa tool query occurs earlier than
> the first set_driver_features() call from VMM, the following code will treat
> the guest as legacy and then trigger an erroneous
> vdpa_set_features_unlocked(... , 0) call to the vdpa driver:
> 
>  374         /*
>  375          * Config accesses aren't supposed to trigger before features
> are set.
>  376          * If it does happen we assume a legacy guest.
>  377          */
>  378         if (!vdev->features_valid)
>  379                 vdpa_set_features_unlocked(vdev, 0);
>  380         ops->get_config(vdev, offset, buf, len);
> Depending on vendor driver's implementation, L380 may either return invalid
> config data (or invalid endianness if on BE) or only config fields that are
> valid in legacy layout. What's more severe is that, vdpa tool query in
> theory shouldn't affect feature negotiation at all by making confusing calls
> to the device, but now it is possible with the patch. Fixing this would
> require more delicate work on the other paths involving the cf_lock
> reader/write semaphore.
> 
> Not sure what you plan to do next, post the fixes for both issues and get
> the community review? Or simply revert the patch in question? Let us know.
> 
> Thanks,
> -Siwei

If we can get fixes that's good. If not I can apply a revert.
I'm on vacation next week, you guys will have the time
to figure out the best plan of action.

-- 
MST


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH V5 4/6] vDPA: !FEATURES_OK should not block querying device config space
       [not found]     ` <f2864c96-cddd-129e-7dd8-a3743fe7e0d0@intel.com>
@ 2022-08-16  8:41         ` Michael S. Tsirkin
  2022-08-16 22:48       ` Si-Wei Liu
  1 sibling, 0 replies; 30+ messages in thread
From: Michael S. Tsirkin @ 2022-08-16  8:41 UTC (permalink / raw)
  To: Zhu, Lingshan; +Cc: kvm, netdev, virtualization, xieyongji, gautam.dawar

On Tue, Aug 16, 2022 at 04:29:04PM +0800, Zhu, Lingshan wrote:
> 
> 
> On 8/16/2022 3:41 PM, Si-Wei Liu wrote:
> 
>     Hi Michael,
> 
>     I just noticed this patch got pulled to linux-next prematurely without
>     getting consensus on code review, am not sure why. Hope it was just an
>     oversight.
> 
>     Unfortunately this introduced functionality regression to at least two
>     cases so far as I see:
> 
>     1. (bogus) VDPA_ATTR_DEV_NEGOTIATED_FEATURES are inadvertently exposed and
>     displayed in "vdpa dev config show" before feature negotiation is done.
>     Noted the corresponding features name shown in vdpa tool is called
>     "negotiated_features" rather than "driver_features". I see in no way the
>     intended change of the patch should break this user level expectation
>     regardless of any spec requirement. Do you agree on this point?
> 
> I will post a patch for iptour2, doing:
> 1) if iprout2 does not get driver_features from the kernel, then don't show
> negotiated features in the command output
> 2) process and decoding the device features.
> 
> 
>     2. There was also another implicit assumption that is broken by this patch.
>     There could be a vdpa tool query of config via vdpa_dev_net_config_fill()->
>     vdpa_get_config_unlocked() that races with the first vdpa_set_features()
>     call from VMM e.g. QEMU. Since the S_FEATURES_OK blocking condition is
>     removed, if the vdpa tool query occurs earlier than the first
>     set_driver_features() call from VMM, the following code will treat the
>     guest as legacy and then trigger an erroneous vdpa_set_features_unlocked
>     (... , 0) call to the vdpa driver:
> 
>      374         /*
>      375          * Config accesses aren't supposed to trigger before features
>     are set.
>      376          * If it does happen we assume a legacy guest.
>      377          */
>      378         if (!vdev->features_valid)
>      379                 vdpa_set_features_unlocked(vdev, 0);
>      380         ops->get_config(vdev, offset, buf, len);
> 
>     Depending on vendor driver's implementation, L380 may either return invalid
>     config data (or invalid endianness if on BE) or only config fields that are
>     valid in legacy layout. What's more severe is that, vdpa tool query in
>     theory shouldn't affect feature negotiation at all by making confusing
>     calls to the device, but now it is possible with the patch. Fixing this
>     would require more delicate work on the other paths involving the cf_lock
>     reader/write semaphore.
> 
>     Not sure what you plan to do next, post the fixes for both issues and get
>     the community review? Or simply revert the patch in question? Let us know.
> 
> The spec says:
> The device MUST allow reading of any device-specific configuration field before
> FEATURES_OK is set by
> the driver. This includes fields which are conditional on feature bits, as long
> as those feature bits are offered
> by the device.
> 
> so whether FEATURES_OK should not block reading the device config space. 
> vdpa_get_config_unlocked() will read the features, I don't know why it has a
> comment:
>         /*
>          * Config accesses aren't supposed to trigger before features are set.
>          * If it does happen we assume a legacy guest.
>          */
> 
> This conflicts with the spec.

Yea well. On the other hand the spec also calls for features to be
used to detect legacy versus modern driver.
This part of the spec needs work generally.


> vdpa_get_config_unlocked() checks vdev->features_valid, if not valid, it will
> set the drivers_features 0, I think this intends to prevent reading random
> driver_features. This function does not hold any locks, and didn't change
> anything.
> 
> So what is the race?
> 
> Thanks
> 
> 
> 
>     Thanks,
>     -Siwei
> 
> 
>     On 8/12/2022 3:44 AM, Zhu Lingshan wrote:
> 
>         Users may want to query the config space of a vDPA device,
>         to choose a appropriate one for a certain guest. This means the
>         users need to read the config space before FEATURES_OK, and
>         the existence of config space contents does not depend on
>         FEATURES_OK.
> 
>         The spec says:
>         The device MUST allow reading of any device-specific configuration
>         field before FEATURES_OK is set by the driver. This includes
>         fields which are conditional on feature bits, as long as those
>         feature bits are offered by the device.
> 
>         Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
>         ---
>           drivers/vdpa/vdpa.c | 8 --------
>           1 file changed, 8 deletions(-)
> 
>         diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c
>         index 6eb3d972d802..bf312d9c59ab 100644
>         --- a/drivers/vdpa/vdpa.c
>         +++ b/drivers/vdpa/vdpa.c
>         @@ -855,17 +855,9 @@ vdpa_dev_config_fill(struct vdpa_device *vdev,
>         struct sk_buff *msg, u32 portid,
>           {
>               u32 device_id;
>               void *hdr;
>         -    u8 status;
>               int err;
>                 down_read(&vdev->cf_lock);
>         -    status = vdev->config->get_status(vdev);
>         -    if (!(status & VIRTIO_CONFIG_S_FEATURES_OK)) {
>         -        NL_SET_ERR_MSG_MOD(extack, "Features negotiation not
>         completed");
>         -        err = -EAGAIN;
>         -        goto out;
>         -    }
>         -
>               hdr = genlmsg_put(msg, portid, seq, &vdpa_nl_family, flags,
>                         VDPA_CMD_DEV_CONFIG_GET);
>               if (!hdr) {
> 
> 
> 
> 

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH V5 4/6] vDPA: !FEATURES_OK should not block querying device config space
@ 2022-08-16  8:41         ` Michael S. Tsirkin
  0 siblings, 0 replies; 30+ messages in thread
From: Michael S. Tsirkin @ 2022-08-16  8:41 UTC (permalink / raw)
  To: Zhu, Lingshan
  Cc: Si-Wei Liu, virtualization, netdev, kvm, parav, xieyongji,
	gautam.dawar, jasowang

On Tue, Aug 16, 2022 at 04:29:04PM +0800, Zhu, Lingshan wrote:
> 
> 
> On 8/16/2022 3:41 PM, Si-Wei Liu wrote:
> 
>     Hi Michael,
> 
>     I just noticed this patch got pulled to linux-next prematurely without
>     getting consensus on code review, am not sure why. Hope it was just an
>     oversight.
> 
>     Unfortunately this introduced functionality regression to at least two
>     cases so far as I see:
> 
>     1. (bogus) VDPA_ATTR_DEV_NEGOTIATED_FEATURES are inadvertently exposed and
>     displayed in "vdpa dev config show" before feature negotiation is done.
>     Noted the corresponding features name shown in vdpa tool is called
>     "negotiated_features" rather than "driver_features". I see in no way the
>     intended change of the patch should break this user level expectation
>     regardless of any spec requirement. Do you agree on this point?
> 
> I will post a patch for iptour2, doing:
> 1) if iprout2 does not get driver_features from the kernel, then don't show
> negotiated features in the command output
> 2) process and decoding the device features.
> 
> 
>     2. There was also another implicit assumption that is broken by this patch.
>     There could be a vdpa tool query of config via vdpa_dev_net_config_fill()->
>     vdpa_get_config_unlocked() that races with the first vdpa_set_features()
>     call from VMM e.g. QEMU. Since the S_FEATURES_OK blocking condition is
>     removed, if the vdpa tool query occurs earlier than the first
>     set_driver_features() call from VMM, the following code will treat the
>     guest as legacy and then trigger an erroneous vdpa_set_features_unlocked
>     (... , 0) call to the vdpa driver:
> 
>      374         /*
>      375          * Config accesses aren't supposed to trigger before features
>     are set.
>      376          * If it does happen we assume a legacy guest.
>      377          */
>      378         if (!vdev->features_valid)
>      379                 vdpa_set_features_unlocked(vdev, 0);
>      380         ops->get_config(vdev, offset, buf, len);
> 
>     Depending on vendor driver's implementation, L380 may either return invalid
>     config data (or invalid endianness if on BE) or only config fields that are
>     valid in legacy layout. What's more severe is that, vdpa tool query in
>     theory shouldn't affect feature negotiation at all by making confusing
>     calls to the device, but now it is possible with the patch. Fixing this
>     would require more delicate work on the other paths involving the cf_lock
>     reader/write semaphore.
> 
>     Not sure what you plan to do next, post the fixes for both issues and get
>     the community review? Or simply revert the patch in question? Let us know.
> 
> The spec says:
> The device MUST allow reading of any device-specific configuration field before
> FEATURES_OK is set by
> the driver. This includes fields which are conditional on feature bits, as long
> as those feature bits are offered
> by the device.
> 
> so whether FEATURES_OK should not block reading the device config space. 
> vdpa_get_config_unlocked() will read the features, I don't know why it has a
> comment:
>         /*
>          * Config accesses aren't supposed to trigger before features are set.
>          * If it does happen we assume a legacy guest.
>          */
> 
> This conflicts with the spec.

Yea well. On the other hand the spec also calls for features to be
used to detect legacy versus modern driver.
This part of the spec needs work generally.


> vdpa_get_config_unlocked() checks vdev->features_valid, if not valid, it will
> set the drivers_features 0, I think this intends to prevent reading random
> driver_features. This function does not hold any locks, and didn't change
> anything.
> 
> So what is the race?
> 
> Thanks
> 
> 
> 
>     Thanks,
>     -Siwei
> 
> 
>     On 8/12/2022 3:44 AM, Zhu Lingshan wrote:
> 
>         Users may want to query the config space of a vDPA device,
>         to choose a appropriate one for a certain guest. This means the
>         users need to read the config space before FEATURES_OK, and
>         the existence of config space contents does not depend on
>         FEATURES_OK.
> 
>         The spec says:
>         The device MUST allow reading of any device-specific configuration
>         field before FEATURES_OK is set by the driver. This includes
>         fields which are conditional on feature bits, as long as those
>         feature bits are offered by the device.
> 
>         Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
>         ---
>           drivers/vdpa/vdpa.c | 8 --------
>           1 file changed, 8 deletions(-)
> 
>         diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c
>         index 6eb3d972d802..bf312d9c59ab 100644
>         --- a/drivers/vdpa/vdpa.c
>         +++ b/drivers/vdpa/vdpa.c
>         @@ -855,17 +855,9 @@ vdpa_dev_config_fill(struct vdpa_device *vdev,
>         struct sk_buff *msg, u32 portid,
>           {
>               u32 device_id;
>               void *hdr;
>         -    u8 status;
>               int err;
>                 down_read(&vdev->cf_lock);
>         -    status = vdev->config->get_status(vdev);
>         -    if (!(status & VIRTIO_CONFIG_S_FEATURES_OK)) {
>         -        NL_SET_ERR_MSG_MOD(extack, "Features negotiation not
>         completed");
>         -        err = -EAGAIN;
>         -        goto out;
>         -    }
>         -
>               hdr = genlmsg_put(msg, portid, seq, &vdpa_nl_family, flags,
>                         VDPA_CMD_DEV_CONFIG_GET);
>               if (!hdr) {
> 
> 
> 
> 


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH V5 4/6] vDPA: !FEATURES_OK should not block querying device config space
  2022-08-16  8:41         ` Michael S. Tsirkin
  (?)
@ 2022-08-16  8:46         ` Zhu, Lingshan
  -1 siblings, 0 replies; 30+ messages in thread
From: Zhu, Lingshan @ 2022-08-16  8:46 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Si-Wei Liu, virtualization, netdev, kvm, parav, xieyongji,
	gautam.dawar, jasowang



On 8/16/2022 4:41 PM, Michael S. Tsirkin wrote:
> On Tue, Aug 16, 2022 at 04:29:04PM +0800, Zhu, Lingshan wrote:
>>
>> On 8/16/2022 3:41 PM, Si-Wei Liu wrote:
>>
>>      Hi Michael,
>>
>>      I just noticed this patch got pulled to linux-next prematurely without
>>      getting consensus on code review, am not sure why. Hope it was just an
>>      oversight.
>>
>>      Unfortunately this introduced functionality regression to at least two
>>      cases so far as I see:
>>
>>      1. (bogus) VDPA_ATTR_DEV_NEGOTIATED_FEATURES are inadvertently exposed and
>>      displayed in "vdpa dev config show" before feature negotiation is done.
>>      Noted the corresponding features name shown in vdpa tool is called
>>      "negotiated_features" rather than "driver_features". I see in no way the
>>      intended change of the patch should break this user level expectation
>>      regardless of any spec requirement. Do you agree on this point?
>>
>> I will post a patch for iptour2, doing:
>> 1) if iprout2 does not get driver_features from the kernel, then don't show
>> negotiated features in the command output
>> 2) process and decoding the device features.
>>
>>
>>      2. There was also another implicit assumption that is broken by this patch.
>>      There could be a vdpa tool query of config via vdpa_dev_net_config_fill()->
>>      vdpa_get_config_unlocked() that races with the first vdpa_set_features()
>>      call from VMM e.g. QEMU. Since the S_FEATURES_OK blocking condition is
>>      removed, if the vdpa tool query occurs earlier than the first
>>      set_driver_features() call from VMM, the following code will treat the
>>      guest as legacy and then trigger an erroneous vdpa_set_features_unlocked
>>      (... , 0) call to the vdpa driver:
>>
>>       374         /*
>>       375          * Config accesses aren't supposed to trigger before features
>>      are set.
>>       376          * If it does happen we assume a legacy guest.
>>       377          */
>>       378         if (!vdev->features_valid)
>>       379                 vdpa_set_features_unlocked(vdev, 0);
>>       380         ops->get_config(vdev, offset, buf, len);
>>
>>      Depending on vendor driver's implementation, L380 may either return invalid
>>      config data (or invalid endianness if on BE) or only config fields that are
>>      valid in legacy layout. What's more severe is that, vdpa tool query in
>>      theory shouldn't affect feature negotiation at all by making confusing
>>      calls to the device, but now it is possible with the patch. Fixing this
>>      would require more delicate work on the other paths involving the cf_lock
>>      reader/write semaphore.
>>
>>      Not sure what you plan to do next, post the fixes for both issues and get
>>      the community review? Or simply revert the patch in question? Let us know.
>>
>> The spec says:
>> The device MUST allow reading of any device-specific configuration field before
>> FEATURES_OK is set by
>> the driver. This includes fields which are conditional on feature bits, as long
>> as those feature bits are offered
>> by the device.
>>
>> so whether FEATURES_OK should not block reading the device config space.
>> vdpa_get_config_unlocked() will read the features, I don't know why it has a
>> comment:
>>          /*
>>           * Config accesses aren't supposed to trigger before features are set.
>>           * If it does happen we assume a legacy guest.
>>           */
>>
>> This conflicts with the spec.
> Yea well. On the other hand the spec also calls for features to be
> used to detect legacy versus modern driver.
> This part of the spec needs work generally.
so from what I see, there are no race conditions, if features 
negotiation not done,
just assume the driver features are all zero, then return the device 
config space contents.
It can do this even without this comment.

Please help correct me if I misunderstand these

Thanks
Zhu Lingshan
>
>
>> vdpa_get_config_unlocked() checks vdev->features_valid, if not valid, it will
>> set the drivers_features 0, I think this intends to prevent reading random
>> driver_features. This function does not hold any locks, and didn't change
>> anything.
>>
>> So what is the race?
>>
>> Thanks
>>
>>
>>
>>      Thanks,
>>      -Siwei
>>
>>
>>      On 8/12/2022 3:44 AM, Zhu Lingshan wrote:
>>
>>          Users may want to query the config space of a vDPA device,
>>          to choose a appropriate one for a certain guest. This means the
>>          users need to read the config space before FEATURES_OK, and
>>          the existence of config space contents does not depend on
>>          FEATURES_OK.
>>
>>          The spec says:
>>          The device MUST allow reading of any device-specific configuration
>>          field before FEATURES_OK is set by the driver. This includes
>>          fields which are conditional on feature bits, as long as those
>>          feature bits are offered by the device.
>>
>>          Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
>>          ---
>>            drivers/vdpa/vdpa.c | 8 --------
>>            1 file changed, 8 deletions(-)
>>
>>          diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c
>>          index 6eb3d972d802..bf312d9c59ab 100644
>>          --- a/drivers/vdpa/vdpa.c
>>          +++ b/drivers/vdpa/vdpa.c
>>          @@ -855,17 +855,9 @@ vdpa_dev_config_fill(struct vdpa_device *vdev,
>>          struct sk_buff *msg, u32 portid,
>>            {
>>                u32 device_id;
>>                void *hdr;
>>          -    u8 status;
>>                int err;
>>                  down_read(&vdev->cf_lock);
>>          -    status = vdev->config->get_status(vdev);
>>          -    if (!(status & VIRTIO_CONFIG_S_FEATURES_OK)) {
>>          -        NL_SET_ERR_MSG_MOD(extack, "Features negotiation not
>>          completed");
>>          -        err = -EAGAIN;
>>          -        goto out;
>>          -    }
>>          -
>>                hdr = genlmsg_put(msg, portid, seq, &vdpa_nl_family, flags,
>>                          VDPA_CMD_DEV_CONFIG_GET);
>>                if (!hdr) {
>>
>>
>>
>>


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH V5 4/6] vDPA: !FEATURES_OK should not block querying device config space
  2022-08-16  7:41     ` Si-Wei Liu
@ 2022-08-16 21:13       ` Michael S. Tsirkin
  -1 siblings, 0 replies; 30+ messages in thread
From: Michael S. Tsirkin @ 2022-08-16 21:13 UTC (permalink / raw)
  To: Si-Wei Liu
  Cc: Zhu Lingshan, virtualization, netdev, kvm, parav, xieyongji,
	gautam.dawar, jasowang

On Tue, Aug 16, 2022 at 12:41:21AM -0700, Si-Wei Liu wrote:
> Hi Michael,
> 
> I just noticed this patch got pulled to linux-next prematurely without
> getting consensus on code review, am not sure why. Hope it was just an
> oversight.
> 
> Unfortunately this introduced functionality regression to at least two cases
> so far as I see:
> 
> 1. (bogus) VDPA_ATTR_DEV_NEGOTIATED_FEATURES are inadvertently exposed and
> displayed in "vdpa dev config show" before feature negotiation is done.
> Noted the corresponding features name shown in vdpa tool is called
> "negotiated_features" rather than "driver_features". I see in no way the
> intended change of the patch should break this user level expectation
> regardless of any spec requirement. Do you agree on this point?
> 
> 2. There was also another implicit assumption that is broken by this patch.
> There could be a vdpa tool query of config via
> vdpa_dev_net_config_fill()->vdpa_get_config_unlocked() that races with the
> first vdpa_set_features() call from VMM e.g. QEMU. Since the S_FEATURES_OK
> blocking condition is removed, if the vdpa tool query occurs earlier than
> the first set_driver_features() call from VMM, the following code will treat
> the guest as legacy and then trigger an erroneous
> vdpa_set_features_unlocked(... , 0) call to the vdpa driver:
> 
>  374         /*
>  375          * Config accesses aren't supposed to trigger before features
> are set.
>  376          * If it does happen we assume a legacy guest.
>  377          */
>  378         if (!vdev->features_valid)
>  379                 vdpa_set_features_unlocked(vdev, 0);
>  380         ops->get_config(vdev, offset, buf, len);
> 
> Depending on vendor driver's implementation, L380 may either return invalid
> config data (or invalid endianness if on BE) or only config fields that are
> valid in legacy layout. What's more severe is that, vdpa tool query in
> theory shouldn't affect feature negotiation at all by making confusing calls
> to the device, but now it is possible with the patch. Fixing this would
> require more delicate work on the other paths involving the cf_lock
> reader/write semaphore.
> 
> Not sure what you plan to do next, post the fixes for both issues and get
> the community review? Or simply revert the patch in question? Let us know.
> 
> Thanks,
> -Siwei
> 

I'm not sure who you are asking. I didn't realize this is so
controversial. If you feel it should be reverted I suggest
you post a revert patch with a detailed motivation and this
will get the discussion going.
It will also help if you stress whether you describe theoretical
issues or something observed in practice above
discussion does not make this clear.

-- 
MST


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH V5 4/6] vDPA: !FEATURES_OK should not block querying device config space
@ 2022-08-16 21:13       ` Michael S. Tsirkin
  0 siblings, 0 replies; 30+ messages in thread
From: Michael S. Tsirkin @ 2022-08-16 21:13 UTC (permalink / raw)
  To: Si-Wei Liu
  Cc: kvm, netdev, virtualization, xieyongji, gautam.dawar, Zhu Lingshan

On Tue, Aug 16, 2022 at 12:41:21AM -0700, Si-Wei Liu wrote:
> Hi Michael,
> 
> I just noticed this patch got pulled to linux-next prematurely without
> getting consensus on code review, am not sure why. Hope it was just an
> oversight.
> 
> Unfortunately this introduced functionality regression to at least two cases
> so far as I see:
> 
> 1. (bogus) VDPA_ATTR_DEV_NEGOTIATED_FEATURES are inadvertently exposed and
> displayed in "vdpa dev config show" before feature negotiation is done.
> Noted the corresponding features name shown in vdpa tool is called
> "negotiated_features" rather than "driver_features". I see in no way the
> intended change of the patch should break this user level expectation
> regardless of any spec requirement. Do you agree on this point?
> 
> 2. There was also another implicit assumption that is broken by this patch.
> There could be a vdpa tool query of config via
> vdpa_dev_net_config_fill()->vdpa_get_config_unlocked() that races with the
> first vdpa_set_features() call from VMM e.g. QEMU. Since the S_FEATURES_OK
> blocking condition is removed, if the vdpa tool query occurs earlier than
> the first set_driver_features() call from VMM, the following code will treat
> the guest as legacy and then trigger an erroneous
> vdpa_set_features_unlocked(... , 0) call to the vdpa driver:
> 
>  374         /*
>  375          * Config accesses aren't supposed to trigger before features
> are set.
>  376          * If it does happen we assume a legacy guest.
>  377          */
>  378         if (!vdev->features_valid)
>  379                 vdpa_set_features_unlocked(vdev, 0);
>  380         ops->get_config(vdev, offset, buf, len);
> 
> Depending on vendor driver's implementation, L380 may either return invalid
> config data (or invalid endianness if on BE) or only config fields that are
> valid in legacy layout. What's more severe is that, vdpa tool query in
> theory shouldn't affect feature negotiation at all by making confusing calls
> to the device, but now it is possible with the patch. Fixing this would
> require more delicate work on the other paths involving the cf_lock
> reader/write semaphore.
> 
> Not sure what you plan to do next, post the fixes for both issues and get
> the community review? Or simply revert the patch in question? Let us know.
> 
> Thanks,
> -Siwei
> 

I'm not sure who you are asking. I didn't realize this is so
controversial. If you feel it should be reverted I suggest
you post a revert patch with a detailed motivation and this
will get the discussion going.
It will also help if you stress whether you describe theoretical
issues or something observed in practice above
discussion does not make this clear.

-- 
MST

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH V5 4/6] vDPA: !FEATURES_OK should not block querying device config space
  2022-08-16 21:13       ` Michael S. Tsirkin
@ 2022-08-16 22:09         ` Si-Wei Liu
  -1 siblings, 0 replies; 30+ messages in thread
From: Si-Wei Liu @ 2022-08-16 22:09 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Zhu Lingshan, virtualization, netdev, kvm, parav, xieyongji,
	gautam.dawar, jasowang



On 8/16/2022 2:13 PM, Michael S. Tsirkin wrote:
> On Tue, Aug 16, 2022 at 12:41:21AM -0700, Si-Wei Liu wrote:
>> Hi Michael,
>>
>> I just noticed this patch got pulled to linux-next prematurely without
>> getting consensus on code review, am not sure why. Hope it was just an
>> oversight.
>>
>> Unfortunately this introduced functionality regression to at least two cases
>> so far as I see:
>>
>> 1. (bogus) VDPA_ATTR_DEV_NEGOTIATED_FEATURES are inadvertently exposed and
>> displayed in "vdpa dev config show" before feature negotiation is done.
>> Noted the corresponding features name shown in vdpa tool is called
>> "negotiated_features" rather than "driver_features". I see in no way the
>> intended change of the patch should break this user level expectation
>> regardless of any spec requirement. Do you agree on this point?
>>
>> 2. There was also another implicit assumption that is broken by this patch.
>> There could be a vdpa tool query of config via
>> vdpa_dev_net_config_fill()->vdpa_get_config_unlocked() that races with the
>> first vdpa_set_features() call from VMM e.g. QEMU. Since the S_FEATURES_OK
>> blocking condition is removed, if the vdpa tool query occurs earlier than
>> the first set_driver_features() call from VMM, the following code will treat
>> the guest as legacy and then trigger an erroneous
>> vdpa_set_features_unlocked(... , 0) call to the vdpa driver:
>>
>>   374         /*
>>   375          * Config accesses aren't supposed to trigger before features
>> are set.
>>   376          * If it does happen we assume a legacy guest.
>>   377          */
>>   378         if (!vdev->features_valid)
>>   379                 vdpa_set_features_unlocked(vdev, 0);
>>   380         ops->get_config(vdev, offset, buf, len);
>>
>> Depending on vendor driver's implementation, L380 may either return invalid
>> config data (or invalid endianness if on BE) or only config fields that are
>> valid in legacy layout. What's more severe is that, vdpa tool query in
>> theory shouldn't affect feature negotiation at all by making confusing calls
>> to the device, but now it is possible with the patch. Fixing this would
>> require more delicate work on the other paths involving the cf_lock
>> reader/write semaphore.
>>
>> Not sure what you plan to do next, post the fixes for both issues and get
>> the community review? Or simply revert the patch in question? Let us know.
>>
>> Thanks,
>> -Siwei
>>
> I'm not sure who you are asking. I didn't realize this is so
> controversial. If you feel it should be reverted I suggest
> you post a revert patch with a detailed motivation and this
> will get the discussion going.
Leave it around then, until the next person shout out aloud. I don't 
mind taking personal time to help, though my impression of the past 
conversation is that this is less productive way of cooperation and 
collaboration.

Please safely ignore me from now on.

-Siwei


> It will also help if you stress whether you describe theoretical
> issues or something observed in practice above
> discussion does not make this clear.
>


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH V5 4/6] vDPA: !FEATURES_OK should not block querying device config space
@ 2022-08-16 22:09         ` Si-Wei Liu
  0 siblings, 0 replies; 30+ messages in thread
From: Si-Wei Liu @ 2022-08-16 22:09 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: kvm, netdev, virtualization, xieyongji, gautam.dawar, Zhu Lingshan



On 8/16/2022 2:13 PM, Michael S. Tsirkin wrote:
> On Tue, Aug 16, 2022 at 12:41:21AM -0700, Si-Wei Liu wrote:
>> Hi Michael,
>>
>> I just noticed this patch got pulled to linux-next prematurely without
>> getting consensus on code review, am not sure why. Hope it was just an
>> oversight.
>>
>> Unfortunately this introduced functionality regression to at least two cases
>> so far as I see:
>>
>> 1. (bogus) VDPA_ATTR_DEV_NEGOTIATED_FEATURES are inadvertently exposed and
>> displayed in "vdpa dev config show" before feature negotiation is done.
>> Noted the corresponding features name shown in vdpa tool is called
>> "negotiated_features" rather than "driver_features". I see in no way the
>> intended change of the patch should break this user level expectation
>> regardless of any spec requirement. Do you agree on this point?
>>
>> 2. There was also another implicit assumption that is broken by this patch.
>> There could be a vdpa tool query of config via
>> vdpa_dev_net_config_fill()->vdpa_get_config_unlocked() that races with the
>> first vdpa_set_features() call from VMM e.g. QEMU. Since the S_FEATURES_OK
>> blocking condition is removed, if the vdpa tool query occurs earlier than
>> the first set_driver_features() call from VMM, the following code will treat
>> the guest as legacy and then trigger an erroneous
>> vdpa_set_features_unlocked(... , 0) call to the vdpa driver:
>>
>>   374         /*
>>   375          * Config accesses aren't supposed to trigger before features
>> are set.
>>   376          * If it does happen we assume a legacy guest.
>>   377          */
>>   378         if (!vdev->features_valid)
>>   379                 vdpa_set_features_unlocked(vdev, 0);
>>   380         ops->get_config(vdev, offset, buf, len);
>>
>> Depending on vendor driver's implementation, L380 may either return invalid
>> config data (or invalid endianness if on BE) or only config fields that are
>> valid in legacy layout. What's more severe is that, vdpa tool query in
>> theory shouldn't affect feature negotiation at all by making confusing calls
>> to the device, but now it is possible with the patch. Fixing this would
>> require more delicate work on the other paths involving the cf_lock
>> reader/write semaphore.
>>
>> Not sure what you plan to do next, post the fixes for both issues and get
>> the community review? Or simply revert the patch in question? Let us know.
>>
>> Thanks,
>> -Siwei
>>
> I'm not sure who you are asking. I didn't realize this is so
> controversial. If you feel it should be reverted I suggest
> you post a revert patch with a detailed motivation and this
> will get the discussion going.
Leave it around then, until the next person shout out aloud. I don't 
mind taking personal time to help, though my impression of the past 
conversation is that this is less productive way of cooperation and 
collaboration.

Please safely ignore me from now on.

-Siwei


> It will also help if you stress whether you describe theoretical
> issues or something observed in practice above
> discussion does not make this clear.
>

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH V5 4/6] vDPA: !FEATURES_OK should not block querying device config space
       [not found]     ` <f2864c96-cddd-129e-7dd8-a3743fe7e0d0@intel.com>
  2022-08-16  8:41         ` Michael S. Tsirkin
@ 2022-08-16 22:48       ` Si-Wei Liu
       [not found]         ` <a488a17a-b716-52aa-cc31-2e51f8f972d2@intel.com>
  1 sibling, 1 reply; 30+ messages in thread
From: Si-Wei Liu @ 2022-08-16 22:48 UTC (permalink / raw)
  To: Zhu, Lingshan, mst; +Cc: kvm, netdev, virtualization, xieyongji, gautam.dawar


[-- Attachment #1.1: Type: text/plain, Size: 6173 bytes --]



On 8/16/2022 1:29 AM, Zhu, Lingshan wrote:
>
>
> On 8/16/2022 3:41 PM, Si-Wei Liu wrote:
>> Hi Michael,
>>
>> I just noticed this patch got pulled to linux-next prematurely 
>> without getting consensus on code review, am not sure why. Hope it 
>> was just an oversight.
>>
>> Unfortunately this introduced functionality regression to at least 
>> two cases so far as I see:
>>
>> 1. (bogus) VDPA_ATTR_DEV_NEGOTIATED_FEATURES are inadvertently 
>> exposed and displayed in "vdpa dev config show" before feature 
>> negotiation is done. Noted the corresponding features name shown in 
>> vdpa tool is called "negotiated_features" rather than 
>> "driver_features". I see in no way the intended change of the patch 
>> should break this user level expectation regardless of any spec 
>> requirement. Do you agree on this point?
> I will post a patch for iptour2, doing:
> 1) if iprout2 does not get driver_features from the kernel, then don't 
> show negotiated features in the command output
This won't work as the vdpa userspace tool won't know *when* features 
are negotiated. There's no guarantee in the kernel to assume 0 will be 
returned from vendor driver during negotiation. On the other hand, with 
the supposed change, userspace can't tell if there's really none of 
features negotiated, or the feature negotiation is over. Before the 
change the userspace either gets all the attributes when feature 
negotiation is over, or it gets nothing when it's ongoing, so there was 
a distinction.This expectation of what "negotiated_features" represents 
is established from day one, I see no reason the intended kernel change 
to show other attributes should break userspace behavior and user's 
expectation.
> 2) process and decoding the device features.
>>
>> 2. There was also another implicit assumption that is broken by this 
>> patch. There could be a vdpa tool query of config via 
>> vdpa_dev_net_config_fill()->vdpa_get_config_unlocked() that races 
>> with the first vdpa_set_features() call from VMM e.g. QEMU. Since the 
>> S_FEATURES_OK blocking condition is removed, if the vdpa tool query 
>> occurs earlier than the first set_driver_features() call from VMM, 
>> the following code will treat the guest as legacy and then trigger an 
>> erroneous vdpa_set_features_unlocked(... , 0) call to the vdpa driver:
>>
>>  374         /*
>>  375          * Config accesses aren't supposed to trigger before 
>> features are set.
>>  376          * If it does happen we assume a legacy guest.
>>  377          */
>>  378         if (!vdev->features_valid)
>>  379                 vdpa_set_features_unlocked(vdev, 0);
>>  380         ops->get_config(vdev, offset, buf, len);
>>
>> Depending on vendor driver's implementation, L380 may either return 
>> invalid config data (or invalid endianness if on BE) or only config 
>> fields that are valid in legacy layout. What's more severe is that, 
>> vdpa tool query in theory shouldn't affect feature negotiation at all 
>> by making confusing calls to the device, but now it is possible with 
>> the patch. Fixing this would require more delicate work on the other 
>> paths involving the cf_lock reader/write semaphore.
>>
>> Not sure what you plan to do next, post the fixes for both issues and 
>> get the community review? Or simply revert the patch in question? Let 
>> us know.
> The spec says:
> The device MUST allow reading of any device-specific configuration 
> field before FEATURES_OK is set by
> the driver. This includes fields which are conditional on feature 
> bits, as long as those feature bits are offered
> by the device.
>
> so whether FEATURES_OK should not block reading the device config 
> space. vdpa_get_config_unlocked() will read the features, I don't know 
> why it has a comment:
>         /*
>          * Config accesses aren't supposed to trigger before features 
> are set.
>          * If it does happen we assume a legacy guest.
>          */
>
> This conflicts with the spec.
>
> vdpa_get_config_unlocked() checks vdev->features_valid, if not valid, 
> it will set the drivers_features 0, I think this intends to prevent 
> reading random driver_features. This function does not hold any locks, 
> and didn't change anything.
>
> So what is the race?
You'll see the race if you keep 'vdpa dev config show ...' running in a 
tight loop while launching a VM with the vDPA device under query.

-Siwei


>
> Thanks
>
>>
>> Thanks,
>> -Siwei
>>
>>
>> On 8/12/2022 3:44 AM, Zhu Lingshan wrote:
>>> Users may want to query the config space of a vDPA device,
>>> to choose a appropriate one for a certain guest. This means the
>>> users need to read the config space before FEATURES_OK, and
>>> the existence of config space contents does not depend on
>>> FEATURES_OK.
>>>
>>> The spec says:
>>> The device MUST allow reading of any device-specific configuration
>>> field before FEATURES_OK is set by the driver. This includes
>>> fields which are conditional on feature bits, as long as those
>>> feature bits are offered by the device.
>>>
>>> Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
>>> ---
>>>   drivers/vdpa/vdpa.c | 8 --------
>>>   1 file changed, 8 deletions(-)
>>>
>>> diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c
>>> index 6eb3d972d802..bf312d9c59ab 100644
>>> --- a/drivers/vdpa/vdpa.c
>>> +++ b/drivers/vdpa/vdpa.c
>>> @@ -855,17 +855,9 @@ vdpa_dev_config_fill(struct vdpa_device *vdev, 
>>> struct sk_buff *msg, u32 portid,
>>>   {
>>>       u32 device_id;
>>>       void *hdr;
>>> -    u8 status;
>>>       int err;
>>>         down_read(&vdev->cf_lock);
>>> -    status = vdev->config->get_status(vdev);
>>> -    if (!(status & VIRTIO_CONFIG_S_FEATURES_OK)) {
>>> -        NL_SET_ERR_MSG_MOD(extack, "Features negotiation not 
>>> completed");
>>> -        err = -EAGAIN;
>>> -        goto out;
>>> -    }
>>> -
>>>       hdr = genlmsg_put(msg, portid, seq, &vdpa_nl_family, flags,
>>>                 VDPA_CMD_DEV_CONFIG_GET);
>>>       if (!hdr) {
>>
>

[-- Attachment #1.2: Type: text/html, Size: 10988 bytes --]

[-- Attachment #2: Type: text/plain, Size: 183 bytes --]

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH V5 4/6] vDPA: !FEATURES_OK should not block querying device config space
       [not found]         ` <a488a17a-b716-52aa-cc31-2e51f8f972d2@intel.com>
@ 2022-08-17  6:14             ` Michael S. Tsirkin
  0 siblings, 0 replies; 30+ messages in thread
From: Michael S. Tsirkin @ 2022-08-17  6:14 UTC (permalink / raw)
  To: Zhu, Lingshan
  Cc: Si-Wei Liu, virtualization, netdev, kvm, parav, xieyongji,
	gautam.dawar, jasowang

On Wed, Aug 17, 2022 at 10:11:36AM +0800, Zhu, Lingshan wrote:
> 
> 
> On 8/17/2022 6:48 AM, Si-Wei Liu wrote:
> 
> 
> 
>     On 8/16/2022 1:29 AM, Zhu, Lingshan wrote:
> 
> 
> 
>         On 8/16/2022 3:41 PM, Si-Wei Liu wrote:
> 
>             Hi Michael,
> 
>             I just noticed this patch got pulled to linux-next prematurely
>             without getting consensus on code review, am not sure why. Hope it
>             was just an oversight.
> 
>             Unfortunately this introduced functionality regression to at least
>             two cases so far as I see:
> 
>             1. (bogus) VDPA_ATTR_DEV_NEGOTIATED_FEATURES are inadvertently
>             exposed and displayed in "vdpa dev config show" before feature
>             negotiation is done. Noted the corresponding features name shown in
>             vdpa tool is called "negotiated_features" rather than
>             "driver_features". I see in no way the intended change of the patch
>             should break this user level expectation regardless of any spec
>             requirement. Do you agree on this point?
> 
>         I will post a patch for iptour2, doing:
>         1) if iprout2 does not get driver_features from the kernel, then don't
>         show negotiated features in the command output
> 
>     This won't work as the vdpa userspace tool won't know *when* features are
>     negotiated. There's no guarantee in the kernel to assume 0 will be returned
>     from vendor driver during negotiation. On the other hand, with the supposed
>     change, userspace can't tell if there's really none of features negotiated,
>     or the feature negotiation is over. Before the change the userspace either
>     gets all the attributes when feature negotiation is over, or it gets
>     nothing when it's ongoing, so there was a distinction.This expectation of
>     what "negotiated_features" represents is established from day one, I see no
>     reason the intended kernel change to show other attributes should break
>     userspace behavior and user's expectation.
> 
> User space can only read valid *driver_features* after the features negotiation
> is done, *device_features* does not require the negotiation.
> 
> If you want to prevent random values read from driver_features, here I propose
> a fix: only read driver_features when the negotiation is done, this means to
> check (status & VIRTIO_CONFIG_S_FEATURES_OK) before reading the
> driver_features.
> Sounds good?
> 
> @MST, if this is OK, I can include this change in my next version patch series.
> 
> Thanks,
> Zhu Lingshan

Sorry I don't get it. Is there going to be a new version? Do you want me
to revert this one and then apply a new one? It's ok if yes.


>         2) process and decoding the device features.
> 
> 
>             2. There was also another implicit assumption that is broken by
>             this patch. There could be a vdpa tool query of config via
>             vdpa_dev_net_config_fill()->vdpa_get_config_unlocked() that races
>             with the first vdpa_set_features() call from VMM e.g. QEMU. Since
>             the S_FEATURES_OK blocking condition is removed, if the vdpa tool
>             query occurs earlier than the first set_driver_features() call from
>             VMM, the following code will treat the guest as legacy and then
>             trigger an erroneous vdpa_set_features_unlocked(... , 0) call to
>             the vdpa driver:
> 
>              374         /*
>              375          * Config accesses aren't supposed to trigger before
>             features are set.
>              376          * If it does happen we assume a legacy guest.
>              377          */
>              378         if (!vdev->features_valid)
>              379                 vdpa_set_features_unlocked(vdev, 0);
>              380         ops->get_config(vdev, offset, buf, len);
> 
>             Depending on vendor driver's implementation, L380 may either return
>             invalid config data (or invalid endianness if on BE) or only config
>             fields that are valid in legacy layout. What's more severe is that,
>             vdpa tool query in theory shouldn't affect feature negotiation at
>             all by making confusing calls to the device, but now it is possible
>             with the patch. Fixing this would require more delicate work on the
>             other paths involving the cf_lock reader/write semaphore.
> 
>             Not sure what you plan to do next, post the fixes for both issues
>             and get the community review? Or simply revert the patch in
>             question? Let us know.
> 
>         The spec says:
>         The device MUST allow reading of any device-specific configuration
>         field before FEATURES_OK is set by
>         the driver. This includes fields which are conditional on feature bits,
>         as long as those feature bits are offered
>         by the device.
> 
>         so whether FEATURES_OK should not block reading the device config
>         space. vdpa_get_config_unlocked() will read the features, I don't know
>         why it has a comment:
>                 /*
>                  * Config accesses aren't supposed to trigger before features
>         are set.
>                  * If it does happen we assume a legacy guest.
>                  */
> 
>         This conflicts with the spec.
> 
>         vdpa_get_config_unlocked() checks vdev->features_valid, if not valid,
>         it will set the drivers_features 0, I think this intends to prevent
>         reading random driver_features. This function does not hold any locks,
>         and didn't change anything.
> 
>         So what is the race?
>    
>     You'll see the race if you keep 'vdpa dev config show ...' running in a
>     tight loop while launching a VM with the vDPA device under query.
> 
>     -Siwei
> 
> 
> 
>        
>         Thanks
> 
>        
> 
>             Thanks,
>             -Siwei
> 
> 
>             On 8/12/2022 3:44 AM, Zhu Lingshan wrote:
> 
>                 Users may want to query the config space of a vDPA device,
>                 to choose a appropriate one for a certain guest. This means the
>                 users need to read the config space before FEATURES_OK, and
>                 the existence of config space contents does not depend on
>                 FEATURES_OK.
> 
>                 The spec says:
>                 The device MUST allow reading of any device-specific
>                 configuration
>                 field before FEATURES_OK is set by the driver. This includes
>                 fields which are conditional on feature bits, as long as those
>                 feature bits are offered by the device.
> 
>                 Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
>                 ---
>                   drivers/vdpa/vdpa.c | 8 --------
>                   1 file changed, 8 deletions(-)
> 
>                 diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c
>                 index 6eb3d972d802..bf312d9c59ab 100644
>                 --- a/drivers/vdpa/vdpa.c
>                 +++ b/drivers/vdpa/vdpa.c
>                 @@ -855,17 +855,9 @@ vdpa_dev_config_fill(struct vdpa_device
>                 *vdev, struct sk_buff *msg, u32 portid,
>                   {
>                       u32 device_id;
>                       void *hdr;
>                 -    u8 status;
>                       int err;
>                         down_read(&vdev->cf_lock);
>                 -    status = vdev->config->get_status(vdev);
>                 -    if (!(status & VIRTIO_CONFIG_S_FEATURES_OK)) {
>                 -        NL_SET_ERR_MSG_MOD(extack, "Features negotiation not
>                 completed");
>                 -        err = -EAGAIN;
>                 -        goto out;
>                 -    }
>                 -
>                       hdr = genlmsg_put(msg, portid, seq, &vdpa_nl_family,
>                 flags,
>                                 VDPA_CMD_DEV_CONFIG_GET);
>                       if (!hdr) {
> 
> 
> 
> 
> 
> 
> 
> 


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH V5 4/6] vDPA: !FEATURES_OK should not block querying device config space
@ 2022-08-17  6:14             ` Michael S. Tsirkin
  0 siblings, 0 replies; 30+ messages in thread
From: Michael S. Tsirkin @ 2022-08-17  6:14 UTC (permalink / raw)
  To: Zhu, Lingshan; +Cc: kvm, netdev, virtualization, xieyongji, gautam.dawar

On Wed, Aug 17, 2022 at 10:11:36AM +0800, Zhu, Lingshan wrote:
> 
> 
> On 8/17/2022 6:48 AM, Si-Wei Liu wrote:
> 
> 
> 
>     On 8/16/2022 1:29 AM, Zhu, Lingshan wrote:
> 
> 
> 
>         On 8/16/2022 3:41 PM, Si-Wei Liu wrote:
> 
>             Hi Michael,
> 
>             I just noticed this patch got pulled to linux-next prematurely
>             without getting consensus on code review, am not sure why. Hope it
>             was just an oversight.
> 
>             Unfortunately this introduced functionality regression to at least
>             two cases so far as I see:
> 
>             1. (bogus) VDPA_ATTR_DEV_NEGOTIATED_FEATURES are inadvertently
>             exposed and displayed in "vdpa dev config show" before feature
>             negotiation is done. Noted the corresponding features name shown in
>             vdpa tool is called "negotiated_features" rather than
>             "driver_features". I see in no way the intended change of the patch
>             should break this user level expectation regardless of any spec
>             requirement. Do you agree on this point?
> 
>         I will post a patch for iptour2, doing:
>         1) if iprout2 does not get driver_features from the kernel, then don't
>         show negotiated features in the command output
> 
>     This won't work as the vdpa userspace tool won't know *when* features are
>     negotiated. There's no guarantee in the kernel to assume 0 will be returned
>     from vendor driver during negotiation. On the other hand, with the supposed
>     change, userspace can't tell if there's really none of features negotiated,
>     or the feature negotiation is over. Before the change the userspace either
>     gets all the attributes when feature negotiation is over, or it gets
>     nothing when it's ongoing, so there was a distinction.This expectation of
>     what "negotiated_features" represents is established from day one, I see no
>     reason the intended kernel change to show other attributes should break
>     userspace behavior and user's expectation.
> 
> User space can only read valid *driver_features* after the features negotiation
> is done, *device_features* does not require the negotiation.
> 
> If you want to prevent random values read from driver_features, here I propose
> a fix: only read driver_features when the negotiation is done, this means to
> check (status & VIRTIO_CONFIG_S_FEATURES_OK) before reading the
> driver_features.
> Sounds good?
> 
> @MST, if this is OK, I can include this change in my next version patch series.
> 
> Thanks,
> Zhu Lingshan

Sorry I don't get it. Is there going to be a new version? Do you want me
to revert this one and then apply a new one? It's ok if yes.


>         2) process and decoding the device features.
> 
> 
>             2. There was also another implicit assumption that is broken by
>             this patch. There could be a vdpa tool query of config via
>             vdpa_dev_net_config_fill()->vdpa_get_config_unlocked() that races
>             with the first vdpa_set_features() call from VMM e.g. QEMU. Since
>             the S_FEATURES_OK blocking condition is removed, if the vdpa tool
>             query occurs earlier than the first set_driver_features() call from
>             VMM, the following code will treat the guest as legacy and then
>             trigger an erroneous vdpa_set_features_unlocked(... , 0) call to
>             the vdpa driver:
> 
>              374         /*
>              375          * Config accesses aren't supposed to trigger before
>             features are set.
>              376          * If it does happen we assume a legacy guest.
>              377          */
>              378         if (!vdev->features_valid)
>              379                 vdpa_set_features_unlocked(vdev, 0);
>              380         ops->get_config(vdev, offset, buf, len);
> 
>             Depending on vendor driver's implementation, L380 may either return
>             invalid config data (or invalid endianness if on BE) or only config
>             fields that are valid in legacy layout. What's more severe is that,
>             vdpa tool query in theory shouldn't affect feature negotiation at
>             all by making confusing calls to the device, but now it is possible
>             with the patch. Fixing this would require more delicate work on the
>             other paths involving the cf_lock reader/write semaphore.
> 
>             Not sure what you plan to do next, post the fixes for both issues
>             and get the community review? Or simply revert the patch in
>             question? Let us know.
> 
>         The spec says:
>         The device MUST allow reading of any device-specific configuration
>         field before FEATURES_OK is set by
>         the driver. This includes fields which are conditional on feature bits,
>         as long as those feature bits are offered
>         by the device.
> 
>         so whether FEATURES_OK should not block reading the device config
>         space. vdpa_get_config_unlocked() will read the features, I don't know
>         why it has a comment:
>                 /*
>                  * Config accesses aren't supposed to trigger before features
>         are set.
>                  * If it does happen we assume a legacy guest.
>                  */
> 
>         This conflicts with the spec.
> 
>         vdpa_get_config_unlocked() checks vdev->features_valid, if not valid,
>         it will set the drivers_features 0, I think this intends to prevent
>         reading random driver_features. This function does not hold any locks,
>         and didn't change anything.
> 
>         So what is the race?
>    
>     You'll see the race if you keep 'vdpa dev config show ...' running in a
>     tight loop while launching a VM with the vDPA device under query.
> 
>     -Siwei
> 
> 
> 
>        
>         Thanks
> 
>        
> 
>             Thanks,
>             -Siwei
> 
> 
>             On 8/12/2022 3:44 AM, Zhu Lingshan wrote:
> 
>                 Users may want to query the config space of a vDPA device,
>                 to choose a appropriate one for a certain guest. This means the
>                 users need to read the config space before FEATURES_OK, and
>                 the existence of config space contents does not depend on
>                 FEATURES_OK.
> 
>                 The spec says:
>                 The device MUST allow reading of any device-specific
>                 configuration
>                 field before FEATURES_OK is set by the driver. This includes
>                 fields which are conditional on feature bits, as long as those
>                 feature bits are offered by the device.
> 
>                 Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
>                 ---
>                   drivers/vdpa/vdpa.c | 8 --------
>                   1 file changed, 8 deletions(-)
> 
>                 diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c
>                 index 6eb3d972d802..bf312d9c59ab 100644
>                 --- a/drivers/vdpa/vdpa.c
>                 +++ b/drivers/vdpa/vdpa.c
>                 @@ -855,17 +855,9 @@ vdpa_dev_config_fill(struct vdpa_device
>                 *vdev, struct sk_buff *msg, u32 portid,
>                   {
>                       u32 device_id;
>                       void *hdr;
>                 -    u8 status;
>                       int err;
>                         down_read(&vdev->cf_lock);
>                 -    status = vdev->config->get_status(vdev);
>                 -    if (!(status & VIRTIO_CONFIG_S_FEATURES_OK)) {
>                 -        NL_SET_ERR_MSG_MOD(extack, "Features negotiation not
>                 completed");
>                 -        err = -EAGAIN;
>                 -        goto out;
>                 -    }
>                 -
>                       hdr = genlmsg_put(msg, portid, seq, &vdpa_nl_family,
>                 flags,
>                                 VDPA_CMD_DEV_CONFIG_GET);
>                       if (!hdr) {
> 
> 
> 
> 
> 
> 
> 
> 

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH V5 4/6] vDPA: !FEATURES_OK should not block querying device config space
  2022-08-17  6:14             ` Michael S. Tsirkin
  (?)
@ 2022-08-17  6:23             ` Zhu, Lingshan
  2022-08-17 18:48               ` Si-Wei Liu
  -1 siblings, 1 reply; 30+ messages in thread
From: Zhu, Lingshan @ 2022-08-17  6:23 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Si-Wei Liu, virtualization, netdev, kvm, parav, xieyongji,
	gautam.dawar, jasowang



On 8/17/2022 2:14 PM, Michael S. Tsirkin wrote:
> On Wed, Aug 17, 2022 at 10:11:36AM +0800, Zhu, Lingshan wrote:
>>
>> On 8/17/2022 6:48 AM, Si-Wei Liu wrote:
>>
>>
>>
>>      On 8/16/2022 1:29 AM, Zhu, Lingshan wrote:
>>
>>
>>
>>          On 8/16/2022 3:41 PM, Si-Wei Liu wrote:
>>
>>              Hi Michael,
>>
>>              I just noticed this patch got pulled to linux-next prematurely
>>              without getting consensus on code review, am not sure why. Hope it
>>              was just an oversight.
>>
>>              Unfortunately this introduced functionality regression to at least
>>              two cases so far as I see:
>>
>>              1. (bogus) VDPA_ATTR_DEV_NEGOTIATED_FEATURES are inadvertently
>>              exposed and displayed in "vdpa dev config show" before feature
>>              negotiation is done. Noted the corresponding features name shown in
>>              vdpa tool is called "negotiated_features" rather than
>>              "driver_features". I see in no way the intended change of the patch
>>              should break this user level expectation regardless of any spec
>>              requirement. Do you agree on this point?
>>
>>          I will post a patch for iptour2, doing:
>>          1) if iprout2 does not get driver_features from the kernel, then don't
>>          show negotiated features in the command output
>>
>>      This won't work as the vdpa userspace tool won't know *when* features are
>>      negotiated. There's no guarantee in the kernel to assume 0 will be returned
>>      from vendor driver during negotiation. On the other hand, with the supposed
>>      change, userspace can't tell if there's really none of features negotiated,
>>      or the feature negotiation is over. Before the change the userspace either
>>      gets all the attributes when feature negotiation is over, or it gets
>>      nothing when it's ongoing, so there was a distinction.This expectation of
>>      what "negotiated_features" represents is established from day one, I see no
>>      reason the intended kernel change to show other attributes should break
>>      userspace behavior and user's expectation.
>>
>> User space can only read valid *driver_features* after the features negotiation
>> is done, *device_features* does not require the negotiation.
>>
>> If you want to prevent random values read from driver_features, here I propose
>> a fix: only read driver_features when the negotiation is done, this means to
>> check (status & VIRTIO_CONFIG_S_FEATURES_OK) before reading the
>> driver_features.
>> Sounds good?
>>
>> @MST, if this is OK, I can include this change in my next version patch series.
>>
>> Thanks,
>> Zhu Lingshan
> Sorry I don't get it. Is there going to be a new version? Do you want me
> to revert this one and then apply a new one? It's ok if yes.
Not a new version, it is a new patch, though I still didn't get the race 
condition, but I believe it
is reasonable to block reading the *driver_features* before FEATURES_OK.

So, I added code to check whether _FEATURES_OK is set:

  861         /* only read driver features after the feature negotiation 
is done */
  862         status = vdev->config->get_status(vdev);
  863         if (status & VIRTIO_CONFIG_S_FEATURES_OK) {
  864                 features_driver = 
vdev->config->get_driver_features(vdev);
  865                 if (nla_put_u64_64bit(msg, 
VDPA_ATTR_DEV_NEGOTIATED_FEATURES, features_driver,
  866                                       VDPA_ATTR_PAD))
  867                 return -EMSGSIZE;
  868         }

If this solution looks good, I will add this patch in my V2 series.

Thanks
Zhu Lingshan

>
>
>>          2) process and decoding the device features.
>>
>>
>>              2. There was also another implicit assumption that is broken by
>>              this patch. There could be a vdpa tool query of config via
>>              vdpa_dev_net_config_fill()->vdpa_get_config_unlocked() that races
>>              with the first vdpa_set_features() call from VMM e.g. QEMU. Since
>>              the S_FEATURES_OK blocking condition is removed, if the vdpa tool
>>              query occurs earlier than the first set_driver_features() call from
>>              VMM, the following code will treat the guest as legacy and then
>>              trigger an erroneous vdpa_set_features_unlocked(... , 0) call to
>>              the vdpa driver:
>>
>>               374         /*
>>               375          * Config accesses aren't supposed to trigger before
>>              features are set.
>>               376          * If it does happen we assume a legacy guest.
>>               377          */
>>               378         if (!vdev->features_valid)
>>               379                 vdpa_set_features_unlocked(vdev, 0);
>>               380         ops->get_config(vdev, offset, buf, len);
>>
>>              Depending on vendor driver's implementation, L380 may either return
>>              invalid config data (or invalid endianness if on BE) or only config
>>              fields that are valid in legacy layout. What's more severe is that,
>>              vdpa tool query in theory shouldn't affect feature negotiation at
>>              all by making confusing calls to the device, but now it is possible
>>              with the patch. Fixing this would require more delicate work on the
>>              other paths involving the cf_lock reader/write semaphore.
>>
>>              Not sure what you plan to do next, post the fixes for both issues
>>              and get the community review? Or simply revert the patch in
>>              question? Let us know.
>>
>>          The spec says:
>>          The device MUST allow reading of any device-specific configuration
>>          field before FEATURES_OK is set by
>>          the driver. This includes fields which are conditional on feature bits,
>>          as long as those feature bits are offered
>>          by the device.
>>
>>          so whether FEATURES_OK should not block reading the device config
>>          space. vdpa_get_config_unlocked() will read the features, I don't know
>>          why it has a comment:
>>                  /*
>>                   * Config accesses aren't supposed to trigger before features
>>          are set.
>>                   * If it does happen we assume a legacy guest.
>>                   */
>>
>>          This conflicts with the spec.
>>
>>          vdpa_get_config_unlocked() checks vdev->features_valid, if not valid,
>>          it will set the drivers_features 0, I think this intends to prevent
>>          reading random driver_features. This function does not hold any locks,
>>          and didn't change anything.
>>
>>          So what is the race?
>>     
>>      You'll see the race if you keep 'vdpa dev config show ...' running in a
>>      tight loop while launching a VM with the vDPA device under query.
>>
>>      -Siwei
>>
>>
>>
>>         
>>          Thanks
>>
>>         
>>
>>              Thanks,
>>              -Siwei
>>
>>
>>              On 8/12/2022 3:44 AM, Zhu Lingshan wrote:
>>
>>                  Users may want to query the config space of a vDPA device,
>>                  to choose a appropriate one for a certain guest. This means the
>>                  users need to read the config space before FEATURES_OK, and
>>                  the existence of config space contents does not depend on
>>                  FEATURES_OK.
>>
>>                  The spec says:
>>                  The device MUST allow reading of any device-specific
>>                  configuration
>>                  field before FEATURES_OK is set by the driver. This includes
>>                  fields which are conditional on feature bits, as long as those
>>                  feature bits are offered by the device.
>>
>>                  Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
>>                  ---
>>                    drivers/vdpa/vdpa.c | 8 --------
>>                    1 file changed, 8 deletions(-)
>>
>>                  diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c
>>                  index 6eb3d972d802..bf312d9c59ab 100644
>>                  --- a/drivers/vdpa/vdpa.c
>>                  +++ b/drivers/vdpa/vdpa.c
>>                  @@ -855,17 +855,9 @@ vdpa_dev_config_fill(struct vdpa_device
>>                  *vdev, struct sk_buff *msg, u32 portid,
>>                    {
>>                        u32 device_id;
>>                        void *hdr;
>>                  -    u8 status;
>>                        int err;
>>                          down_read(&vdev->cf_lock);
>>                  -    status = vdev->config->get_status(vdev);
>>                  -    if (!(status & VIRTIO_CONFIG_S_FEATURES_OK)) {
>>                  -        NL_SET_ERR_MSG_MOD(extack, "Features negotiation not
>>                  completed");
>>                  -        err = -EAGAIN;
>>                  -        goto out;
>>                  -    }
>>                  -
>>                        hdr = genlmsg_put(msg, portid, seq, &vdpa_nl_family,
>>                  flags,
>>                                  VDPA_CMD_DEV_CONFIG_GET);
>>                        if (!hdr) {
>>
>>
>>
>>
>>
>>
>>
>>


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH V5 4/6] vDPA: !FEATURES_OK should not block querying device config space
  2022-08-17  6:23             ` Zhu, Lingshan
@ 2022-08-17 18:48               ` Si-Wei Liu
       [not found]                 ` <b2430d3b-4ca2-a19c-da39-da91da732c02@intel.com>
  0 siblings, 1 reply; 30+ messages in thread
From: Si-Wei Liu @ 2022-08-17 18:48 UTC (permalink / raw)
  To: Zhu, Lingshan, Michael S. Tsirkin
  Cc: kvm, netdev, virtualization, xieyongji, gautam.dawar


[-- Attachment #1.1: Type: text/plain, Size: 11917 bytes --]



On 8/16/2022 11:23 PM, Zhu, Lingshan wrote:
>
>
> On 8/17/2022 2:14 PM, Michael S. Tsirkin wrote:
>> On Wed, Aug 17, 2022 at 10:11:36AM +0800, Zhu, Lingshan wrote:
>>>
>>> On 8/17/2022 6:48 AM, Si-Wei Liu wrote:
>>>
>>>
>>>
>>>      On 8/16/2022 1:29 AM, Zhu, Lingshan wrote:
>>>
>>>
>>>
>>>          On 8/16/2022 3:41 PM, Si-Wei Liu wrote:
>>>
>>>              Hi Michael,
>>>
>>>              I just noticed this patch got pulled to linux-next 
>>> prematurely
>>>              without getting consensus on code review, am not sure 
>>> why. Hope it
>>>              was just an oversight.
>>>
>>>              Unfortunately this introduced functionality regression 
>>> to at least
>>>              two cases so far as I see:
>>>
>>>              1. (bogus) VDPA_ATTR_DEV_NEGOTIATED_FEATURES are 
>>> inadvertently
>>>              exposed and displayed in "vdpa dev config show" before 
>>> feature
>>>              negotiation is done. Noted the corresponding features 
>>> name shown in
>>>              vdpa tool is called "negotiated_features" rather than
>>>              "driver_features". I see in no way the intended change 
>>> of the patch
>>>              should break this user level expectation regardless of 
>>> any spec
>>>              requirement. Do you agree on this point?
>>>
>>>          I will post a patch for iptour2, doing:
>>>          1) if iprout2 does not get driver_features from the kernel, 
>>> then don't
>>>          show negotiated features in the command output
>>>
>>>      This won't work as the vdpa userspace tool won't know *when* 
>>> features are
>>>      negotiated. There's no guarantee in the kernel to assume 0 will 
>>> be returned
>>>      from vendor driver during negotiation. On the other hand, with 
>>> the supposed
>>>      change, userspace can't tell if there's really none of features 
>>> negotiated,
>>>      or the feature negotiation is over. Before the change the 
>>> userspace either
>>>      gets all the attributes when feature negotiation is over, or it 
>>> gets
>>>      nothing when it's ongoing, so there was a distinction.This 
>>> expectation of
>>>      what "negotiated_features" represents is established from day 
>>> one, I see no
>>>      reason the intended kernel change to show other attributes 
>>> should break
>>>      userspace behavior and user's expectation.
>>>
>>> User space can only read valid *driver_features* after the features 
>>> negotiation
>>> is done, *device_features* does not require the negotiation.
>>>
>>> If you want to prevent random values read from driver_features, here 
>>> I propose
>>> a fix: only read driver_features when the negotiation is done, this 
>>> means to
>>> check (status & VIRTIO_CONFIG_S_FEATURES_OK) before reading the
>>> driver_features.
>>> Sounds good?
>>>
>>> @MST, if this is OK, I can include this change in my next version 
>>> patch series.
>>>
>>> Thanks,
>>> Zhu Lingshan
>> Sorry I don't get it. Is there going to be a new version? Do you want me
>> to revert this one and then apply a new one? It's ok if yes.
> Not a new version, it is a new patch, though I still didn't get the 
> race condition, but I believe it
> is reasonable to block reading the *driver_features* before FEATURES_OK.
>
> So, I added code to check whether _FEATURES_OK is set:
>
>  861         /* only read driver features after the feature 
> negotiation is done */
>  862         status = vdev->config->get_status(vdev);
>  863         if (status & VIRTIO_CONFIG_S_FEATURES_OK) {
>  864                 features_driver = 
> vdev->config->get_driver_features(vdev);
>  865                 if (nla_put_u64_64bit(msg, 
> VDPA_ATTR_DEV_NEGOTIATED_FEATURES, features_driver,
>  866                                       VDPA_ATTR_PAD))
>  867                 return -EMSGSIZE;
>  868         }
>
> If this solution looks good, I will add this patch in my V2 series.
This solves the 1st issue in my report, but without a fix for the 2nd 
issue ( vdpa_dev_net_config_fill and vdpa_set_features race) addressed I 
don't think it covers all incurred regressions.

I've replied the way to reproduce the race. For me it's very obvious to 
see in my setup.
>
> > So what is the race?
> You'll see the race if you keep 'vdpa dev config show ...' running in 
> a tight loop while launching a VM with the vDPA device under query.
>
-Siwei

>
> Thanks
> Zhu Lingshan
>
>>
>>
>>>          2) process and decoding the device features.
>>>
>>>
>>>              2. There was also another implicit assumption that is 
>>> broken by
>>>              this patch. There could be a vdpa tool query of config via
>>> vdpa_dev_net_config_fill()->vdpa_get_config_unlocked() that races
>>>              with the first vdpa_set_features() call from VMM e.g. 
>>> QEMU. Since
>>>              the S_FEATURES_OK blocking condition is removed, if the 
>>> vdpa tool
>>>              query occurs earlier than the first 
>>> set_driver_features() call from
>>>              VMM, the following code will treat the guest as legacy 
>>> and then
>>>              trigger an erroneous vdpa_set_features_unlocked(... , 
>>> 0) call to
>>>              the vdpa driver:
>>>
>>>               374         /*
>>>               375          * Config accesses aren't supposed to 
>>> trigger before
>>>              features are set.
>>>               376          * If it does happen we assume a legacy 
>>> guest.
>>>               377          */
>>>               378         if (!vdev->features_valid)
>>>               379 vdpa_set_features_unlocked(vdev, 0);
>>>               380         ops->get_config(vdev, offset, buf, len);
>>>
>>>              Depending on vendor driver's implementation, L380 may 
>>> either return
>>>              invalid config data (or invalid endianness if on BE) or 
>>> only config
>>>              fields that are valid in legacy layout. What's more 
>>> severe is that,
>>>              vdpa tool query in theory shouldn't affect feature 
>>> negotiation at
>>>              all by making confusing calls to the device, but now it 
>>> is possible
>>>              with the patch. Fixing this would require more delicate 
>>> work on the
>>>              other paths involving the cf_lock reader/write semaphore.
>>>
>>>              Not sure what you plan to do next, post the fixes for 
>>> both issues
>>>              and get the community review? Or simply revert the 
>>> patch in
>>>              question? Let us know.
>>>
>>>          The spec says:
>>>          The device MUST allow reading of any device-specific 
>>> configuration
>>>          field before FEATURES_OK is set by
>>>          the driver. This includes fields which are conditional on 
>>> feature bits,
>>>          as long as those feature bits are offered
>>>          by the device.
>>>
>>>          so whether FEATURES_OK should not block reading the device 
>>> config
>>>          space. vdpa_get_config_unlocked() will read the features, I 
>>> don't know
>>>          why it has a comment:
>>>                  /*
>>>                   * Config accesses aren't supposed to trigger 
>>> before features
>>>          are set.
>>>                   * If it does happen we assume a legacy guest.
>>>                   */
>>>
>>>          This conflicts with the spec.
>>>
>>>          vdpa_get_config_unlocked() checks vdev->features_valid, if 
>>> not valid,
>>>          it will set the drivers_features 0, I think this intends to 
>>> prevent
>>>          reading random driver_features. This function does not hold 
>>> any locks,
>>>          and didn't change anything.
>>>
>>>          So what is the race?
>>>          You'll see the race if you keep 'vdpa dev config show ...' 
>>> running in a
>>>      tight loop while launching a VM with the vDPA device under query.
>>>
>>>      -Siwei
>>>
>>>
>>>
>>>                  Thanks
>>>
>>>
>>>              Thanks,
>>>              -Siwei
>>>
>>>
>>>              On 8/12/2022 3:44 AM, Zhu Lingshan wrote:
>>>
>>>                  Users may want to query the config space of a vDPA 
>>> device,
>>>                  to choose a appropriate one for a certain guest. 
>>> This means the
>>>                  users need to read the config space before 
>>> FEATURES_OK, and
>>>                  the existence of config space contents does not 
>>> depend on
>>>                  FEATURES_OK.
>>>
>>>                  The spec says:
>>>                  The device MUST allow reading of any device-specific
>>>                  configuration
>>>                  field before FEATURES_OK is set by the driver. This 
>>> includes
>>>                  fields which are conditional on feature bits, as 
>>> long as those
>>>                  feature bits are offered by the device.
>>>
>>>                  Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
>>>                  ---
>>>                    drivers/vdpa/vdpa.c | 8 --------
>>>                    1 file changed, 8 deletions(-)
>>>
>>>                  diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c
>>>                  index 6eb3d972d802..bf312d9c59ab 100644
>>>                  --- a/drivers/vdpa/vdpa.c
>>>                  +++ b/drivers/vdpa/vdpa.c
>>>                  @@ -855,17 +855,9 @@ vdpa_dev_config_fill(struct 
>>> vdpa_device
>>>                  *vdev, struct sk_buff *msg, u32 portid,
>>>                    {
>>>                        u32 device_id;
>>>                        void *hdr;
>>>                  -    u8 status;
>>>                        int err;
>>>                          down_read(&vdev->cf_lock);
>>>                  -    status = vdev->config->get_status(vdev);
>>>                  -    if (!(status & VIRTIO_CONFIG_S_FEATURES_OK)) {
>>>                  -        NL_SET_ERR_MSG_MOD(extack, "Features 
>>> negotiation not
>>>                  completed");
>>>                  -        err = -EAGAIN;
>>>                  -        goto out;
>>>                  -    }
>>>                  -
>>>                        hdr = genlmsg_put(msg, portid, seq, 
>>> &vdpa_nl_family,
>>>                  flags,
>>>                                  VDPA_CMD_DEV_CONFIG_GET);
>>>                        if (!hdr) {
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>

[-- Attachment #1.2: Type: text/html, Size: 24107 bytes --]

[-- Attachment #2: Type: text/plain, Size: 183 bytes --]

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH V5 4/6] vDPA: !FEATURES_OK should not block querying device config space
       [not found]                 ` <b2430d3b-4ca2-a19c-da39-da91da732c02@intel.com>
@ 2022-08-19  0:04                   ` Si-Wei Liu
  0 siblings, 0 replies; 30+ messages in thread
From: Si-Wei Liu @ 2022-08-19  0:04 UTC (permalink / raw)
  To: Zhu, Lingshan, Michael S. Tsirkin
  Cc: kvm, netdev, virtualization, xieyongji, gautam.dawar


[-- Attachment #1.1: Type: text/plain, Size: 13164 bytes --]



On 8/17/2022 11:52 PM, Zhu, Lingshan wrote:
>
>
> On 8/18/2022 2:48 AM, Si-Wei Liu wrote:
>>
>>
>> On 8/16/2022 11:23 PM, Zhu, Lingshan wrote:
>>>
>>>
>>> On 8/17/2022 2:14 PM, Michael S. Tsirkin wrote:
>>>> On Wed, Aug 17, 2022 at 10:11:36AM +0800, Zhu, Lingshan wrote:
>>>>>
>>>>> On 8/17/2022 6:48 AM, Si-Wei Liu wrote:
>>>>>
>>>>>
>>>>>
>>>>>      On 8/16/2022 1:29 AM, Zhu, Lingshan wrote:
>>>>>
>>>>>
>>>>>
>>>>>          On 8/16/2022 3:41 PM, Si-Wei Liu wrote:
>>>>>
>>>>>              Hi Michael,
>>>>>
>>>>>              I just noticed this patch got pulled to linux-next 
>>>>> prematurely
>>>>>              without getting consensus on code review, am not sure 
>>>>> why. Hope it
>>>>>              was just an oversight.
>>>>>
>>>>>              Unfortunately this introduced functionality 
>>>>> regression to at least
>>>>>              two cases so far as I see:
>>>>>
>>>>>              1. (bogus) VDPA_ATTR_DEV_NEGOTIATED_FEATURES are 
>>>>> inadvertently
>>>>>              exposed and displayed in "vdpa dev config show" 
>>>>> before feature
>>>>>              negotiation is done. Noted the corresponding features 
>>>>> name shown in
>>>>>              vdpa tool is called "negotiated_features" rather than
>>>>>              "driver_features". I see in no way the intended 
>>>>> change of the patch
>>>>>              should break this user level expectation regardless 
>>>>> of any spec
>>>>>              requirement. Do you agree on this point?
>>>>>
>>>>>          I will post a patch for iptour2, doing:
>>>>>          1) if iprout2 does not get driver_features from the 
>>>>> kernel, then don't
>>>>>          show negotiated features in the command output
>>>>>
>>>>>      This won't work as the vdpa userspace tool won't know *when* 
>>>>> features are
>>>>>      negotiated. There's no guarantee in the kernel to assume 0 
>>>>> will be returned
>>>>>      from vendor driver during negotiation. On the other hand, 
>>>>> with the supposed
>>>>>      change, userspace can't tell if there's really none of 
>>>>> features negotiated,
>>>>>      or the feature negotiation is over. Before the change the 
>>>>> userspace either
>>>>>      gets all the attributes when feature negotiation is over, or 
>>>>> it gets
>>>>>      nothing when it's ongoing, so there was a distinction.This 
>>>>> expectation of
>>>>>      what "negotiated_features" represents is established from day 
>>>>> one, I see no
>>>>>      reason the intended kernel change to show other attributes 
>>>>> should break
>>>>>      userspace behavior and user's expectation.
>>>>>
>>>>> User space can only read valid *driver_features* after the 
>>>>> features negotiation
>>>>> is done, *device_features* does not require the negotiation.
>>>>>
>>>>> If you want to prevent random values read from driver_features, 
>>>>> here I propose
>>>>> a fix: only read driver_features when the negotiation is done, 
>>>>> this means to
>>>>> check (status & VIRTIO_CONFIG_S_FEATURES_OK) before reading the
>>>>> driver_features.
>>>>> Sounds good?
>>>>>
>>>>> @MST, if this is OK, I can include this change in my next version 
>>>>> patch series.
>>>>>
>>>>> Thanks,
>>>>> Zhu Lingshan
>>>> Sorry I don't get it. Is there going to be a new version? Do you 
>>>> want me
>>>> to revert this one and then apply a new one? It's ok if yes.
>>> Not a new version, it is a new patch, though I still didn't get the 
>>> race condition, but I believe it
>>> is reasonable to block reading the *driver_features* before 
>>> FEATURES_OK.
>>>
>>> So, I added code to check whether _FEATURES_OK is set:
>>>
>>>  861         /* only read driver features after the feature 
>>> negotiation is done */
>>>  862         status = vdev->config->get_status(vdev);
>>>  863         if (status & VIRTIO_CONFIG_S_FEATURES_OK) {
>>>  864                 features_driver = 
>>> vdev->config->get_driver_features(vdev);
>>>  865                 if (nla_put_u64_64bit(msg, 
>>> VDPA_ATTR_DEV_NEGOTIATED_FEATURES, features_driver,
>>>  866                                       VDPA_ATTR_PAD))
>>>  867                 return -EMSGSIZE;
>>>  868         }
>>>
>>> If this solution looks good, I will add this patch in my V2 series.
>> This solves the 1st issue in my report, but without a fix for the 2nd 
>> issue ( vdpa_dev_net_config_fill and vdpa_set_features race) 
>> addressed I don't think it covers all incurred regressions.
>>
>> I've replied the way to reproduce the race. For me it's very obvious 
>> to see in my setup.
> Though I still did not see any errors in my run, but I guess you mean 
> the race condition in set_features(), right?
Yes.
>
> The spec says: The device MUST allow reading of any device-specific 
> configuration field before FEATURES_OK is set by the driver.
>
> So there is no need to check whether driver_features is set in 
> vdpa_get_config_unlocked(). We should remove the code checks
> feature_valid and the code set_features to zero. This conflicts with 
> the spec. And so no race conditions.
I don't disagree with the removal, you can try once more. This proposal 
had been attempted but rejected.

-Siwei



>
> Thanks,
> Zhu Lingshan
>>>
>>> > So what is the race?
>>> You'll see the race if you keep 'vdpa dev config show ...' running 
>>> in a tight loop while launching a VM with the vDPA device under query.
>>>
>> -Siwei
>>
>>>
>>> Thanks
>>> Zhu Lingshan
>>>
>>>>
>>>>
>>>>>          2) process and decoding the device features.
>>>>>
>>>>>
>>>>>              2. There was also another implicit assumption that is 
>>>>> broken by
>>>>>              this patch. There could be a vdpa tool query of 
>>>>> config via
>>>>> vdpa_dev_net_config_fill()->vdpa_get_config_unlocked() that races
>>>>>              with the first vdpa_set_features() call from VMM e.g. 
>>>>> QEMU. Since
>>>>>              the S_FEATURES_OK blocking condition is removed, if 
>>>>> the vdpa tool
>>>>>              query occurs earlier than the first 
>>>>> set_driver_features() call from
>>>>>              VMM, the following code will treat the guest as 
>>>>> legacy and then
>>>>>              trigger an erroneous vdpa_set_features_unlocked(... , 
>>>>> 0) call to
>>>>>              the vdpa driver:
>>>>>
>>>>>               374         /*
>>>>>               375          * Config accesses aren't supposed to 
>>>>> trigger before
>>>>>              features are set.
>>>>>               376          * If it does happen we assume a legacy 
>>>>> guest.
>>>>>               377          */
>>>>>               378         if (!vdev->features_valid)
>>>>>               379 vdpa_set_features_unlocked(vdev, 0);
>>>>>               380         ops->get_config(vdev, offset, buf, len);
>>>>>
>>>>>              Depending on vendor driver's implementation, L380 may 
>>>>> either return
>>>>>              invalid config data (or invalid endianness if on BE) 
>>>>> or only config
>>>>>              fields that are valid in legacy layout. What's more 
>>>>> severe is that,
>>>>>              vdpa tool query in theory shouldn't affect feature 
>>>>> negotiation at
>>>>>              all by making confusing calls to the device, but now 
>>>>> it is possible
>>>>>              with the patch. Fixing this would require more 
>>>>> delicate work on the
>>>>>              other paths involving the cf_lock reader/write 
>>>>> semaphore.
>>>>>
>>>>>              Not sure what you plan to do next, post the fixes for 
>>>>> both issues
>>>>>              and get the community review? Or simply revert the 
>>>>> patch in
>>>>>              question? Let us know.
>>>>>
>>>>>          The spec says:
>>>>>          The device MUST allow reading of any device-specific 
>>>>> configuration
>>>>>          field before FEATURES_OK is set by
>>>>>          the driver. This includes fields which are conditional on 
>>>>> feature bits,
>>>>>          as long as those feature bits are offered
>>>>>          by the device.
>>>>>
>>>>>          so whether FEATURES_OK should not block reading the 
>>>>> device config
>>>>>          space. vdpa_get_config_unlocked() will read the features, 
>>>>> I don't know
>>>>>          why it has a comment:
>>>>>                  /*
>>>>>                   * Config accesses aren't supposed to trigger 
>>>>> before features
>>>>>          are set.
>>>>>                   * If it does happen we assume a legacy guest.
>>>>>                   */
>>>>>
>>>>>          This conflicts with the spec.
>>>>>
>>>>>          vdpa_get_config_unlocked() checks vdev->features_valid, 
>>>>> if not valid,
>>>>>          it will set the drivers_features 0, I think this intends 
>>>>> to prevent
>>>>>          reading random driver_features. This function does not 
>>>>> hold any locks,
>>>>>          and didn't change anything.
>>>>>
>>>>>          So what is the race?
>>>>>          You'll see the race if you keep 'vdpa dev config show 
>>>>> ...' running in a
>>>>>      tight loop while launching a VM with the vDPA device under 
>>>>> query.
>>>>>
>>>>>      -Siwei
>>>>>
>>>>>
>>>>>
>>>>>                  Thanks
>>>>>
>>>>>
>>>>>              Thanks,
>>>>>              -Siwei
>>>>>
>>>>>
>>>>>              On 8/12/2022 3:44 AM, Zhu Lingshan wrote:
>>>>>
>>>>>                  Users may want to query the config space of a 
>>>>> vDPA device,
>>>>>                  to choose a appropriate one for a certain guest. 
>>>>> This means the
>>>>>                  users need to read the config space before 
>>>>> FEATURES_OK, and
>>>>>                  the existence of config space contents does not 
>>>>> depend on
>>>>>                  FEATURES_OK.
>>>>>
>>>>>                  The spec says:
>>>>>                  The device MUST allow reading of any device-specific
>>>>>                  configuration
>>>>>                  field before FEATURES_OK is set by the driver. 
>>>>> This includes
>>>>>                  fields which are conditional on feature bits, as 
>>>>> long as those
>>>>>                  feature bits are offered by the device.
>>>>>
>>>>>                  Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
>>>>>                  ---
>>>>>                    drivers/vdpa/vdpa.c | 8 --------
>>>>>                    1 file changed, 8 deletions(-)
>>>>>
>>>>>                  diff --git a/drivers/vdpa/vdpa.c 
>>>>> b/drivers/vdpa/vdpa.c
>>>>>                  index 6eb3d972d802..bf312d9c59ab 100644
>>>>>                  --- a/drivers/vdpa/vdpa.c
>>>>>                  +++ b/drivers/vdpa/vdpa.c
>>>>>                  @@ -855,17 +855,9 @@ vdpa_dev_config_fill(struct 
>>>>> vdpa_device
>>>>>                  *vdev, struct sk_buff *msg, u32 portid,
>>>>>                    {
>>>>>                        u32 device_id;
>>>>>                        void *hdr;
>>>>>                  -    u8 status;
>>>>>                        int err;
>>>>> down_read(&vdev->cf_lock);
>>>>>                  -    status = vdev->config->get_status(vdev);
>>>>>                  -    if (!(status & VIRTIO_CONFIG_S_FEATURES_OK)) {
>>>>>                  -        NL_SET_ERR_MSG_MOD(extack, "Features 
>>>>> negotiation not
>>>>>                  completed");
>>>>>                  -        err = -EAGAIN;
>>>>>                  -        goto out;
>>>>>                  -    }
>>>>>                  -
>>>>>                        hdr = genlmsg_put(msg, portid, seq, 
>>>>> &vdpa_nl_family,
>>>>>                  flags,
>>>>> VDPA_CMD_DEV_CONFIG_GET);
>>>>>                        if (!hdr) {
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>
>>
>

[-- Attachment #1.2: Type: text/html, Size: 28638 bytes --]

[-- Attachment #2: Type: text/plain, Size: 183 bytes --]

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 30+ messages in thread

end of thread, other threads:[~2022-08-19  0:05 UTC | newest]

Thread overview: 30+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-08-12 10:44 [PATCH V5 0/6] ifcvf/vDPA: support query device config space through netlink Zhu Lingshan
2022-08-12 10:44 ` [PATCH V5 1/6] vDPA/ifcvf: get_config_size should return a value no greater than dev implementation Zhu Lingshan
2022-08-12 10:44 ` [PATCH V5 2/6] vDPA/ifcvf: support userspace to query features and MQ of a management device Zhu Lingshan
2022-08-12 10:44 ` [PATCH V5 3/6] vDPA: allow userspace to query features of a vDPA device Zhu Lingshan
2022-08-12 10:44 ` [PATCH V5 4/6] vDPA: !FEATURES_OK should not block querying device config space Zhu Lingshan
2022-08-16  7:41   ` Si-Wei Liu
2022-08-16  7:41     ` Si-Wei Liu
2022-08-16  8:23     ` Michael S. Tsirkin
2022-08-16  8:23       ` Michael S. Tsirkin
     [not found]     ` <f2864c96-cddd-129e-7dd8-a3743fe7e0d0@intel.com>
2022-08-16  8:41       ` Michael S. Tsirkin
2022-08-16  8:41         ` Michael S. Tsirkin
2022-08-16  8:46         ` Zhu, Lingshan
2022-08-16 22:48       ` Si-Wei Liu
     [not found]         ` <a488a17a-b716-52aa-cc31-2e51f8f972d2@intel.com>
2022-08-17  6:14           ` Michael S. Tsirkin
2022-08-17  6:14             ` Michael S. Tsirkin
2022-08-17  6:23             ` Zhu, Lingshan
2022-08-17 18:48               ` Si-Wei Liu
     [not found]                 ` <b2430d3b-4ca2-a19c-da39-da91da732c02@intel.com>
2022-08-19  0:04                   ` Si-Wei Liu
2022-08-16 21:13     ` Michael S. Tsirkin
2022-08-16 21:13       ` Michael S. Tsirkin
2022-08-16 22:09       ` Si-Wei Liu
2022-08-16 22:09         ` Si-Wei Liu
2022-08-12 10:44 ` [PATCH V5 5/6] vDPA: Conditionally read fields in virtio-net dev " Zhu Lingshan
2022-08-12 10:45 ` [PATCH V5 6/6] fix 'cast to restricted le16' warnings in vdpa.c Zhu Lingshan
2022-08-12 11:14 ` [PATCH V5 0/6] ifcvf/vDPA: support query device config space through netlink Michael S. Tsirkin
2022-08-12 11:14   ` Michael S. Tsirkin
2022-08-12 11:17   ` Michael S. Tsirkin
2022-08-12 11:17     ` Michael S. Tsirkin
2022-08-12 11:41     ` Zhu, Lingshan
2022-08-15  9:36       ` Zhu, Lingshan

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.