* [PATCH linux-next v3 0/6] vdpa: enable user to set mac, mtu
@ 2021-06-16 19:11 Parav Pandit
  2021-06-16 19:11 ` [PATCH linux-next v3 1/6] vdpa: Introduce and use vdpa device get, set config helpers Parav Pandit
                   ` (6 more replies)
  0 siblings, 7 replies; 62+ messages in thread
From: Parav Pandit @ 2021-06-16 19:11 UTC (permalink / raw)
  To: virtualization; +Cc: elic, mst

Currently, a user cannot set the mac address and mtu of a vdpa device.
This patchset enables users to set the mac address and mtu of a vdpa
device once the device is created.
If the vendor driver supports such configuration, the user can set it;
otherwise the user gets an unsupported error.

The vdpa mac address and mtu are device configuration layout fields.
To keep the interface generic enough for multiple types of vdpa devices,
mac address and mtu setting is implemented as configuration layout
config knobs.
This allows a similar config layout to be used for other virtio devices.
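
As a rough userspace illustration of the "generic interface, per-class
handler" idea above, the sketch below dispatches a set request on the
virtio device ID and reports "not supported" for classes without a
handler. All sketch_* names are hypothetical and are not part of this
series; the minimum MTU of 68 mirrors the netlink policy added in patch 3.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Device IDs from the virtio specification (network = 1, block = 2). */
enum { SKETCH_ID_NET = 1, SKETCH_ID_BLOCK = 2 };

/* Hypothetical request: the net field the user supplied, if any. */
struct sketch_net_req {
	uint16_t mtu;
	int mtu_valid;
};

static int sketch_net_config_set(const struct sketch_net_req *req)
{
	/* Reject an MTU below 68, as the policy in this series does. */
	if (req->mtu_valid && req->mtu < 68)
		return -EINVAL;
	return 0;
}

/* One generic entry point; the device class selects the handler. */
int sketch_dev_config_set(uint32_t device_id, const struct sketch_net_req *req)
{
	assert(req);
	switch (device_id) {
	case SKETCH_ID_NET:
		return sketch_net_config_set(req);
	default:
		return -EOPNOTSUPP;	/* no handler for this class yet */
	}
}
```

Supporting another virtio class would then only mean adding one more
case, which is presumably the point of keeping the knobs tied to the
config layout rather than to a net-only interface.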

Configuration layout fields are set after the device is created.
This enables the user to change such fields at a later point without
destroying and recreating the device for the new config.

An example of querying and setting config layout fields for the
vdpa_sim_net driver:

$ vdpa mgmtdev show
vdpasim_net:
  supported_classes net

Add the device:
$ vdpa dev add name bar mgmtdev vdpasim_net

Configure mac address and mtu:
$ vdpa dev config set bar mac 00:11:22:33:44:55 mtu 9000

In the above command, the mac address or the mtu can also be set on its own.

View the config after setting:
$ vdpa dev config show
bar: mac 00:11:22:33:44:55 link up link_announce false mtu 9000 speed 0 duplex 0

Patch summary:
Patch-1 introduces and uses helpers for get/set config area
Patch-2 implements query of device config layout
Patch-3 enables user to set mac and mtu in config space
Patch-4 vdpa_sim_net implements get and set of config layout
Patch-5 mlx5 vdpa driver supports user provided mac config
Patch-6 mlx5 vdpa driver uses user provided mac during rx flow steering

changelog:
v2->v3:
 - dropped patches which are merged
 - simplified code to handle non transitional devices

v1->v2:
 - new patches to fix kdoc comment to add new kdoc section
 - new patch to have synchronized access to features and config space
 - read whole net config layout instead of individual fields
 - added error extack for unmanaged vdpa device
 - fixed several endianness issues
 - introduced vdpa device ops for get config which is synchronized
   with other get/set features ops and config ops
 - fixed mtu range checking for max
 - using NLA_POLICY_ETH_ADDR
 - set config moved to device ops instead of mgmtdev ops
 - merged build and set to single routine
 - ensuring that user has NET_ADMIN capability for configuring network
   attributes
 - using updated interface and callbacks for get/set config
 - following new api for config get/set for mgmt tool in mlx5 vdpa
   driver
 - fixes for accessing right SF dma device and bar address
 - fix for mtu calculation
 - fix for bit access in features
 - fix for index restore with suspend/resume operation


Eli Cohen (2):
  vdpa/mlx5: Support configuration of MAC
  vdpa/mlx5: Forward only packets with allowed MAC address

Parav Pandit (4):
  vdpa: Introduce and use vdpa device get, set config helpers
  vdpa: Introduce query of device config layout
  vdpa: Enable user to set mac and mtu of vdpa device
  vdpa_sim_net: Enable user to set mac address and mtu

 drivers/vdpa/mlx5/net/mlx5_vnet.c    | 101 ++++++--
 drivers/vdpa/vdpa.c                  | 337 +++++++++++++++++++++++++++
 drivers/vdpa/vdpa_sim/vdpa_sim.c     |  13 ++
 drivers/vdpa/vdpa_sim/vdpa_sim.h     |   2 +
 drivers/vdpa/vdpa_sim/vdpa_sim_net.c |  34 +--
 drivers/vhost/vdpa.c                 |   3 +-
 include/linux/vdpa.h                 |  38 +--
 include/uapi/linux/vdpa.h            |  12 +
 8 files changed, 490 insertions(+), 50 deletions(-)

-- 
2.26.2

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


* [PATCH linux-next v3 1/6] vdpa: Introduce and use vdpa device get, set config helpers
  2021-06-16 19:11 [PATCH linux-next v3 0/6] vdpa: enable user to set mac, mtu Parav Pandit
@ 2021-06-16 19:11 ` Parav Pandit
  2021-06-22  7:08   ` Jason Wang
  2021-06-16 19:11 ` [PATCH linux-next v3 2/6] vdpa: Introduce query of device config layout Parav Pandit
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 62+ messages in thread
From: Parav Pandit @ 2021-06-16 19:11 UTC (permalink / raw)
  To: virtualization; +Cc: elic, mst

Subsequent patches enable get and set of configuration either
via the management device or via the vdpa device's config ops.

This requires synchronization between multiple callers of the get and
set config callbacks. Feature setting also influences the layout and
endianness of the configuration fields.

To avoid exposing synchronization primitives to callers, introduce
helpers for getting and setting the configuration and use them.

Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Eli Cohen <elic@nvidia.com>
---
changelog:
v1->v2
 - new patch to have synchronized access to features and config space
---
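
For readers who want the pattern in isolation: a minimal userspace
sketch, assuming a pthread mutex in place of the kernel mutex, of get/set
helpers that keep the lock private to the helpers. The sketch_* names and
the 16-byte config size are invented for illustration. Note that this
patch only introduces the helpers; the lock itself (cf_mutex) is added
in patch 2.

```c
#include <assert.h>
#include <pthread.h>
#include <string.h>

#define SKETCH_CFG_SIZE 16

/* Hypothetical device: config bytes plus the lock that guards them. */
struct sketch_dev {
	pthread_mutex_t cf_mutex;
	unsigned char config[SKETCH_CFG_SIZE];
};

void sketch_dev_init(struct sketch_dev *d)
{
	pthread_mutex_init(&d->cf_mutex, NULL);
	memset(d->config, 0, sizeof(d->config));
}

/* Callers never touch cf_mutex; the helpers take and release it. */
void sketch_get_config(struct sketch_dev *d, unsigned int off,
		       void *buf, unsigned int len)
{
	assert(off <= SKETCH_CFG_SIZE && len <= SKETCH_CFG_SIZE - off);
	pthread_mutex_lock(&d->cf_mutex);
	memcpy(buf, d->config + off, len);
	pthread_mutex_unlock(&d->cf_mutex);
}

void sketch_set_config(struct sketch_dev *d, unsigned int off,
		       const void *buf, unsigned int len)
{
	assert(off <= SKETCH_CFG_SIZE && len <= SKETCH_CFG_SIZE - off);
	pthread_mutex_lock(&d->cf_mutex);
	memcpy(d->config + off, buf, len);
	pthread_mutex_unlock(&d->cf_mutex);
}
```

Because every access goes through the two helpers, a later change of the
locking scheme would not need to touch any caller.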
 drivers/vdpa/vdpa.c  | 36 ++++++++++++++++++++++++++++++++++++
 drivers/vhost/vdpa.c |  3 +--
 include/linux/vdpa.h | 18 ++++--------------
 3 files changed, 41 insertions(+), 16 deletions(-)

diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c
index bb3f1d1f0422..bc44cdc34114 100644
--- a/drivers/vdpa/vdpa.c
+++ b/drivers/vdpa/vdpa.c
@@ -284,6 +284,42 @@ void vdpa_mgmtdev_unregister(struct vdpa_mgmt_dev *mdev)
 }
 EXPORT_SYMBOL_GPL(vdpa_mgmtdev_unregister);
 
+/**
+ * vdpa_get_config - Get one or more device configuration fields.
+ * @vdev: vdpa device to operate on
+ * @offset: starting byte offset of the field
+ * @buf: buffer pointer to read to
+ * @len: length of the configuration fields in bytes
+ */
+void vdpa_get_config(struct vdpa_device *vdev, unsigned int offset,
+		     void *buf, unsigned int len)
+{
+	const struct vdpa_config_ops *ops = vdev->config;
+
+	/*
+	 * Config accesses aren't supposed to trigger before features are set.
+	 * If it does happen we assume a legacy guest.
+	 */
+	if (!vdev->features_valid)
+		vdpa_set_features(vdev, 0);
+	ops->get_config(vdev, offset, buf, len);
+}
+EXPORT_SYMBOL_GPL(vdpa_get_config);
+
+/**
+ * vdpa_set_config - Set one or more device configuration fields.
+ * @vdev: vdpa device to operate on
+ * @offset: starting byte offset of the field
+ * @buf: buffer pointer to read from
+ * @length: length of the configuration fields in bytes
+ */
+void vdpa_set_config(struct vdpa_device *vdev, unsigned int offset,
+		     void *buf, unsigned int length)
+{
+	vdev->config->set_config(vdev, offset, buf, length);
+}
+EXPORT_SYMBOL_GPL(vdpa_set_config);
+
 static bool mgmtdev_handle_match(const struct vdpa_mgmt_dev *mdev,
 				 const char *busname, const char *devname)
 {
diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
index fb41db3da611..908b4fb251b3 100644
--- a/drivers/vhost/vdpa.c
+++ b/drivers/vhost/vdpa.c
@@ -231,7 +231,6 @@ static long vhost_vdpa_set_config(struct vhost_vdpa *v,
 				  struct vhost_vdpa_config __user *c)
 {
 	struct vdpa_device *vdpa = v->vdpa;
-	const struct vdpa_config_ops *ops = vdpa->config;
 	struct vhost_vdpa_config config;
 	unsigned long size = offsetof(struct vhost_vdpa_config, buf);
 	u8 *buf;
@@ -245,7 +244,7 @@ static long vhost_vdpa_set_config(struct vhost_vdpa *v,
 	if (IS_ERR(buf))
 		return PTR_ERR(buf);
 
-	ops->set_config(vdpa, config.off, buf, config.len);
+	vdpa_set_config(vdpa, config.off, buf, config.len);
 
 	kvfree(buf);
 	return 0;
diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h
index f311d227aa1b..993d99519452 100644
--- a/include/linux/vdpa.h
+++ b/include/linux/vdpa.h
@@ -332,20 +332,10 @@ static inline int vdpa_set_features(struct vdpa_device *vdev, u64 features)
         return ops->set_features(vdev, features);
 }
 
-
-static inline void vdpa_get_config(struct vdpa_device *vdev, unsigned offset,
-				   void *buf, unsigned int len)
-{
-        const struct vdpa_config_ops *ops = vdev->config;
-
-	/*
-	 * Config accesses aren't supposed to trigger before features are set.
-	 * If it does happen we assume a legacy guest.
-	 */
-	if (!vdev->features_valid)
-		vdpa_set_features(vdev, 0);
-	ops->get_config(vdev, offset, buf, len);
-}
+void vdpa_get_config(struct vdpa_device *vdev, unsigned int offset,
+		     void *buf, unsigned int len);
+void vdpa_set_config(struct vdpa_device *vdev, unsigned int offset,
+		     void *buf, unsigned int length);
 
 /**
  * struct vdpa_mgmtdev_ops - vdpa device ops
-- 
2.26.2


* [PATCH linux-next v3 2/6] vdpa: Introduce query of device config layout
  2021-06-16 19:11 [PATCH linux-next v3 0/6] vdpa: enable user to set mac, mtu Parav Pandit
  2021-06-16 19:11 ` [PATCH linux-next v3 1/6] vdpa: Introduce and use vdpa device get, set config helpers Parav Pandit
@ 2021-06-16 19:11 ` Parav Pandit
  2021-06-22  7:20   ` Jason Wang
  2021-06-16 19:11 ` [PATCH linux-next v3 3/6] vdpa: Enable user to set mac and mtu of vdpa device Parav Pandit
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 62+ messages in thread
From: Parav Pandit @ 2021-06-16 19:11 UTC (permalink / raw)
  To: virtualization; +Cc: elic, mst

Introduce a command to query a device config layout.

An example query of a network vdpa device:

$ vdpa dev add name bar mgmtdev vdpasim_net

$ vdpa dev config show
bar: mac 00:35:09:19:48:05 link up link_announce false mtu 1500 speed 0 duplex 0

$ vdpa dev config show -jp
{
    "config": {
        "bar": {
            "mac": "00:35:09:19:48:05",
            "link ": "up",
            "link_announce ": false,
            "mtu": 1500,
            "speed": 0,
            "duplex": 0
        }
    }
}

Signed-off-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Eli Cohen <elic@nvidia.com>
---
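
The query path reads the whole net config and converts each multi-byte
field from little-endian before reporting it. A small userspace sketch of
that decoding step follows, assuming the packed struct virtio_net_config
layout (mac at offset 0, status at 6, max_virtqueue_pairs at 8, mtu at
10). The sketch_* helpers are hypothetical stand-ins for le16_to_cpu() on
a raw byte buffer.

```c
#include <assert.h>
#include <stdint.h>

/* Byte offsets within the packed struct virtio_net_config. */
#define SKETCH_OFF_STATUS 6
#define SKETCH_OFF_MTU    10

/* Link-up bit in the status field (VIRTIO_NET_S_LINK_UP). */
#define SKETCH_S_LINK_UP  1

/* Portable little-endian 16-bit load, independent of host endianness. */
uint16_t sketch_le16(const uint8_t *p)
{
	assert(p);
	return (uint16_t)(p[0] | (p[1] << 8));
}

uint16_t sketch_cfg_mtu(const uint8_t *cfg)
{
	return sketch_le16(cfg + SKETCH_OFF_MTU);
}

int sketch_cfg_link_up(const uint8_t *cfg)
{
	return (sketch_le16(cfg + SKETCH_OFF_STATUS) & SKETCH_S_LINK_UP) != 0;
}
```

With a buffer whose bytes 10 and 11 are 0xdc and 0x05, sketch_cfg_mtu()
yields 1500, the default mtu shown in the example output above.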
 drivers/vdpa/vdpa.c       | 212 ++++++++++++++++++++++++++++++++++++++
 include/linux/vdpa.h      |   2 +
 include/uapi/linux/vdpa.h |  11 ++
 3 files changed, 225 insertions(+)

diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c
index bc44cdc34114..1295528244c3 100644
--- a/drivers/vdpa/vdpa.c
+++ b/drivers/vdpa/vdpa.c
@@ -14,6 +14,8 @@
 #include <uapi/linux/vdpa.h>
 #include <net/genetlink.h>
 #include <linux/mod_devicetable.h>
+#include <linux/virtio_net.h>
+#include <linux/virtio_ids.h>
 
 static LIST_HEAD(mdev_head);
 /* A global mutex that protects vdpa management device and device level operations. */
@@ -60,6 +62,7 @@ static void vdpa_release_dev(struct device *d)
 		ops->free(vdev);
 
 	ida_simple_remove(&vdpa_index_ida, vdev->index);
+	mutex_destroy(&vdev->cf_mutex);
 	kfree(vdev);
 }
 
@@ -114,6 +117,7 @@ struct vdpa_device *__vdpa_alloc_device(struct device *parent,
 	if (err)
 		goto err_name;
 
+	mutex_init(&vdev->cf_mutex);
 	device_initialize(&vdev->dev);
 
 	return vdev;
@@ -296,6 +300,7 @@ void vdpa_get_config(struct vdpa_device *vdev, unsigned int offset,
 {
 	const struct vdpa_config_ops *ops = vdev->config;
 
+	mutex_lock(&vdev->cf_mutex);
 	/*
 	 * Config accesses aren't supposed to trigger before features are set.
 	 * If it does happen we assume a legacy guest.
@@ -303,6 +308,7 @@ void vdpa_get_config(struct vdpa_device *vdev, unsigned int offset,
 	if (!vdev->features_valid)
 		vdpa_set_features(vdev, 0);
 	ops->get_config(vdev, offset, buf, len);
+	mutex_unlock(&vdev->cf_mutex);
 }
 EXPORT_SYMBOL_GPL(vdpa_get_config);
 
@@ -316,7 +322,9 @@ EXPORT_SYMBOL_GPL(vdpa_get_config);
 void vdpa_set_config(struct vdpa_device *vdev, unsigned int offset,
 		     void *buf, unsigned int length)
 {
+	mutex_lock(&vdev->cf_mutex);
 	vdev->config->set_config(vdev, offset, buf, length);
+	mutex_unlock(&vdev->cf_mutex);
 }
 EXPORT_SYMBOL_GPL(vdpa_set_config);
 
@@ -643,6 +651,204 @@ static int vdpa_nl_cmd_dev_get_dumpit(struct sk_buff *msg, struct netlink_callba
 	return msg->len;
 }
 
+static int vdpa_dev_net_mq_config_fill(struct vdpa_device *vdev,
+				       struct sk_buff *msg, u64 features,
+				       const struct virtio_net_config *config)
+{
+	u16 val_u16;
+
+	if ((features & (1ULL << VIRTIO_NET_F_MQ)) == 0)
+		return 0;
+
+	val_u16 = le16_to_cpu(config->max_virtqueue_pairs);
+	return nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MAX_VQP, val_u16);
+}
+
+static int vdpa_dev_net_rss_config_fill(struct vdpa_device *vdev,
+					struct sk_buff *msg, u64 features,
+					const struct virtio_net_config *config)
+{
+	u16 val_u16;
+	u32 val_u32;
+
+	if ((features & (1ULL << VIRTIO_NET_F_RSS)) == 0)
+		return 0;
+
+	if (nla_put_u8(msg, VDPA_ATTR_DEV_NET_CFG_RSS_MAX_KEY_LEN,
+		       config->rss_max_key_size))
+		return -EMSGSIZE;
+
+	val_u16 = le16_to_cpu(config->rss_max_indirection_table_length);
+	if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_RSS_MAX_IT_LEN, val_u16))
+		return -EMSGSIZE;
+
+	val_u32 = le32_to_cpu(config->supported_hash_types);
+	if (nla_put_u32(msg, VDPA_ATTR_DEV_NET_CFG_RSS_HASH_TYPES, val_u32))
+		return -EMSGSIZE;
+	return 0;
+}
+
+static int vdpa_dev_net_config_fill(struct vdpa_device *vdev, struct sk_buff *msg)
+{
+	struct virtio_net_config config = {};
+	u64 features;
+	u32 val_u32;
+	u16 val_u16;
+	int err;
+
+	vdpa_get_config(vdev, 0, &config, sizeof(config));
+
+	if (nla_put(msg, VDPA_ATTR_DEV_NET_CFG_MACADDR, sizeof(config.mac),
+		    config.mac))
+		return -EMSGSIZE;
+
+	val_u16 = le16_to_cpu(config.status);
+	if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_STATUS, val_u16))
+		return -EMSGSIZE;
+
+	val_u16 = le16_to_cpu(config.mtu);
+	if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MTU, val_u16))
+		return -EMSGSIZE;
+
+	val_u32 = le32_to_cpu(config.speed);
+	if (nla_put_u32(msg, VDPA_ATTR_DEV_NET_CFG_SPEED, val_u32))
+		return -EMSGSIZE;
+
+	if (nla_put_u8(msg, VDPA_ATTR_DEV_NET_CFG_DUPLEX, config.duplex))
+		return -EMSGSIZE;
+
+	features = vdev->config->get_features(vdev);
+
+	err = vdpa_dev_net_mq_config_fill(vdev, msg, features, &config);
+	if (err)
+		return err;
+	return vdpa_dev_net_rss_config_fill(vdev, msg, features, &config);
+}
+
+static int
+vdpa_dev_config_fill(struct vdpa_device *vdev, struct sk_buff *msg, u32 portid, u32 seq,
+		     int flags, struct netlink_ext_ack *extack)
+{
+	u32 device_id;
+	void *hdr;
+	int err;
+
+	hdr = genlmsg_put(msg, portid, seq, &vdpa_nl_family, flags,
+			  VDPA_CMD_DEV_CONFIG_GET);
+	if (!hdr)
+		return -EMSGSIZE;
+
+	if (nla_put_string(msg, VDPA_ATTR_DEV_NAME, dev_name(&vdev->dev))) {
+		err = -EMSGSIZE;
+		goto msg_err;
+	}
+
+	device_id = vdev->config->get_device_id(vdev);
+	if (nla_put_u32(msg, VDPA_ATTR_DEV_ID, device_id)) {
+		err = -EMSGSIZE;
+		goto msg_err;
+	}
+
+	switch (device_id) {
+	case VIRTIO_ID_NET:
+		err = vdpa_dev_net_config_fill(vdev, msg);
+		break;
+	default:
+		err = -EOPNOTSUPP;
+		break;
+	}
+	if (err)
+		goto msg_err;
+
+	genlmsg_end(msg, hdr);
+	return 0;
+
+msg_err:
+	genlmsg_cancel(msg, hdr);
+	return err;
+}
+
+static int vdpa_nl_cmd_dev_config_get_doit(struct sk_buff *skb, struct genl_info *info)
+{
+	struct vdpa_device *vdev;
+	struct sk_buff *msg;
+	const char *devname;
+	struct device *dev;
+	int err;
+
+	if (!info->attrs[VDPA_ATTR_DEV_NAME])
+		return -EINVAL;
+	devname = nla_data(info->attrs[VDPA_ATTR_DEV_NAME]);
+	msg = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL);
+	if (!msg)
+		return -ENOMEM;
+
+	mutex_lock(&vdpa_dev_mutex);
+	dev = bus_find_device(&vdpa_bus, NULL, devname, vdpa_name_match);
+	if (!dev) {
+		NL_SET_ERR_MSG_MOD(info->extack, "device not found");
+		err = -ENODEV;
+		goto dev_err;
+	}
+	vdev = container_of(dev, struct vdpa_device, dev);
+	if (!vdev->mdev) {
+		NL_SET_ERR_MSG_MOD(info->extack, "unmanaged vdpa device");
+		err = -EINVAL;
+		goto mdev_err;
+	}
+	err = vdpa_dev_config_fill(vdev, msg, info->snd_portid, info->snd_seq,
+				   0, info->extack);
+	if (!err)
+		err = genlmsg_reply(msg, info);
+
+mdev_err:
+	put_device(dev);
+dev_err:
+	mutex_unlock(&vdpa_dev_mutex);
+	if (err)
+		nlmsg_free(msg);
+	return err;
+}
+
+static int vdpa_dev_config_dump(struct device *dev, void *data)
+{
+	struct vdpa_device *vdev = container_of(dev, struct vdpa_device, dev);
+	struct vdpa_dev_dump_info *info = data;
+	int err;
+
+	if (!vdev->mdev)
+		return 0;
+	if (info->idx < info->start_idx) {
+		info->idx++;
+		return 0;
+	}
+	err = vdpa_dev_config_fill(vdev, info->msg, NETLINK_CB(info->cb->skb).portid,
+				   info->cb->nlh->nlmsg_seq, NLM_F_MULTI,
+				   info->cb->extack);
+	if (err)
+		return err;
+
+	info->idx++;
+	return 0;
+}
+
+static int
+vdpa_nl_cmd_dev_config_get_dumpit(struct sk_buff *msg, struct netlink_callback *cb)
+{
+	struct vdpa_dev_dump_info info;
+
+	info.msg = msg;
+	info.cb = cb;
+	info.start_idx = cb->args[0];
+	info.idx = 0;
+
+	mutex_lock(&vdpa_dev_mutex);
+	bus_for_each_dev(&vdpa_bus, NULL, &info, vdpa_dev_config_dump);
+	mutex_unlock(&vdpa_dev_mutex);
+	cb->args[0] = info.idx;
+	return msg->len;
+}
+
 static const struct nla_policy vdpa_nl_policy[VDPA_ATTR_MAX + 1] = {
 	[VDPA_ATTR_MGMTDEV_BUS_NAME] = { .type = NLA_NUL_STRING },
 	[VDPA_ATTR_MGMTDEV_DEV_NAME] = { .type = NLA_STRING },
@@ -674,6 +880,12 @@ static const struct genl_ops vdpa_nl_ops[] = {
 		.doit = vdpa_nl_cmd_dev_get_doit,
 		.dumpit = vdpa_nl_cmd_dev_get_dumpit,
 	},
+	{
+		.cmd = VDPA_CMD_DEV_CONFIG_GET,
+		.validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP,
+		.doit = vdpa_nl_cmd_dev_config_get_doit,
+		.dumpit = vdpa_nl_cmd_dev_config_get_dumpit,
+	},
 };
 
 static struct genl_family vdpa_nl_family __ro_after_init = {
diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h
index 993d99519452..bf104f9f461a 100644
--- a/include/linux/vdpa.h
+++ b/include/linux/vdpa.h
@@ -42,6 +42,7 @@ struct vdpa_mgmt_dev;
  * @dev: underlying device
  * @dma_dev: the actual device that is performing DMA
  * @config: the configuration ops for this device.
+ * @cf_mutex: Protects get and set access to features and configuration layout.
  * @index: device index
  * @features_valid: were features initialized? for legacy guests
  * @nvqs: maximum number of supported virtqueues
@@ -52,6 +53,7 @@ struct vdpa_device {
 	struct device dev;
 	struct device *dma_dev;
 	const struct vdpa_config_ops *config;
+	struct mutex cf_mutex; /* Protects get/set config and features */
 	unsigned int index;
 	bool features_valid;
 	int nvqs;
diff --git a/include/uapi/linux/vdpa.h b/include/uapi/linux/vdpa.h
index 66a41e4ec163..5c31ecc3b956 100644
--- a/include/uapi/linux/vdpa.h
+++ b/include/uapi/linux/vdpa.h
@@ -17,6 +17,7 @@ enum vdpa_command {
 	VDPA_CMD_DEV_NEW,
 	VDPA_CMD_DEV_DEL,
 	VDPA_CMD_DEV_GET,		/* can dump */
+	VDPA_CMD_DEV_CONFIG_GET,	/* can dump */
 };
 
 enum vdpa_attr {
@@ -33,6 +34,16 @@ enum vdpa_attr {
 	VDPA_ATTR_DEV_MAX_VQS,			/* u32 */
 	VDPA_ATTR_DEV_MAX_VQ_SIZE,		/* u16 */
 
+	VDPA_ATTR_DEV_NET_CFG_MACADDR,		/* binary */
+	VDPA_ATTR_DEV_NET_STATUS,		/* u16 */
+	VDPA_ATTR_DEV_NET_CFG_MAX_VQP,		/* u16 */
+	VDPA_ATTR_DEV_NET_CFG_MTU,		/* u16 */
+	VDPA_ATTR_DEV_NET_CFG_SPEED,		/* u32 */
+	VDPA_ATTR_DEV_NET_CFG_DUPLEX,		/* u8 */
+	VDPA_ATTR_DEV_NET_CFG_RSS_MAX_KEY_LEN,	/* u8 */
+	VDPA_ATTR_DEV_NET_CFG_RSS_MAX_IT_LEN,	/* u16 */
+	VDPA_ATTR_DEV_NET_CFG_RSS_HASH_TYPES,	/* u32 */
+
 	/* new attributes must be added above here */
 	VDPA_ATTR_MAX,
 };
-- 
2.26.2


* [PATCH linux-next v3 3/6] vdpa: Enable user to set mac and mtu of vdpa device
  2021-06-16 19:11 [PATCH linux-next v3 0/6] vdpa: enable user to set mac, mtu Parav Pandit
  2021-06-16 19:11 ` [PATCH linux-next v3 1/6] vdpa: Introduce and use vdpa device get, set config helpers Parav Pandit
  2021-06-16 19:11 ` [PATCH linux-next v3 2/6] vdpa: Introduce query of device config layout Parav Pandit
@ 2021-06-16 19:11 ` Parav Pandit
  2021-06-22  7:43   ` Jason Wang
  2021-06-16 19:11 ` [PATCH linux-next v3 4/6] vdpa_sim_net: Enable user to set mac address and mtu Parav Pandit
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 62+ messages in thread
From: Parav Pandit @ 2021-06-16 19:11 UTC (permalink / raw)
  To: virtualization; +Cc: elic, mst

$ vdpa dev add name bar mgmtdev vdpasim_net

$ vdpa dev config set bar mac 00:11:22:33:44:55 mtu 9000

$ vdpa dev config show
bar: mac 00:11:22:33:44:55 link up link_announce false mtu 9000 speed 0 duplex 0

$ vdpa dev config show -jp
{
    "config": {
        "bar": {
            "mac": "00:11:22:33:44:55",
            "link ": "up",
            "link_announce ": false,
            "mtu": 9000,
            "speed": 0,
            "duplex": 0
        }
    }
}

Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Eli Cohen <elic@nvidia.com>
---
changelog:
v2->v3:
 - using new setup_config callback to setup device params via mgmt tool
   to avoid mixing with existing set_config().
---
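
To show how the validity mask is meant to be used, here is a userspace
sketch that builds a set request from optional inputs, loosely modeled on
vdpa_dev_net_config_set() in this patch. A NULL pointer stands in for an
absent netlink attribute, and the MTU check stands in for the
NLA_POLICY_MIN(NLA_U16, 68) policy. The sketch_* names are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_ETH_ALEN 6
#define SKETCH_MIN_MTU  68

/* Mirrors the shape of struct vdpa_dev_set_config from this patch. */
struct sketch_set_config {
	struct {
		uint8_t mac[SKETCH_ETH_ALEN];
		uint16_t mtu;
	} net;
	struct {
		unsigned int mac_valid : 1;
		unsigned int mtu_valid : 1;
	} net_mask;
};

/* mac or mtu may be NULL, meaning "attribute not supplied". */
int sketch_build_request(struct sketch_set_config *out,
			 const uint8_t *mac, const uint16_t *mtu)
{
	assert(out);
	memset(out, 0, sizeof(*out));

	if (mac) {
		memcpy(out->net.mac, mac, SKETCH_ETH_ALEN);
		out->net_mask.mac_valid = 1;
	}
	if (mtu) {
		if (*mtu < SKETCH_MIN_MTU)
			return -EINVAL;
		out->net.mtu = *mtu;
		out->net_mask.mtu_valid = 1;
	}
	return 0;
}
```

A request carrying only an mtu leaves mac_valid clear, so a driver can
tell "mac not supplied" apart from an all-zero mac address.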
 drivers/vdpa/vdpa.c       | 91 ++++++++++++++++++++++++++++++++++++++-
 include/linux/vdpa.h      | 18 ++++++++
 include/uapi/linux/vdpa.h |  1 +
 3 files changed, 109 insertions(+), 1 deletion(-)

diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c
index 1295528244c3..40874bd92126 100644
--- a/drivers/vdpa/vdpa.c
+++ b/drivers/vdpa/vdpa.c
@@ -14,7 +14,6 @@
 #include <uapi/linux/vdpa.h>
 #include <net/genetlink.h>
 #include <linux/mod_devicetable.h>
-#include <linux/virtio_net.h>
 #include <linux/virtio_ids.h>
 
 static LIST_HEAD(mdev_head);
@@ -849,10 +848,94 @@ vdpa_nl_cmd_dev_config_get_dumpit(struct sk_buff *msg, struct netlink_callback *
 	return msg->len;
 }
 
+static int vdpa_dev_net_config_set(struct vdpa_device *vdev,
+				   struct sk_buff *skb, struct genl_info *info)
+{
+	struct nlattr **nl_attrs = info->attrs;
+	struct vdpa_dev_set_config config = {};
+	const u8 *macaddr;
+	int err;
+
+	if (!netlink_capable(skb, CAP_NET_ADMIN))
+		return -EPERM;
+
+	if (!vdev->config->setup_config)
+		return -EOPNOTSUPP;
+
+	if (nl_attrs[VDPA_ATTR_DEV_NET_CFG_MACADDR]) {
+		macaddr = nla_data(nl_attrs[VDPA_ATTR_DEV_NET_CFG_MACADDR]);
+		memcpy(config.net.mac, macaddr, sizeof(config.net.mac));
+		config.net_mask.mac_valid = true;
+	}
+	if (nl_attrs[VDPA_ATTR_DEV_NET_CFG_MTU]) {
+		config.net.mtu =
+			nla_get_u16(nl_attrs[VDPA_ATTR_DEV_NET_CFG_MTU]);
+		config.net_mask.mtu_valid = true;
+	}
+
+	mutex_lock(&vdev->cf_mutex);
+	err = vdev->config->setup_config(vdev, &config);
+	mutex_unlock(&vdev->cf_mutex);
+	return err;
+}
+
+static int vdpa_dev_config_set(struct vdpa_device *vdev, struct sk_buff *skb,
+			       struct genl_info *info)
+{
+	int err = -EOPNOTSUPP;
+	u32 device_id;
+
+	if (!vdev->mdev)
+		return -EOPNOTSUPP;
+
+	device_id = vdev->config->get_device_id(vdev);
+	switch (device_id) {
+	case VIRTIO_ID_NET:
+		err = vdpa_dev_net_config_set(vdev, skb, info);
+		break;
+	default:
+		break;
+	}
+	return err;
+}
+
+static int vdpa_nl_cmd_dev_config_set_doit(struct sk_buff *skb, struct genl_info *info)
+{
+	struct vdpa_device *vdev;
+	const char *devname;
+	struct device *dev;
+	int err;
+
+	if (!info->attrs[VDPA_ATTR_DEV_NAME])
+		return -EINVAL;
+	devname = nla_data(info->attrs[VDPA_ATTR_DEV_NAME]);
+
+	mutex_lock(&vdpa_dev_mutex);
+	dev = bus_find_device(&vdpa_bus, NULL, devname, vdpa_name_match);
+	if (!dev) {
+		mutex_unlock(&vdpa_dev_mutex);
+		NL_SET_ERR_MSG_MOD(info->extack, "device not found");
+		return -ENODEV;
+	}
+	vdev = container_of(dev, struct vdpa_device, dev);
+	if (!vdev->mdev) {
+		mutex_unlock(&vdpa_dev_mutex);
+		put_device(dev);
+		return -EINVAL;
+	}
+	err = vdpa_dev_config_set(vdev, skb, info);
+	put_device(dev);
+	mutex_unlock(&vdpa_dev_mutex);
+	return err;
+}
+
 static const struct nla_policy vdpa_nl_policy[VDPA_ATTR_MAX + 1] = {
 	[VDPA_ATTR_MGMTDEV_BUS_NAME] = { .type = NLA_NUL_STRING },
 	[VDPA_ATTR_MGMTDEV_DEV_NAME] = { .type = NLA_STRING },
 	[VDPA_ATTR_DEV_NAME] = { .type = NLA_STRING },
+	[VDPA_ATTR_DEV_NET_CFG_MACADDR] = NLA_POLICY_ETH_ADDR,
+	/* virtio spec 1.1 section 5.1.4.1 for valid MTU range */
+	[VDPA_ATTR_DEV_NET_CFG_MTU] = NLA_POLICY_MIN(NLA_U16, 68),
 };
 
 static const struct genl_ops vdpa_nl_ops[] = {
@@ -886,6 +969,12 @@ static const struct genl_ops vdpa_nl_ops[] = {
 		.doit = vdpa_nl_cmd_dev_config_get_doit,
 		.dumpit = vdpa_nl_cmd_dev_config_get_dumpit,
 	},
+	{
+		.cmd = VDPA_CMD_DEV_CONFIG_SET,
+		.validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP,
+		.doit = vdpa_nl_cmd_dev_config_set_doit,
+		.flags = GENL_ADMIN_PERM,
+	},
 };
 
 static struct genl_family vdpa_nl_family __ro_after_init = {
diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h
index bf104f9f461a..9b7238d5310e 100644
--- a/include/linux/vdpa.h
+++ b/include/linux/vdpa.h
@@ -6,6 +6,8 @@
 #include <linux/device.h>
 #include <linux/interrupt.h>
 #include <linux/vhost_iotlb.h>
+#include <linux/virtio_net.h>
+#include <linux/if_ether.h>
 
 /**
  * struct vdpa_calllback - vDPA callback definition.
@@ -70,6 +72,17 @@ struct vdpa_iova_range {
 	u64 last;
 };
 
+struct vdpa_dev_set_config {
+	struct {
+		u8 mac[ETH_ALEN];
+		u16 mtu;
+	} net;
+	struct {
+		u8 mac_valid: 1;
+		u8 mtu_valid: 1;
+	} net_mask;
+};
+
 /**
  * struct vdpa_config_ops - operations for configuring a vDPA device.
  * Note: vDPA device drivers are required to implement all of the
@@ -169,6 +182,9 @@ struct vdpa_iova_range {
  *				@buf: buffer used to write from
  *				@len: the length to write to
  *				configuration space
+ * @setup_config:		Setup configuration space
+ *				@vdev: vdpa device
+ *				@config: configuration to apply to device
  * @get_generation:		Get device config generation (optional)
  *				@vdev: vdpa device
  *				Returns u32: device generation
@@ -241,6 +257,8 @@ struct vdpa_config_ops {
 			   void *buf, unsigned int len);
 	void (*set_config)(struct vdpa_device *vdev, unsigned int offset,
 			   const void *buf, unsigned int len);
+	int (*setup_config)(struct vdpa_device *vdev,
+			    const struct vdpa_dev_set_config *config);
 	u32 (*get_generation)(struct vdpa_device *vdev);
 	struct vdpa_iova_range (*get_iova_range)(struct vdpa_device *vdev);
 
diff --git a/include/uapi/linux/vdpa.h b/include/uapi/linux/vdpa.h
index 5c31ecc3b956..ec349789b8d1 100644
--- a/include/uapi/linux/vdpa.h
+++ b/include/uapi/linux/vdpa.h
@@ -18,6 +18,7 @@ enum vdpa_command {
 	VDPA_CMD_DEV_DEL,
 	VDPA_CMD_DEV_GET,		/* can dump */
 	VDPA_CMD_DEV_CONFIG_GET,	/* can dump */
+	VDPA_CMD_DEV_CONFIG_SET,
 };
 
 enum vdpa_attr {
-- 
2.26.2


* [PATCH linux-next v3 4/6] vdpa_sim_net: Enable user to set mac address and mtu
  2021-06-16 19:11 [PATCH linux-next v3 0/6] vdpa: enable user to set mac, mtu Parav Pandit
                   ` (2 preceding siblings ...)
  2021-06-16 19:11 ` [PATCH linux-next v3 3/6] vdpa: Enable user to set mac and mtu of vdpa device Parav Pandit
@ 2021-06-16 19:11 ` Parav Pandit
  2021-06-16 19:11 ` [PATCH linux-next v3 5/6] vdpa/mlx5: Support configuration of MAC Parav Pandit
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 62+ messages in thread
From: Parav Pandit @ 2021-06-16 19:11 UTC (permalink / raw)
  To: virtualization; +Cc: elic, mst

Enable user to set the mac address and mtu so that each vdpa device
can have its own user-specified mac address and mtu.
This is done by implementing the management device's configuration
layout fields setting callback routine.

Now that the user can set the mac address, remove the module
parameter that served the same purpose.

An example of setting mac addr and mtu:
$ vdpa mgmtdev show

$ vdpa dev add name bar mgmtdev vdpasim_net
$ vdpa dev config set bar mac 00:11:22:33:44:55 mtu 9000

View the config after setting:
$ vdpa dev config show
bar: mac 00:11:22:33:44:55 link up link_announce false mtu 9000 speed 0 duplex 0

Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Eli Cohen <elic@nvidia.com>
---
changelog:
v1->v2:
 - using updated interface and callbacks for get/set config
---
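
The driver-side half of the scheme can be sketched in userspace as
follows: only fields flagged in the request overwrite the stored config,
so an untouched mtu keeps its default of 1500. This is a simplified
stand-in for vdpasim_net_setup_config(); the sketch_* types are
hypothetical, and the mtu is stored little-endian on the assumption of a
non-transitional device.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical request: values plus one validity bit per field. */
struct sketch_req {
	uint8_t mac[6];
	uint16_t mtu;
	unsigned int mac_valid : 1;
	unsigned int mtu_valid : 1;
};

/* Simplified stand-in for the device's stored net config. */
struct sketch_net_config {
	uint8_t mac[6];
	uint8_t mtu_le[2];	/* kept little-endian, byte by byte */
};

static void sketch_put_le16(uint8_t *p, uint16_t v)
{
	p[0] = (uint8_t)(v & 0xff);
	p[1] = (uint8_t)(v >> 8);
}

void sketch_config_init(struct sketch_net_config *cfg)
{
	memset(cfg, 0, sizeof(*cfg));
	sketch_put_le16(cfg->mtu_le, 1500);	/* default mtu, as in the patch */
}

/* Apply only the fields the request marks as valid. */
void sketch_setup_config(struct sketch_net_config *cfg,
			 const struct sketch_req *req)
{
	assert(cfg && req);
	if (req->mac_valid)
		memcpy(cfg->mac, req->mac, sizeof(cfg->mac));
	if (req->mtu_valid)
		sketch_put_le16(cfg->mtu_le, req->mtu);
}
```

Setting only the mac this way leaves the stored mtu bytes at 0xdc 0x05,
that is, 1500.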
 drivers/vdpa/vdpa_sim/vdpa_sim.c     | 13 +++++++++++
 drivers/vdpa/vdpa_sim/vdpa_sim.h     |  2 ++
 drivers/vdpa/vdpa_sim/vdpa_sim_net.c | 34 +++++++++++++++-------------
 3 files changed, 33 insertions(+), 16 deletions(-)

diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.c b/drivers/vdpa/vdpa_sim/vdpa_sim.c
index 98f793bc9376..e57cd1ff47e3 100644
--- a/drivers/vdpa/vdpa_sim/vdpa_sim.c
+++ b/drivers/vdpa/vdpa_sim/vdpa_sim.c
@@ -497,6 +497,17 @@ static void vdpasim_set_config(struct vdpa_device *vdpa, unsigned int offset,
 		vdpasim->dev_attr.set_config(vdpasim, vdpasim->config);
 }
 
+static int vdpasim_setup_config(struct vdpa_device *vdpa,
+				const struct vdpa_dev_set_config *config)
+{
+	struct vdpasim *vdpasim = vdpa_to_sim(vdpa);
+
+	if (!vdpasim->dev_attr.setup_config)
+		return -EOPNOTSUPP;
+
+	return vdpasim->dev_attr.setup_config(vdpasim, config);
+}
+
 static u32 vdpasim_get_generation(struct vdpa_device *vdpa)
 {
 	struct vdpasim *vdpasim = vdpa_to_sim(vdpa);
@@ -608,6 +619,7 @@ static const struct vdpa_config_ops vdpasim_config_ops = {
 	.get_config_size        = vdpasim_get_config_size,
 	.get_config             = vdpasim_get_config,
 	.set_config             = vdpasim_set_config,
+	.setup_config		= vdpasim_setup_config,
 	.get_generation         = vdpasim_get_generation,
 	.get_iova_range         = vdpasim_get_iova_range,
 	.dma_map                = vdpasim_dma_map,
@@ -636,6 +648,7 @@ static const struct vdpa_config_ops vdpasim_batch_config_ops = {
 	.get_config_size        = vdpasim_get_config_size,
 	.get_config             = vdpasim_get_config,
 	.set_config             = vdpasim_set_config,
+	.setup_config		= vdpasim_setup_config,
 	.get_generation         = vdpasim_get_generation,
 	.get_iova_range         = vdpasim_get_iova_range,
 	.set_map                = vdpasim_set_map,
diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.h b/drivers/vdpa/vdpa_sim/vdpa_sim.h
index cd58e888bcf3..395894635010 100644
--- a/drivers/vdpa/vdpa_sim/vdpa_sim.h
+++ b/drivers/vdpa/vdpa_sim/vdpa_sim.h
@@ -45,6 +45,8 @@ struct vdpasim_dev_attr {
 	work_func_t work_fn;
 	void (*get_config)(struct vdpasim *vdpasim, void *config);
 	void (*set_config)(struct vdpasim *vdpasim, const void *config);
+	int (*setup_config)(struct vdpasim *vdpasim,
+			    const struct vdpa_dev_set_config *config);
 };
 
 /* State of each vdpasim device */
diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim_net.c b/drivers/vdpa/vdpa_sim/vdpa_sim_net.c
index a1ab6163f7d1..5fcee88a89c5 100644
--- a/drivers/vdpa/vdpa_sim/vdpa_sim_net.c
+++ b/drivers/vdpa/vdpa_sim/vdpa_sim_net.c
@@ -29,12 +29,6 @@
 
 #define VDPASIM_NET_VQ_NUM	2
 
-static char *macaddr;
-module_param(macaddr, charp, 0);
-MODULE_PARM_DESC(macaddr, "Ethernet MAC address");
-
-static u8 macaddr_buf[ETH_ALEN];
-
 static void vdpasim_net_work(struct work_struct *work)
 {
 	struct vdpasim *vdpasim = container_of(work, struct vdpasim, work);
@@ -112,9 +106,19 @@ static void vdpasim_net_get_config(struct vdpasim *vdpasim, void *config)
 {
 	struct virtio_net_config *net_config = config;
 
-	net_config->mtu = cpu_to_vdpasim16(vdpasim, 1500);
 	net_config->status = cpu_to_vdpasim16(vdpasim, VIRTIO_NET_S_LINK_UP);
-	memcpy(net_config->mac, macaddr_buf, ETH_ALEN);
+}
+
+static int vdpasim_net_setup_config(struct vdpasim *vdpasim,
+				    const struct vdpa_dev_set_config *config)
+{
+	struct virtio_net_config *vio_config = vdpasim->config;
+
+	if (config->net_mask.mac_valid)
+		memcpy(vio_config->mac, config->net.mac, ETH_ALEN);
+	if (config->net_mask.mtu_valid)
+		vio_config->mtu = cpu_to_vdpasim16(vdpasim, config->net.mtu);
+	return 0;
 }
 
 static void vdpasim_net_mgmtdev_release(struct device *dev)
@@ -128,6 +132,7 @@ static struct device vdpasim_net_mgmtdev = {
 
 static int vdpasim_net_dev_add(struct vdpa_mgmt_dev *mdev, const char *name)
 {
+	struct virtio_net_config *vio_config;
 	struct vdpasim_dev_attr dev_attr = {};
 	struct vdpasim *simdev;
 	int ret;
@@ -139,6 +144,7 @@ static int vdpasim_net_dev_add(struct vdpa_mgmt_dev *mdev, const char *name)
 	dev_attr.nvqs = VDPASIM_NET_VQ_NUM;
 	dev_attr.config_size = sizeof(struct virtio_net_config);
 	dev_attr.get_config = vdpasim_net_get_config;
+	dev_attr.setup_config = vdpasim_net_setup_config;
 	dev_attr.work_fn = vdpasim_net_work;
 	dev_attr.buffer_size = PAGE_SIZE;
 
@@ -146,6 +152,10 @@ static int vdpasim_net_dev_add(struct vdpa_mgmt_dev *mdev, const char *name)
 	if (IS_ERR(simdev))
 		return PTR_ERR(simdev);
 
+	vio_config = simdev->config;
+	/* Setup default MTU to be 1500 */
+	vio_config->mtu = cpu_to_le16(1500);
+
 	ret = _vdpa_register_device(&simdev->vdpa, VDPASIM_NET_VQ_NUM);
 	if (ret)
 		goto reg_err;
@@ -185,14 +195,6 @@ static int __init vdpasim_net_init(void)
 {
 	int ret;
 
-	if (macaddr) {
-		mac_pton(macaddr, macaddr_buf);
-		if (!is_valid_ether_addr(macaddr_buf))
-			return -EADDRNOTAVAIL;
-	} else {
-		eth_random_addr(macaddr_buf);
-	}
-
 	ret = device_register(&vdpasim_net_mgmtdev);
 	if (ret)
 		return ret;
-- 
2.26.2

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH linux-next v3 5/6] vdpa/mlx5: Support configuration of MAC
  2021-06-16 19:11 [PATCH linux-next v3 0/6] vdpa: enable user to set mac, mtu Parav Pandit
                   ` (3 preceding siblings ...)
  2021-06-16 19:11 ` [PATCH linux-next v3 4/6] vdpa_sim_net: Enable user to set mac address and mtu Parav Pandit
@ 2021-06-16 19:11 ` Parav Pandit
  2021-06-16 19:11 ` [PATCH linux-next v3 6/6] vdpa/mlx5: Forward only packets with allowed MAC address Parav Pandit
  2021-08-05  9:57 ` [PATCH linux-next v3 0/6] vdpa: enable user to set mac, mtu Michael S. Tsirkin
  6 siblings, 0 replies; 62+ messages in thread
From: Parav Pandit @ 2021-06-16 19:11 UTC (permalink / raw)
  To: virtualization; +Cc: elic, mst

From: Eli Cohen <elic@nvidia.com>

Add code to accept MAC configuration through the vdpa tool. The MAC is
written into the config struct and can later be retrieved through
get_config().

Examples:
1. Configure MAC:
$ vdpa dev config set vdpa0 mac 00:11:22:33:44:55

2. Show configured params:
$ vdpa dev config show
vdpa0: mac 00:11:22:33:44:55 link down link_announce false mtu 0 speed 0 duplex 0
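
As a rough userspace illustration of the behaviour described above, the
sketch below mimics how a driver might apply only the fields flagged in
the validity mask. The names example_set_config, example_dev and
example_apply_config are hypothetical stand-ins, not the kernel
definitions; the error codes follow the ones used in this patch.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define EXAMPLE_ETH_ALEN 6

/* Hypothetical userspace mirror of struct vdpa_dev_set_config. */
struct example_set_config {
	struct {
		uint8_t mac[EXAMPLE_ETH_ALEN];
		uint16_t mtu;
	} net;
	struct {
		unsigned int mac_valid : 1;
		unsigned int mtu_valid : 1;
	} net_mask;
};

/* Hypothetical device state: stored MAC plus a "resources set up" flag. */
struct example_dev {
	uint8_t mac[EXAMPLE_ETH_ALEN];
	bool setup;
};

/* Apply only the fields whose validity bit is set. */
int example_apply_config(struct example_dev *dev,
			 const struct example_set_config *cfg)
{
	if (dev->setup)
		return -EBUSY;		/* device resources already set up */
	if (cfg->net_mask.mtu_valid)
		return -EOPNOTSUPP;	/* MTU change is not supported here */
	if (cfg->net_mask.mac_valid)
		memcpy(dev->mac, cfg->net.mac, EXAMPLE_ETH_ALEN);
	return 0;
}
```

Note that, as in the patch, a request carrying both MAC and MTU is
rejected as a whole, so the MAC is left untouched in that case.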

Signed-off-by: Eli Cohen <elic@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
---
changelog:
v2->v3:
 - following new api for config space setup for mgmt tool
v1->v2:
 - following new api for config get/set for mgmt tool
---
 drivers/vdpa/mlx5/net/mlx5_vnet.c | 25 +++++++++++++++++++++++++
 1 file changed, 25 insertions(+)

diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c
index dda5dc6f7737..7f3d09f201fc 100644
--- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
+++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
@@ -1831,6 +1831,30 @@ static void mlx5_vdpa_set_config(struct vdpa_device *vdev, unsigned int offset,
 	/* not supported */
 }
 
+static int mlx5_vdpa_setup_config(struct vdpa_device *vdev,
+				  const struct vdpa_dev_set_config *config)
+{
+	struct mlx5_vdpa_dev *mvdev = to_mvdev(vdev);
+	struct mlx5_vdpa_net *ndev = to_mlx5_vdpa_ndev(mvdev);
+	int err = 0;
+
+	mutex_lock(&ndev->reslock);
+	if (ndev->setup)
+		err = -EBUSY;
+	mutex_unlock(&ndev->reslock);
+
+	if (err)
+		return err;
+
+	if (config->net_mask.mtu_valid)
+		return -EOPNOTSUPP;
+
+	if (config->net_mask.mac_valid)
+		memcpy(ndev->config.mac, config->net.mac, ETH_ALEN);
+
+	return 0;
+}
+
 static u32 mlx5_vdpa_get_generation(struct vdpa_device *vdev)
 {
 	struct mlx5_vdpa_dev *mvdev = to_mvdev(vdev);
@@ -1909,6 +1933,7 @@ static const struct vdpa_config_ops mlx5_vdpa_ops = {
 	.get_config_size = mlx5_vdpa_get_config_size,
 	.get_config = mlx5_vdpa_get_config,
 	.set_config = mlx5_vdpa_set_config,
+	.setup_config = mlx5_vdpa_setup_config,
 	.get_generation = mlx5_vdpa_get_generation,
 	.set_map = mlx5_vdpa_set_map,
 	.free = mlx5_vdpa_free,
-- 
2.26.2

^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH linux-next v3 6/6] vdpa/mlx5: Forward only packets with allowed MAC address
  2021-06-16 19:11 [PATCH linux-next v3 0/6] vdpa: enable user to set mac, mtu Parav Pandit
                   ` (4 preceding siblings ...)
  2021-06-16 19:11 ` [PATCH linux-next v3 5/6] vdpa/mlx5: Support configuration of MAC Parav Pandit
@ 2021-06-16 19:11 ` Parav Pandit
  2021-08-05  9:57 ` [PATCH linux-next v3 0/6] vdpa: enable user to set mac, mtu Michael S. Tsirkin
  6 siblings, 0 replies; 62+ messages in thread
From: Parav Pandit @ 2021-06-16 19:11 UTC (permalink / raw)
  To: virtualization; +Cc: elic, mst

From: Eli Cohen <elic@nvidia.com>

Add rules to forward packets to the net device's TIR only if the
destination MAC is equal to the configured MAC. This is required to
prevent the netdevice from receiving traffic not destined to its
configured MAC.
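
For readers unfamiliar with masked flow matching, the following
userspace sketch models the two rules in software: an exact match on the
configured MAC, and a match on the multicast (group) bit alone. It is
only an illustration under simplifying assumptions; the example_* names
are hypothetical and no mlx5 API is involved.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define EXAMPLE_ETH_ALEN 6

/* True when dmac equals value on every bit selected by mask. */
static bool example_masked_match(const uint8_t *dmac, const uint8_t *mask,
				 const uint8_t *value)
{
	for (int i = 0; i < EXAMPLE_ETH_ALEN; i++)
		if ((dmac[i] & mask[i]) != (value[i] & mask[i]))
			return false;
	return true;
}

/* Software model of the unicast and multicast rules. */
bool example_rx_allowed(const uint8_t *dmac, const uint8_t *own_mac)
{
	uint8_t mask[EXAMPLE_ETH_ALEN];
	uint8_t value[EXAMPLE_ETH_ALEN];

	/* Unicast rule: compare all 48 bits against the configured MAC. */
	memset(mask, 0xff, sizeof(mask));
	memcpy(value, own_mac, sizeof(value));
	if (example_masked_match(dmac, mask, value))
		return true;

	/* Multicast rule: only the group bit of the first octet must be set. */
	memset(mask, 0, sizeof(mask));
	memset(value, 0, sizeof(value));
	mask[0] = 1;
	value[0] = 1;
	return example_masked_match(dmac, mask, value);
}
```

Because broadcast addresses also have the group bit set, they pass the
second rule in this model.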

Signed-off-by: Eli Cohen <elic@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
---
 drivers/vdpa/mlx5/net/mlx5_vnet.c | 76 +++++++++++++++++++++++--------
 1 file changed, 58 insertions(+), 18 deletions(-)

diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c
index 7f3d09f201fc..f7c8c34e76e9 100644
--- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
+++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
@@ -148,7 +148,8 @@ struct mlx5_vdpa_net {
 	struct mutex reslock;
 	struct mlx5_flow_table *rxft;
 	struct mlx5_fc *rx_counter;
-	struct mlx5_flow_handle *rx_rule;
+	struct mlx5_flow_handle *rx_rule_ucast;
+	struct mlx5_flow_handle *rx_rule_mcast;
 	bool setup;
 	u16 mtu;
 };
@@ -1296,21 +1297,33 @@ static int add_fwd_to_tir(struct mlx5_vdpa_net *ndev)
 	struct mlx5_flow_table_attr ft_attr = {};
 	struct mlx5_flow_act flow_act = {};
 	struct mlx5_flow_namespace *ns;
+	struct mlx5_flow_spec *spec;
+	void *headers_c;
+	void *headers_v;
+	u8 *dmac_c;
+	u8 *dmac_v;
 	int err;
 
-	/* for now, one entry, match all, forward to tir */
-	ft_attr.max_fte = 1;
-	ft_attr.autogroup.max_num_groups = 1;
+	spec = kvzalloc(sizeof(*spec), GFP_KERNEL);
+	if (!spec)
+		return -ENOMEM;
+
+	spec->match_criteria_enable = MLX5_MATCH_OUTER_HEADERS;
+	ft_attr.max_fte = 2;
+	ft_attr.autogroup.max_num_groups = 2;
 
 	ns = mlx5_get_flow_namespace(ndev->mvdev.mdev, MLX5_FLOW_NAMESPACE_BYPASS);
 	if (!ns) {
-		mlx5_vdpa_warn(&ndev->mvdev, "get flow namespace\n");
-		return -EOPNOTSUPP;
+		mlx5_vdpa_warn(&ndev->mvdev, "failed to get flow namespace\n");
+		err = -EOPNOTSUPP;
+		goto err_ns;
 	}
 
 	ndev->rxft = mlx5_create_auto_grouped_flow_table(ns, &ft_attr);
-	if (IS_ERR(ndev->rxft))
-		return PTR_ERR(ndev->rxft);
+	if (IS_ERR(ndev->rxft)) {
+		err = PTR_ERR(ndev->rxft);
+		goto err_ns;
+	}
 
 	ndev->rx_counter = mlx5_fc_create(ndev->mvdev.mdev, false);
 	if (IS_ERR(ndev->rx_counter)) {
@@ -1318,37 +1331,64 @@ static int add_fwd_to_tir(struct mlx5_vdpa_net *ndev)
 		goto err_fc;
 	}
 
+	headers_c = MLX5_ADDR_OF(fte_match_param, spec->match_criteria, outer_headers);
+	dmac_c = MLX5_ADDR_OF(fte_match_param, headers_c, outer_headers.dmac_47_16);
+	memset(dmac_c, 0xff, ETH_ALEN);
+	headers_v = MLX5_ADDR_OF(fte_match_param, spec->match_value, outer_headers);
+	dmac_v = MLX5_ADDR_OF(fte_match_param, headers_v, outer_headers.dmac_47_16);
+	ether_addr_copy(dmac_v, ndev->config.mac);
+
 	flow_act.action = MLX5_FLOW_CONTEXT_ACTION_FWD_DEST | MLX5_FLOW_CONTEXT_ACTION_COUNT;
 	dest[0].type = MLX5_FLOW_DESTINATION_TYPE_TIR;
 	dest[0].tir_num = ndev->res.tirn;
 	dest[1].type = MLX5_FLOW_DESTINATION_TYPE_COUNTER;
 	dest[1].counter_id = mlx5_fc_id(ndev->rx_counter);
-	ndev->rx_rule = mlx5_add_flow_rules(ndev->rxft, NULL, &flow_act, dest, 2);
-	if (IS_ERR(ndev->rx_rule)) {
-		err = PTR_ERR(ndev->rx_rule);
-		ndev->rx_rule = NULL;
-		goto err_rule;
+	ndev->rx_rule_ucast = mlx5_add_flow_rules(ndev->rxft, spec, &flow_act, dest, 2);
+
+	if (IS_ERR(ndev->rx_rule_ucast)) {
+		err = PTR_ERR(ndev->rx_rule_ucast);
+		ndev->rx_rule_ucast = NULL;
+		goto err_rule_ucast;
+	}
+
+	memset(dmac_c, 0, ETH_ALEN);
+	memset(dmac_v, 0, ETH_ALEN);
+	dmac_c[0] = 1;
+	dmac_v[0] = 1;
+	flow_act.action = MLX5_FLOW_CONTEXT_ACTION_FWD_DEST;
+	ndev->rx_rule_mcast = mlx5_add_flow_rules(ndev->rxft, spec, &flow_act, dest, 1);
+	if (IS_ERR(ndev->rx_rule_mcast)) {
+		err = PTR_ERR(ndev->rx_rule_mcast);
+		ndev->rx_rule_mcast = NULL;
+		goto err_rule_mcast;
 	}
 
+	kvfree(spec);
 	return 0;
 
-err_rule:
+err_rule_mcast:
+	mlx5_del_flow_rules(ndev->rx_rule_ucast);
+	ndev->rx_rule_ucast = NULL;
+err_rule_ucast:
 	mlx5_fc_destroy(ndev->mvdev.mdev, ndev->rx_counter);
 err_fc:
 	mlx5_destroy_flow_table(ndev->rxft);
+err_ns:
+	kvfree(spec);
 	return err;
 }
 
 static void remove_fwd_to_tir(struct mlx5_vdpa_net *ndev)
 {
-	if (!ndev->rx_rule)
+	if (!ndev->rx_rule_ucast)
 		return;
 
-	mlx5_del_flow_rules(ndev->rx_rule);
+	mlx5_del_flow_rules(ndev->rx_rule_mcast);
+	ndev->rx_rule_mcast = NULL;
+	mlx5_del_flow_rules(ndev->rx_rule_ucast);
+	ndev->rx_rule_ucast = NULL;
 	mlx5_fc_destroy(ndev->mvdev.mdev, ndev->rx_counter);
 	mlx5_destroy_flow_table(ndev->rxft);
-
-	ndev->rx_rule = NULL;
 }
 
 static void mlx5_vdpa_kick_vq(struct vdpa_device *vdev, u16 idx)
-- 
2.26.2

^ permalink raw reply related	[flat|nested] 62+ messages in thread

* Re: [PATCH linux-next v3 1/6] vdpa: Introduce and use vdpa device get, set config helpers
  2021-06-16 19:11 ` [PATCH linux-next v3 1/6] vdpa: Introduce and use vdpa device get, set config helpers Parav Pandit
@ 2021-06-22  7:08   ` Jason Wang
  0 siblings, 0 replies; 62+ messages in thread
From: Jason Wang @ 2021-06-22  7:08 UTC (permalink / raw)
  To: Parav Pandit, virtualization; +Cc: elic, mst


On 2021/6/17 at 3:11 AM, Parav Pandit wrote:
> Subsequent patches enable getting and setting configuration either
> via the management device or via the vdpa device's config ops.
>
> This requires synchronization between multiple callers of the get and
> set config callbacks. Feature settings also influence the endianness of
> the configuration fields.
>
> To avoid exposing synchronization primitives to callers, introduce
> helpers for getting and setting the configuration and use them.
>
> Signed-off-by: Parav Pandit <parav@nvidia.com>
> Reviewed-by: Eli Cohen <elic@nvidia.com>


Acked-by: Jason Wang <jasowang@redhat.com>


> ---
> changelog:
> v1->v2
>   - new patch to have synchronized access to features and config space
> ---
>   drivers/vdpa/vdpa.c  | 36 ++++++++++++++++++++++++++++++++++++
>   drivers/vhost/vdpa.c |  3 +--
>   include/linux/vdpa.h | 18 ++++--------------
>   3 files changed, 41 insertions(+), 16 deletions(-)
>
> diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c
> index bb3f1d1f0422..bc44cdc34114 100644
> --- a/drivers/vdpa/vdpa.c
> +++ b/drivers/vdpa/vdpa.c
> @@ -284,6 +284,42 @@ void vdpa_mgmtdev_unregister(struct vdpa_mgmt_dev *mdev)
>   }
>   EXPORT_SYMBOL_GPL(vdpa_mgmtdev_unregister);
>   
> +/**
> + * vdpa_get_config - Get one or more device configuration fields.
> + * @vdev: vdpa device to operate on
> + * @offset: starting byte offset of the field
> + * @buf: buffer pointer to read to
> + * @len: length of the configuration fields in bytes
> + */
> +void vdpa_get_config(struct vdpa_device *vdev, unsigned int offset,
> +		     void *buf, unsigned int len)
> +{
> +	const struct vdpa_config_ops *ops = vdev->config;
> +
> +	/*
> +	 * Config accesses aren't supposed to trigger before features are set.
> +	 * If it does happen we assume a legacy guest.
> +	 */
> +	if (!vdev->features_valid)
> +		vdpa_set_features(vdev, 0);
> +	ops->get_config(vdev, offset, buf, len);
> +}
> +EXPORT_SYMBOL_GPL(vdpa_get_config);
> +
> +/**
> + * vdpa_set_config - Set one or more device configuration fields.
> + * @vdev: vdpa device to operate on
> + * @offset: starting byte offset of the field
> + * @buf: buffer pointer to read from
> + * @length: length of the configuration fields in bytes
> + */
> +void vdpa_set_config(struct vdpa_device *vdev, unsigned int offset,
> +		     void *buf, unsigned int length)
> +{
> +	vdev->config->set_config(vdev, offset, buf, length);
> +}
> +EXPORT_SYMBOL_GPL(vdpa_set_config);
> +
>   static bool mgmtdev_handle_match(const struct vdpa_mgmt_dev *mdev,
>   				 const char *busname, const char *devname)
>   {
> diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
> index fb41db3da611..908b4fb251b3 100644
> --- a/drivers/vhost/vdpa.c
> +++ b/drivers/vhost/vdpa.c
> @@ -231,7 +231,6 @@ static long vhost_vdpa_set_config(struct vhost_vdpa *v,
>   				  struct vhost_vdpa_config __user *c)
>   {
>   	struct vdpa_device *vdpa = v->vdpa;
> -	const struct vdpa_config_ops *ops = vdpa->config;
>   	struct vhost_vdpa_config config;
>   	unsigned long size = offsetof(struct vhost_vdpa_config, buf);
>   	u8 *buf;
> @@ -245,7 +244,7 @@ static long vhost_vdpa_set_config(struct vhost_vdpa *v,
>   	if (IS_ERR(buf))
>   		return PTR_ERR(buf);
>   
> -	ops->set_config(vdpa, config.off, buf, config.len);
> +	vdpa_set_config(vdpa, config.off, buf, config.len);
>   
>   	kvfree(buf);
>   	return 0;
> diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h
> index f311d227aa1b..993d99519452 100644
> --- a/include/linux/vdpa.h
> +++ b/include/linux/vdpa.h
> @@ -332,20 +332,10 @@ static inline int vdpa_set_features(struct vdpa_device *vdev, u64 features)
>           return ops->set_features(vdev, features);
>   }
>   
> -
> -static inline void vdpa_get_config(struct vdpa_device *vdev, unsigned offset,
> -				   void *buf, unsigned int len)
> -{
> -        const struct vdpa_config_ops *ops = vdev->config;
> -
> -	/*
> -	 * Config accesses aren't supposed to trigger before features are set.
> -	 * If it does happen we assume a legacy guest.
> -	 */
> -	if (!vdev->features_valid)
> -		vdpa_set_features(vdev, 0);
> -	ops->get_config(vdev, offset, buf, len);
> -}
> +void vdpa_get_config(struct vdpa_device *vdev, unsigned int offset,
> +		     void *buf, unsigned int len);
> +void vdpa_set_config(struct vdpa_device *dev, unsigned int offset,
> +		     void *buf, unsigned int length);
>   
>   /**
>    * struct vdpa_mgmtdev_ops - vdpa device ops

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH linux-next v3 2/6] vdpa: Introduce query of device config layout
  2021-06-16 19:11 ` [PATCH linux-next v3 2/6] vdpa: Introduce query of device config layout Parav Pandit
@ 2021-06-22  7:20   ` Jason Wang
  2021-06-22 14:03     ` Parav Pandit
  0 siblings, 1 reply; 62+ messages in thread
From: Jason Wang @ 2021-06-22  7:20 UTC (permalink / raw)
  To: Parav Pandit, virtualization; +Cc: elic, mst


On 2021/6/17 at 3:11 AM, Parav Pandit wrote:
> Introduce a command to query a device config layout.
>
> An example query of network vdpa device:
>
> $ vdpa dev add name bar mgmtdev vdpasim_net
>
> $ vdpa dev config show
> bar: mac 00:35:09:19:48:05 link up link_announce false mtu 1500 speed 0 duplex 0
>
> $ vdpa dev config show -jp
> {
>      "config": {
>          "bar": {
>              "mac": "00:35:09:19:48:05",
>              "link ": "up",
>              "link_announce ": false,
>              "mtu": 1500,
>              "speed": 0,
>              "duplex": 0
>          }
>      }
> }
>
> Signed-off-by: Parav Pandit <parav@nvidia.com>
> Signed-off-by: Eli Cohen <elic@nvidia.com>
> ---
>   drivers/vdpa/vdpa.c       | 212 ++++++++++++++++++++++++++++++++++++++
>   include/linux/vdpa.h      |   2 +
>   include/uapi/linux/vdpa.h |  11 ++
>   3 files changed, 225 insertions(+)
>
> diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c
> index bc44cdc34114..1295528244c3 100644
> --- a/drivers/vdpa/vdpa.c
> +++ b/drivers/vdpa/vdpa.c
> @@ -14,6 +14,8 @@
>   #include <uapi/linux/vdpa.h>
>   #include <net/genetlink.h>
>   #include <linux/mod_devicetable.h>
> +#include <linux/virtio_net.h>
> +#include <linux/virtio_ids.h>
>   
>   static LIST_HEAD(mdev_head);
>   /* A global mutex that protects vdpa management device and device level operations. */
> @@ -60,6 +62,7 @@ static void vdpa_release_dev(struct device *d)
>   		ops->free(vdev);
>   
>   	ida_simple_remove(&vdpa_index_ida, vdev->index);
> +	mutex_destroy(&vdev->cf_mutex);
>   	kfree(vdev);
>   }
>   
> @@ -114,6 +117,7 @@ struct vdpa_device *__vdpa_alloc_device(struct device *parent,
>   	if (err)
>   		goto err_name;
>   
> +	mutex_init(&vdev->cf_mutex);
>   	device_initialize(&vdev->dev);
>   
>   	return vdev;
> @@ -296,6 +300,7 @@ void vdpa_get_config(struct vdpa_device *vdev, unsigned int offset,
>   {
>   	const struct vdpa_config_ops *ops = vdev->config;
>   
> +	mutex_lock(&vdev->cf_mutex);
>   	/*
>   	 * Config accesses aren't supposed to trigger before features are set.
>   	 * If it does happen we assume a legacy guest.
> @@ -303,6 +308,7 @@ void vdpa_get_config(struct vdpa_device *vdev, unsigned int offset,
>   	if (!vdev->features_valid)
>   		vdpa_set_features(vdev, 0);
>   	ops->get_config(vdev, offset, buf, len);
> +	mutex_unlock(&vdev->cf_mutex);
>   }
>   EXPORT_SYMBOL_GPL(vdpa_get_config);
>   
> @@ -316,7 +322,9 @@ EXPORT_SYMBOL_GPL(vdpa_get_config);
>   void vdpa_set_config(struct vdpa_device *vdev, unsigned int offset,
>   		     void *buf, unsigned int length)
>   {
> +	mutex_lock(&vdev->cf_mutex);
>   	vdev->config->set_config(vdev, offset, buf, length);
> +	mutex_unlock(&vdev->cf_mutex);


I think it's better to use a separate patch to implement the 
synchronization in set_config()/get_config().


>   }
>   EXPORT_SYMBOL_GPL(vdpa_set_config);
>   
> @@ -643,6 +651,204 @@ static int vdpa_nl_cmd_dev_get_dumpit(struct sk_buff *msg, struct netlink_callba
>   	return msg->len;
>   }
>   
> +static int vdpa_dev_net_mq_config_fill(struct vdpa_device *vdev,
> +				       struct sk_buff *msg, u64 features,
> +				       const struct virtio_net_config *config)
> +{
> +	u16 val_u16;
> +
> +	if ((features & (1ULL << VIRTIO_NET_F_MQ)) == 0)
> +		return 0;
> +
> +	val_u16 = le16_to_cpu(config->max_virtqueue_pairs);
> +	return nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MAX_VQP, val_u16);
> +}
> +
> +static int vdpa_dev_net_rss_config_fill(struct vdpa_device *vdev,
> +					struct sk_buff *msg, u64 features,
> +					const struct virtio_net_config *config)
> +{
> +	u16 val_u16;
> +	u32 val_u32;
> +
> +	if ((features & (1ULL << VIRTIO_NET_F_RSS)) == 0)
> +		return 0;
> +
> +	if (nla_put_u8(msg, VDPA_ATTR_DEV_NET_CFG_RSS_MAX_KEY_LEN,
> +		       config->rss_max_key_size))
> +		return -EMSGSIZE;
> +
> +	val_u16 = le16_to_cpu(config->rss_max_indirection_table_length);
> +	if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_RSS_MAX_IT_LEN, val_u16))
> +		return -EMSGSIZE;
> +
> +	val_u32 = le32_to_cpu(config->supported_hash_types);
> +	if (nla_put_u32(msg, VDPA_ATTR_DEV_NET_CFG_RSS_HASH_TYPES, val_u32))
> +		return -EMSGSIZE;
> +	return 0;
> +}
> +
> +static int vdpa_dev_net_config_fill(struct vdpa_device *vdev, struct sk_buff *msg)
> +{
> +	struct virtio_net_config config = {};
> +	u64 features;
> +	u32 val_u32;
> +	u16 val_u16;
> +	int err;
> +
> +	vdpa_get_config(vdev, 0, &config, sizeof(config));
> +
> +	if (nla_put(msg, VDPA_ATTR_DEV_NET_CFG_MACADDR, sizeof(config.mac),
> +		    config.mac))
> +		return -EMSGSIZE;
> +
> +	val_u16 = le16_to_cpu(config.status);
> +	if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_STATUS, val_u16))
> +		return -EMSGSIZE;


Note that the status field only exists when VIRTIO_NET_F_STATUS is 
negotiated. If it is not, we need to assume the link is up.
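
A minimal sketch of that rule, written as a standalone userspace
function rather than kernel code. example_link_is_up is a hypothetical
helper, not something in this series; the bit values are the ones given
in the virtio-net specification, and status is assumed to be already
converted to CPU byte order.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values as defined by the virtio-net specification. */
#define VIRTIO_NET_F_STATUS   16
#define VIRTIO_NET_S_LINK_UP  1

/*
 * Hypothetical helper: derive the link state from the negotiated
 * features and the status field (already in CPU byte order).
 */
bool example_link_is_up(uint64_t features, uint16_t status)
{
	/* Without VIRTIO_NET_F_STATUS the field is absent: assume link up. */
	if (!(features & (1ULL << VIRTIO_NET_F_STATUS)))
		return true;
	return (status & VIRTIO_NET_S_LINK_UP) != 0;
}
```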


> +
> +	val_u16 = le16_to_cpu(config.mtu);
> +	if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MTU, val_u16))
> +		return -EMSGSIZE;
> +
> +	val_u32 = le32_to_cpu(config.speed);
> +	if (nla_put_u32(msg, VDPA_ATTR_DEV_NET_CFG_SPEED, val_u32))
> +		return -EMSGSIZE;
> +
> +	if (nla_put_u8(msg, VDPA_ATTR_DEV_NET_CFG_DUPLEX, config.duplex))
> +		return -EMSGSIZE;


The above two fields only exist when VIRTIO_NET_F_SPEED_DUPLEX is negotiated.


> +
> +	features = vdev->config->get_features(vdev);
> +
> +	err = vdpa_dev_net_mq_config_fill(vdev, msg, features, &config);
> +	if (err)
> +		return err;
> +	return vdpa_dev_net_rss_config_fill(vdev, msg, features, &config);
> +}
> +
> +static int
> +vdpa_dev_config_fill(struct vdpa_device *vdev, struct sk_buff *msg, u32 portid, u32 seq,
> +		     int flags, struct netlink_ext_ack *extack)
> +{
> +	u32 device_id;
> +	void *hdr;
> +	int err;
> +
> +	hdr = genlmsg_put(msg, portid, seq, &vdpa_nl_family, flags,
> +			  VDPA_CMD_DEV_CONFIG_GET);
> +	if (!hdr)
> +		return -EMSGSIZE;
> +
> +	if (nla_put_string(msg, VDPA_ATTR_DEV_NAME, dev_name(&vdev->dev))) {
> +		err = -EMSGSIZE;
> +		goto msg_err;
> +	}
> +
> +	device_id = vdev->config->get_device_id(vdev);
> +	if (nla_put_u32(msg, VDPA_ATTR_DEV_ID, device_id)) {
> +		err = -EMSGSIZE;
> +		goto msg_err;
> +	}
> +
> +	switch (device_id) {
> +	case VIRTIO_ID_NET:
> +		err = vdpa_dev_net_config_fill(vdev, msg);
> +		break;
> +	default:
> +		err = -EOPNOTSUPP;
> +		break;
> +	}
> +	if (err)
> +		goto msg_err;
> +
> +	genlmsg_end(msg, hdr);
> +	return 0;
> +
> +msg_err:
> +	genlmsg_cancel(msg, hdr);
> +	return err;
> +}
> +
> +static int vdpa_nl_cmd_dev_config_get_doit(struct sk_buff *skb, struct genl_info *info)
> +{
> +	struct vdpa_device *vdev;
> +	struct sk_buff *msg;
> +	const char *devname;
> +	struct device *dev;
> +	int err;
> +
> +	if (!info->attrs[VDPA_ATTR_DEV_NAME])
> +		return -EINVAL;
> +	devname = nla_data(info->attrs[VDPA_ATTR_DEV_NAME]);
> +	msg = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL);
> +	if (!msg)
> +		return -ENOMEM;
> +
> +	mutex_lock(&vdpa_dev_mutex);
> +	dev = bus_find_device(&vdpa_bus, NULL, devname, vdpa_name_match);
> +	if (!dev) {
> +		NL_SET_ERR_MSG_MOD(info->extack, "device not found");
> +		err = -ENODEV;
> +		goto dev_err;
> +	}
> +	vdev = container_of(dev, struct vdpa_device, dev);
> +	if (!vdev->mdev) {
> +		NL_SET_ERR_MSG_MOD(info->extack, "unmanaged vdpa device");
> +		err = -EINVAL;
> +		goto mdev_err;
> +	}
> +	err = vdpa_dev_config_fill(vdev, msg, info->snd_portid, info->snd_seq,
> +				   0, info->extack);
> +	if (!err)
> +		err = genlmsg_reply(msg, info);
> +
> +mdev_err:
> +	put_device(dev);
> +dev_err:
> +	mutex_unlock(&vdpa_dev_mutex);
> +	if (err)
> +		nlmsg_free(msg);
> +	return err;
> +}
> +
> +static int vdpa_dev_config_dump(struct device *dev, void *data)
> +{
> +	struct vdpa_device *vdev = container_of(dev, struct vdpa_device, dev);
> +	struct vdpa_dev_dump_info *info = data;
> +	int err;
> +
> +	if (!vdev->mdev)
> +		return 0;
> +	if (info->idx < info->start_idx) {
> +		info->idx++;
> +		return 0;
> +	}
> +	err = vdpa_dev_config_fill(vdev, info->msg, NETLINK_CB(info->cb->skb).portid,
> +				   info->cb->nlh->nlmsg_seq, NLM_F_MULTI,
> +				   info->cb->extack);
> +	if (err)
> +		return err;
> +
> +	info->idx++;
> +	return 0;
> +}
> +
> +static int
> +vdpa_nl_cmd_dev_config_get_dumpit(struct sk_buff *msg, struct netlink_callback *cb)
> +{
> +	struct vdpa_dev_dump_info info;
> +
> +	info.msg = msg;
> +	info.cb = cb;
> +	info.start_idx = cb->args[0];
> +	info.idx = 0;
> +
> +	mutex_lock(&vdpa_dev_mutex);
> +	bus_for_each_dev(&vdpa_bus, NULL, &info, vdpa_dev_config_dump);
> +	mutex_unlock(&vdpa_dev_mutex);
> +	cb->args[0] = info.idx;
> +	return msg->len;
> +}
> +
>   static const struct nla_policy vdpa_nl_policy[VDPA_ATTR_MAX + 1] = {
>   	[VDPA_ATTR_MGMTDEV_BUS_NAME] = { .type = NLA_NUL_STRING },
>   	[VDPA_ATTR_MGMTDEV_DEV_NAME] = { .type = NLA_STRING },
> @@ -674,6 +880,12 @@ static const struct genl_ops vdpa_nl_ops[] = {
>   		.doit = vdpa_nl_cmd_dev_get_doit,
>   		.dumpit = vdpa_nl_cmd_dev_get_dumpit,
>   	},
> +	{
> +		.cmd = VDPA_CMD_DEV_CONFIG_GET,
> +		.validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP,
> +		.doit = vdpa_nl_cmd_dev_config_get_doit,
> +		.dumpit = vdpa_nl_cmd_dev_config_get_dumpit,
> +	},
>   };
>   
>   static struct genl_family vdpa_nl_family __ro_after_init = {
> diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h
> index 993d99519452..bf104f9f461a 100644
> --- a/include/linux/vdpa.h
> +++ b/include/linux/vdpa.h
> @@ -42,6 +42,7 @@ struct vdpa_mgmt_dev;
>    * @dev: underlying device
>    * @dma_dev: the actual device that is performing DMA
>    * @config: the configuration ops for this device.
> + * @cf_mutex: Protects get and set access to features and configuration layout.
>    * @index: device index
>    * @features_valid: were features initialized? for legacy guests
>    * @nvqs: maximum number of supported virtqueues
> @@ -52,6 +53,7 @@ struct vdpa_device {
>   	struct device dev;
>   	struct device *dma_dev;
>   	const struct vdpa_config_ops *config;
> +	struct mutex cf_mutex; /* Protects get/set config and features */
>   	unsigned int index;
>   	bool features_valid;
>   	int nvqs;
> diff --git a/include/uapi/linux/vdpa.h b/include/uapi/linux/vdpa.h
> index 66a41e4ec163..5c31ecc3b956 100644
> --- a/include/uapi/linux/vdpa.h
> +++ b/include/uapi/linux/vdpa.h
> @@ -17,6 +17,7 @@ enum vdpa_command {
>   	VDPA_CMD_DEV_NEW,
>   	VDPA_CMD_DEV_DEL,
>   	VDPA_CMD_DEV_GET,		/* can dump */
> +	VDPA_CMD_DEV_CONFIG_GET,	/* can dump */
>   };
>   
>   enum vdpa_attr {
> @@ -33,6 +34,16 @@ enum vdpa_attr {
>   	VDPA_ATTR_DEV_MAX_VQS,			/* u32 */
>   	VDPA_ATTR_DEV_MAX_VQ_SIZE,		/* u16 */
>   
> +	VDPA_ATTR_DEV_NET_CFG_MACADDR,		/* binary */
> +	VDPA_ATTR_DEV_NET_STATUS,		/* u16 */
> +	VDPA_ATTR_DEV_NET_CFG_MAX_VQP,		/* u16 */
> +	VDPA_ATTR_DEV_NET_CFG_MTU,		/* u16 */
> +	VDPA_ATTR_DEV_NET_CFG_SPEED,		/* u32 */
> +	VDPA_ATTR_DEV_NET_CFG_DUPLEX,		/* u8 */
> +	VDPA_ATTR_DEV_NET_CFG_RSS_MAX_KEY_LEN,	/* u8 */
> +	VDPA_ATTR_DEV_NET_CFG_RSS_MAX_IT_LEN,	/* u16 */
> +	VDPA_ATTR_DEV_NET_CFG_RSS_HASH_TYPES,	/* u32 */


Would it be better to use a separate enum for net-specific attributes?

Another question (sorry if it has been asked before): can we simply 
return the config (binary) to userspace, so that userspace can use the 
existing uAPI such as virtio_net_config, plus the features, to interpret 
the config?
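
To make the question concrete, here is a hedged sketch of what the
userspace side could look like if the kernel returned the raw bytes. It
assumes a modern (little-endian) device and hard-codes the offsets of
the leading fields of the packed struct virtio_net_config;
example_parse_net_config and example_net_view are hypothetical names,
not an existing uAPI.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Feature bit numbers and status bit as defined by the virtio-net spec. */
#define VIRTIO_NET_F_MTU      3
#define VIRTIO_NET_F_STATUS   16
#define VIRTIO_NET_S_LINK_UP  1

/* Byte offsets of the leading fields of the packed virtio_net_config. */
#define EXAMPLE_OFF_MAC     0
#define EXAMPLE_OFF_STATUS  6
#define EXAMPLE_OFF_MTU     10
#define EXAMPLE_MIN_LEN     12

/* Hypothetical decoded view of the fields this sketch understands. */
struct example_net_view {
	uint8_t mac[6];
	bool link_up;
	bool has_mtu;
	uint16_t mtu;
};

static uint16_t example_get_le16(const uint8_t *p)
{
	return (uint16_t)(p[0] | (p[1] << 8));
}

/* Returns 0 on success, -1 if the blob is too short. */
int example_parse_net_config(const uint8_t *blob, size_t len,
			     uint64_t features, struct example_net_view *out)
{
	if (len < EXAMPLE_MIN_LEN)
		return -1;

	memset(out, 0, sizeof(*out));
	/* Copied unconditionally for brevity; strictly this is only
	 * meaningful when VIRTIO_NET_F_MAC was offered. */
	memcpy(out->mac, blob + EXAMPLE_OFF_MAC, sizeof(out->mac));

	/* No VIRTIO_NET_F_STATUS: the field is not valid, assume link up. */
	out->link_up = true;
	if (features & (1ULL << VIRTIO_NET_F_STATUS))
		out->link_up = example_get_le16(blob + EXAMPLE_OFF_STATUS) &
			       VIRTIO_NET_S_LINK_UP;

	if (features & (1ULL << VIRTIO_NET_F_MTU)) {
		out->has_mtu = true;
		out->mtu = example_get_le16(blob + EXAMPLE_OFF_MTU);
	}
	return 0;
}
```

The trade-off is that every userspace consumer then has to carry the
feature-dependent parsing logic itself.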

Thanks


> +
>   	/* new attributes must be added above here */
>   	VDPA_ATTR_MAX,
>   };

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH linux-next v3 3/6] vdpa: Enable user to set mac and mtu of vdpa device
  2021-06-16 19:11 ` [PATCH linux-next v3 3/6] vdpa: Enable user to set mac and mtu of vdpa device Parav Pandit
@ 2021-06-22  7:43   ` Jason Wang
  2021-06-22 14:09     ` Parav Pandit
  0 siblings, 1 reply; 62+ messages in thread
From: Jason Wang @ 2021-06-22  7:43 UTC (permalink / raw)
  To: Parav Pandit, virtualization; +Cc: elic, mst


On 2021/6/17 at 3:11 AM, Parav Pandit wrote:
> $ vdpa dev add name bar mgmtdev vdpasim_net
>
> $ vdpa dev config set bar mac 00:11:22:33:44:55 mtu 9000
>
> $ vdpa dev config show
> bar: mac 00:11:22:33:44:55 link up link_announce false mtu 9000 speed 0 duplex 0
>
> $ vdpa dev config show -jp
> {
>      "config": {
>          "bar": {
>              "mac": "00:11:22:33:44:55",
>              "link ": "up",
>              "link_announce ": false,
>              "mtu": 9000,
>              "speed": 0,
>              "duplex": 0
>          }
>      }
> }
>
> Signed-off-by: Parav Pandit <parav@nvidia.com>
> Reviewed-by: Eli Cohen <elic@nvidia.com>
> ---
> changelog:
> v2->v3:
>   - using new setup_config callback to setup device params via mgmt tool
>     to avoid mixing with existing set_config().
> ---
>   drivers/vdpa/vdpa.c       | 91 ++++++++++++++++++++++++++++++++++++++-
>   include/linux/vdpa.h      | 18 ++++++++
>   include/uapi/linux/vdpa.h |  1 +
>   3 files changed, 109 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c
> index 1295528244c3..40874bd92126 100644
> --- a/drivers/vdpa/vdpa.c
> +++ b/drivers/vdpa/vdpa.c
> @@ -14,7 +14,6 @@
>   #include <uapi/linux/vdpa.h>
>   #include <net/genetlink.h>
>   #include <linux/mod_devicetable.h>
> -#include <linux/virtio_net.h>
>   #include <linux/virtio_ids.h>
>   
>   static LIST_HEAD(mdev_head);
> @@ -849,10 +848,94 @@ vdpa_nl_cmd_dev_config_get_dumpit(struct sk_buff *msg, struct netlink_callback *
>   	return msg->len;
>   }
>   
> +static int vdpa_dev_net_config_set(struct vdpa_device *vdev,
> +				   struct sk_buff *skb, struct genl_info *info)
> +{
> +	struct nlattr **nl_attrs = info->attrs;
> +	struct vdpa_dev_set_config config = {};
> +	const u8 *macaddr;
> +	int err;
> +
> +	if (!netlink_capable(skb, CAP_NET_ADMIN))
> +		return -EPERM;


Interesting. I wonder how the capability check would be used for other 
types of devices (e.g. block).


> +
> +	if (!vdev->config->setup_config)
> +		return -EOPNOTSUPP;
> +
> +	if (nl_attrs[VDPA_ATTR_DEV_NET_CFG_MACADDR]) {
> +		macaddr = nla_data(nl_attrs[VDPA_ATTR_DEV_NET_CFG_MACADDR]);
> +		memcpy(config.net.mac, macaddr, sizeof(config.net.mac));
> +		config.net_mask.mac_valid = true;
> +	}
> +	if (nl_attrs[VDPA_ATTR_DEV_NET_CFG_MTU]) {
> +		config.net.mtu =
> +			nla_get_u16(nl_attrs[VDPA_ATTR_DEV_NET_CFG_MTU]);
> +		config.net_mask.mtu_valid = true;
> +	}


Instead of doing a memcpy and passing the whole config structure like 
this, I wonder if it's better to switch to:

vdev->config->setup_config(vdev, offsetof(struct virtio_net_config, 
mtu), &mtu, sizeof(mtu));

Then there's no need for the vdpa_dev_set_config structure, which would 
otherwise gradually become struct virtio_net_config.

The setup_config() can fail if the offset is not at the boundary of a 
specific attribute.
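
One possible reading of this proposal, sketched in userspace C. The
offset-based callback does not exist in the posted series; the
example_* names and the simplified config struct are assumptions made
for illustration, and byte-order conversion is deliberately left out.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-in for the leading fields of struct virtio_net_config. */
struct example_net_config {
	uint8_t mac[6];
	uint16_t status;
	uint16_t max_virtqueue_pairs;
	uint16_t mtu;
} __attribute__((packed));

/* A settable field, described by its exact offset and length. */
struct example_field {
	size_t off;
	size_t len;
};

static const struct example_field example_writable[] = {
	{ offsetof(struct example_net_config, mac), 6 },
	{ offsetof(struct example_net_config, mtu), 2 },
};

/*
 * Hypothetical offset-based setter: succeeds only when (offset, len)
 * exactly covers one writable field, otherwise returns -EINVAL.
 * Values are stored as given; endianness handling is omitted.
 */
int example_setup_config(struct example_net_config *cfg, size_t offset,
			 const void *buf, size_t len)
{
	size_t n = sizeof(example_writable) / sizeof(example_writable[0]);

	for (size_t i = 0; i < n; i++) {
		if (example_writable[i].off == offset &&
		    example_writable[i].len == len) {
			memcpy((uint8_t *)cfg + offset, buf, len);
			return 0;
		}
	}
	return -EINVAL;
}
```

With this shape, the set of settable fields is a table rather than a
growing structure of values and validity bits.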

Thanks


> +
> +	mutex_lock(&vdev->cf_mutex);
> +	err = vdev->config->setup_config(vdev, &config);
> +	mutex_unlock(&vdev->cf_mutex);
> +	return err;
> +}
> +
> +static int vdpa_dev_config_set(struct vdpa_device *vdev, struct sk_buff *skb,
> +			       struct genl_info *info)
> +{
> +	int err = -EOPNOTSUPP;
> +	u32 device_id;
> +
> +	if (!vdev->mdev)
> +		return -EOPNOTSUPP;
> +
> +	device_id = vdev->config->get_device_id(vdev);
> +	switch (device_id) {
> +	case VIRTIO_ID_NET:
> +		err = vdpa_dev_net_config_set(vdev, skb, info);
> +		break;
> +	default:
> +		break;
> +	}
> +	return err;
> +}
> +
> +static int vdpa_nl_cmd_dev_config_set_doit(struct sk_buff *skb, struct genl_info *info)
> +{
> +	struct vdpa_device *vdev;
> +	const char *devname;
> +	struct device *dev;
> +	int err;
> +
> +	if (!info->attrs[VDPA_ATTR_DEV_NAME])
> +		return -EINVAL;
> +	devname = nla_data(info->attrs[VDPA_ATTR_DEV_NAME]);
> +
> +	mutex_lock(&vdpa_dev_mutex);
> +	dev = bus_find_device(&vdpa_bus, NULL, devname, vdpa_name_match);
> +	if (!dev) {
> +		mutex_unlock(&vdpa_dev_mutex);
> +		NL_SET_ERR_MSG_MOD(info->extack, "device not found");
> +		return -ENODEV;
> +	}
> +	vdev = container_of(dev, struct vdpa_device, dev);
> +	if (!vdev->mdev) {
> +		mutex_unlock(&vdpa_dev_mutex);
> +		put_device(dev);
> +		return -EINVAL;
> +	}
> +	err = vdpa_dev_config_set(vdev, skb, info);
> +	put_device(dev);
> +	mutex_unlock(&vdpa_dev_mutex);
> +	return err;
> +}
> +
>   static const struct nla_policy vdpa_nl_policy[VDPA_ATTR_MAX + 1] = {
>   	[VDPA_ATTR_MGMTDEV_BUS_NAME] = { .type = NLA_NUL_STRING },
>   	[VDPA_ATTR_MGMTDEV_DEV_NAME] = { .type = NLA_STRING },
>   	[VDPA_ATTR_DEV_NAME] = { .type = NLA_STRING },
> +	[VDPA_ATTR_DEV_NET_CFG_MACADDR] = NLA_POLICY_ETH_ADDR,
> +	/* virtio spec 1.1 section 5.1.4.1 for valid MTU range */
> +	[VDPA_ATTR_DEV_NET_CFG_MTU] = NLA_POLICY_MIN(NLA_U16, 68),
>   };
>   
>   static const struct genl_ops vdpa_nl_ops[] = {
> @@ -886,6 +969,12 @@ static const struct genl_ops vdpa_nl_ops[] = {
>   		.doit = vdpa_nl_cmd_dev_config_get_doit,
>   		.dumpit = vdpa_nl_cmd_dev_config_get_dumpit,
>   	},
> +	{
> +		.cmd = VDPA_CMD_DEV_CONFIG_SET,
> +		.validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP,
> +		.doit = vdpa_nl_cmd_dev_config_set_doit,
> +		.flags = GENL_ADMIN_PERM,
> +	},
>   };
>   
>   static struct genl_family vdpa_nl_family __ro_after_init = {
> diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h
> index bf104f9f461a..9b7238d5310e 100644
> --- a/include/linux/vdpa.h
> +++ b/include/linux/vdpa.h
> @@ -6,6 +6,8 @@
>   #include <linux/device.h>
>   #include <linux/interrupt.h>
>   #include <linux/vhost_iotlb.h>
> +#include <linux/virtio_net.h>
> +#include <linux/if_ether.h>
>   
>   /**
>    * struct vdpa_calllback - vDPA callback definition.
> @@ -70,6 +72,17 @@ struct vdpa_iova_range {
>   	u64 last;
>   };
>   
> +struct vdpa_dev_set_config {
> +	struct {
> +		u8 mac[ETH_ALEN];
> +		u16 mtu;
> +	} net;
> +	struct {
> +		u8 mac_valid: 1;
> +		u8 mtu_valid: 1;
> +	} net_mask;
> +};
> +
>   /**
>    * struct vdpa_config_ops - operations for configuring a vDPA device.
>    * Note: vDPA device drivers are required to implement all of the
> @@ -169,6 +182,9 @@ struct vdpa_iova_range {
>    *				@buf: buffer used to write from
>    *				@len: the length to write to
>    *				configuration space
> + * @setup_config:		Setup configuration space
> + *				@vdev: vdpa device
> + *				@config: configuration to apply to device
>    * @get_generation:		Get device config generation (optional)
>    *				@vdev: vdpa device
>    *				Returns u32: device generation
> @@ -241,6 +257,8 @@ struct vdpa_config_ops {
>   			   void *buf, unsigned int len);
>   	void (*set_config)(struct vdpa_device *vdev, unsigned int offset,
>   			   const void *buf, unsigned int len);
> +	int (*setup_config)(struct vdpa_device *vdev,
> +			    const struct vdpa_dev_set_config *config);
>   	u32 (*get_generation)(struct vdpa_device *vdev);
>   	struct vdpa_iova_range (*get_iova_range)(struct vdpa_device *vdev);
>   
> diff --git a/include/uapi/linux/vdpa.h b/include/uapi/linux/vdpa.h
> index 5c31ecc3b956..ec349789b8d1 100644
> --- a/include/uapi/linux/vdpa.h
> +++ b/include/uapi/linux/vdpa.h
> @@ -18,6 +18,7 @@ enum vdpa_command {
>   	VDPA_CMD_DEV_DEL,
>   	VDPA_CMD_DEV_GET,		/* can dump */
>   	VDPA_CMD_DEV_CONFIG_GET,	/* can dump */
> +	VDPA_CMD_DEV_CONFIG_SET,
>   };
>   
>   enum vdpa_attr {

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 62+ messages in thread

* RE: [PATCH linux-next v3 2/6] vdpa: Introduce query of device config layout
  2021-06-22  7:20   ` Jason Wang
@ 2021-06-22 14:03     ` Parav Pandit
  2021-06-23  4:08       ` Jason Wang
  0 siblings, 1 reply; 62+ messages in thread
From: Parav Pandit @ 2021-06-22 14:03 UTC (permalink / raw)
  To: Jason Wang, virtualization; +Cc: Eli Cohen, mst



> From: Jason Wang <jasowang@redhat.com>
> Sent: Tuesday, June 22, 2021 12:50 PM
> 

[..]
> >   {
> >   	const struct vdpa_config_ops *ops = vdev->config;
> >
> > +	mutex_lock(&vdev->cf_mutex);
> >   	/*
> >   	 * Config accesses aren't supposed to trigger before features are set.
> >   	 * If it does happen we assume a legacy guest.
> > @@ -303,6 +308,7 @@ void vdpa_get_config(struct vdpa_device *vdev,
> unsigned int offset,
> >   	if (!vdev->features_valid)
> >   		vdpa_set_features(vdev, 0);
> >   	ops->get_config(vdev, offset, buf, len);
> > +	mutex_unlock(&vdev->cf_mutex);
> >   }
> >   EXPORT_SYMBOL_GPL(vdpa_get_config);
> >
> > @@ -316,7 +322,9 @@ EXPORT_SYMBOL_GPL(vdpa_get_config);
> >   void vdpa_set_config(struct vdpa_device *vdev, unsigned int offset,
> >   		     void *buf, unsigned int length)
> >   {
> > +	mutex_lock(&vdev->cf_mutex);
> >   	vdev->config->set_config(vdev, offset, buf, length);
> > +	mutex_unlock(&vdev->cf_mutex);
> 
> 
> I think it's better to use a separate patch to implement the synchronization in
> set_config()/get_config()
> 
Ok. I will split.

> 
> >   }
> >   EXPORT_SYMBOL_GPL(vdpa_set_config);
> >
> > @@ -643,6 +651,204 @@ static int vdpa_nl_cmd_dev_get_dumpit(struct
> sk_buff *msg, struct netlink_callba
> >   	return msg->len;
> >   }
> >
> > +static int vdpa_dev_net_mq_config_fill(struct vdpa_device *vdev,
> > +				       struct sk_buff *msg, u64 features,
> > +				       const struct virtio_net_config *config)
> > +{
> > +	u16 val_u16;
> > +
> > +	if ((features & (1ULL << VIRTIO_NET_F_MQ)) == 0)
> > +		return 0;
> > +
> > +	val_u16 = le16_to_cpu(config->max_virtqueue_pairs);
> > +	return nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MAX_VQP,
> val_u16);
> > +}
> > +
> > +static int vdpa_dev_net_rss_config_fill(struct vdpa_device *vdev,
> > +					struct sk_buff *msg, u64 features,
> > +					const struct virtio_net_config
> *config)
> > +{
> > +	u16 val_u16;
> > +	u32 val_u32;
> > +
> > +	if ((features & (1ULL << VIRTIO_NET_F_RSS)) == 0)
> > +		return 0;
> > +
> > +	if (nla_put_u8(msg,
> VDPA_ATTR_DEV_NET_CFG_RSS_MAX_KEY_LEN,
> > +		       config->rss_max_key_size))
> > +		return -EMSGSIZE;
> > +
> > +	val_u16 = le16_to_cpu(config->rss_max_indirection_table_length);
> > +	if (nla_put_u16(msg,
> VDPA_ATTR_DEV_NET_CFG_RSS_MAX_IT_LEN, val_u16))
> > +		return -EMSGSIZE;
> > +
> > +	val_u32 = le32_to_cpu(config->supported_hash_types);
> > +	if (nla_put_u32(msg, VDPA_ATTR_DEV_NET_CFG_RSS_HASH_TYPES,
> val_u32))
> > +		return -EMSGSIZE;
> > +	return 0;
> > +}
> > +
> > +static int vdpa_dev_net_config_fill(struct vdpa_device *vdev, struct
> sk_buff *msg)
> > +{
> > +	struct virtio_net_config config = {};
> > +	u64 features;
> > +	u32 val_u32;
> > +	u16 val_u16;
> > +	int err;
> > +
> > +	vdpa_get_config(vdev, 0, &config, sizeof(config));
> > +
> > +	if (nla_put(msg, VDPA_ATTR_DEV_NET_CFG_MACADDR,
> sizeof(config.mac),
> > +		    config.mac))
> > +		return -EMSGSIZE;
> > +
> > +	val_u16 = le16_to_cpu(config.status);
> > +	if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_STATUS, val_u16))
> > +		return -EMSGSIZE;
> 
> 
> Note that the status field only exists when VIRTIO_NET_F_STATUS is
> negotiated. And if not, we need to assume the link is up.
> 
Ok. will change.

> 
> > +
> > +	val_u16 = le16_to_cpu(config.mtu);
> > +	if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MTU, val_u16))
> > +		return -EMSGSIZE;
> > +
> > +	val_u32 = le32_to_cpu(config.speed);
> > +	if (nla_put_u32(msg, VDPA_ATTR_DEV_NET_CFG_SPEED, val_u32))
> > +		return -EMSGSIZE;
> > +
> > +	if (nla_put_u8(msg, VDPA_ATTR_DEV_NET_CFG_DUPLEX,
> config.duplex))
> > +		return -EMSGSIZE;
> 
> 
> The above two only exist when VIRTIO_NET_F_SPEED_DUPLEX is
> negotiated.
> 
I think I missed fixing this. You or Michael mentioned this in v2.
Will fix.
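
A rough sketch of the feature gating asked for in the two review comments above: report the link as up when VIRTIO_NET_F_STATUS is absent, and report speed/duplex only when VIRTIO_NET_F_SPEED_DUPLEX is present. The feature bit numbers and VIRTIO_NET_S_LINK_UP are the virtio-net values; struct demo_net_view and the demo_* helpers are hypothetical, and the values are assumed to be already converted to CPU byte order.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Feature bit numbers and link-up flag as defined for virtio-net. */
#define VIRTIO_NET_F_STATUS        16
#define VIRTIO_NET_F_SPEED_DUPLEX  63
#define VIRTIO_NET_S_LINK_UP       1

/* Hypothetical "what to report" view of the config space. */
struct demo_net_view {
	bool     link_up;
	bool     has_speed_duplex;
	uint32_t speed;
	uint8_t  duplex;
};

static bool demo_has_feature(uint64_t features, unsigned int bit)
{
	assert(bit < 64);
	return (features & (1ULL << bit)) != 0;
}

/* Inputs are assumed to be in CPU byte order already. */
static struct demo_net_view demo_net_view_fill(uint64_t features,
					       uint16_t status,
					       uint32_t speed, uint8_t duplex)
{
	/* Without VIRTIO_NET_F_STATUS the link is assumed to be up. */
	struct demo_net_view v = { .link_up = true };

	if (demo_has_feature(features, VIRTIO_NET_F_STATUS))
		v.link_up = (status & VIRTIO_NET_S_LINK_UP) != 0;

	/* speed/duplex are only meaningful with VIRTIO_NET_F_SPEED_DUPLEX. */
	if (demo_has_feature(features, VIRTIO_NET_F_SPEED_DUPLEX)) {
		v.has_speed_duplex = true;
		v.speed = speed;
		v.duplex = duplex;
	}
	return v;
}
```

In the netlink fill path this would translate to skipping the corresponding nla_put_*() calls when has_speed_duplex is false.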

> 
> > +
> > +	features = vdev->config->get_features(vdev);
> > +
> > +	err = vdpa_dev_net_mq_config_fill(vdev, msg, features, &config);
> > +	if (err)
> > +		return err;
> > +	return vdpa_dev_net_rss_config_fill(vdev, msg, features, &config);
> > +}
> > +
> > +static int
> > +vdpa_dev_config_fill(struct vdpa_device *vdev, struct sk_buff *msg, u32
> portid, u32 seq,
> > +		     int flags, struct netlink_ext_ack *extack)
> > +{
> > +	u32 device_id;
> > +	void *hdr;
> > +	int err;
> > +
> > +	hdr = genlmsg_put(msg, portid, seq, &vdpa_nl_family, flags,
> > +			  VDPA_CMD_DEV_CONFIG_GET);
> > +	if (!hdr)
> > +		return -EMSGSIZE;
> > +
> > +	if (nla_put_string(msg, VDPA_ATTR_DEV_NAME, dev_name(&vdev-
> >dev))) {
> > +		err = -EMSGSIZE;
> > +		goto msg_err;
> > +	}
> > +
> > +	device_id = vdev->config->get_device_id(vdev);
> > +	if (nla_put_u32(msg, VDPA_ATTR_DEV_ID, device_id)) {
> > +		err = -EMSGSIZE;
> > +		goto msg_err;
> > +	}
> > +
> > +	switch (device_id) {
> > +	case VIRTIO_ID_NET:
> > +		err = vdpa_dev_net_config_fill(vdev, msg);
> > +		break;
> > +	default:
> > +		err = -EOPNOTSUPP;
> > +		break;
> > +	}
> > +	if (err)
> > +		goto msg_err;
> > +
> > +	genlmsg_end(msg, hdr);
> > +	return 0;
> > +
> > +msg_err:
> > +	genlmsg_cancel(msg, hdr);
> > +	return err;
> > +}
> > +
> > +static int vdpa_nl_cmd_dev_config_get_doit(struct sk_buff *skb, struct
> genl_info *info)
> > +{
> > +	struct vdpa_device *vdev;
> > +	struct sk_buff *msg;
> > +	const char *devname;
> > +	struct device *dev;
> > +	int err;
> > +
> > +	if (!info->attrs[VDPA_ATTR_DEV_NAME])
> > +		return -EINVAL;
> > +	devname = nla_data(info->attrs[VDPA_ATTR_DEV_NAME]);
> > +	msg = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL);
> > +	if (!msg)
> > +		return -ENOMEM;
> > +
> > +	mutex_lock(&vdpa_dev_mutex);
> > +	dev = bus_find_device(&vdpa_bus, NULL, devname,
> vdpa_name_match);
> > +	if (!dev) {
> > +		NL_SET_ERR_MSG_MOD(info->extack, "device not found");
> > +		err = -ENODEV;
> > +		goto dev_err;
> > +	}
> > +	vdev = container_of(dev, struct vdpa_device, dev);
> > +	if (!vdev->mdev) {
> > +		NL_SET_ERR_MSG_MOD(info->extack, "unmanaged vdpa
> device");
> > +		err = -EINVAL;
> > +		goto mdev_err;
> > +	}
> > +	err = vdpa_dev_config_fill(vdev, msg, info->snd_portid, info-
> >snd_seq,
> > +				   0, info->extack);
> > +	if (!err)
> > +		err = genlmsg_reply(msg, info);
> > +
> > +mdev_err:
> > +	put_device(dev);
> > +dev_err:
> > +	mutex_unlock(&vdpa_dev_mutex);
> > +	if (err)
> > +		nlmsg_free(msg);
> > +	return err;
> > +}
> > +
> > +static int vdpa_dev_config_dump(struct device *dev, void *data)
> > +{
> > +	struct vdpa_device *vdev = container_of(dev, struct vdpa_device,
> dev);
> > +	struct vdpa_dev_dump_info *info = data;
> > +	int err;
> > +
> > +	if (!vdev->mdev)
> > +		return 0;
> > +	if (info->idx < info->start_idx) {
> > +		info->idx++;
> > +		return 0;
> > +	}
> > +	err = vdpa_dev_config_fill(vdev, info->msg, NETLINK_CB(info->cb-
> >skb).portid,
> > +				   info->cb->nlh->nlmsg_seq, NLM_F_MULTI,
> > +				   info->cb->extack);
> > +	if (err)
> > +		return err;
> > +
> > +	info->idx++;
> > +	return 0;
> > +}
> > +
> > +static int
> > +vdpa_nl_cmd_dev_config_get_dumpit(struct sk_buff *msg, struct
> netlink_callback *cb)
> > +{
> > +	struct vdpa_dev_dump_info info;
> > +
> > +	info.msg = msg;
> > +	info.cb = cb;
> > +	info.start_idx = cb->args[0];
> > +	info.idx = 0;
> > +
> > +	mutex_lock(&vdpa_dev_mutex);
> > +	bus_for_each_dev(&vdpa_bus, NULL, &info,
> vdpa_dev_config_dump);
> > +	mutex_unlock(&vdpa_dev_mutex);
> > +	cb->args[0] = info.idx;
> > +	return msg->len;
> > +}
> > +
> >   static const struct nla_policy vdpa_nl_policy[VDPA_ATTR_MAX + 1] = {
> >   	[VDPA_ATTR_MGMTDEV_BUS_NAME] = { .type = NLA_NUL_STRING
> },
> >   	[VDPA_ATTR_MGMTDEV_DEV_NAME] = { .type = NLA_STRING },
> > @@ -674,6 +880,12 @@ static const struct genl_ops vdpa_nl_ops[] = {
> >   		.doit = vdpa_nl_cmd_dev_get_doit,
> >   		.dumpit = vdpa_nl_cmd_dev_get_dumpit,
> >   	},
> > +	{
> > +		.cmd = VDPA_CMD_DEV_CONFIG_GET,
> > +		.validate = GENL_DONT_VALIDATE_STRICT |
> GENL_DONT_VALIDATE_DUMP,
> > +		.doit = vdpa_nl_cmd_dev_config_get_doit,
> > +		.dumpit = vdpa_nl_cmd_dev_config_get_dumpit,
> > +	},
> >   };
> >
> >   static struct genl_family vdpa_nl_family __ro_after_init = {
> > diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h
> > index 993d99519452..bf104f9f461a 100644
> > --- a/include/linux/vdpa.h
> > +++ b/include/linux/vdpa.h
> > @@ -42,6 +42,7 @@ struct vdpa_mgmt_dev;
> >    * @dev: underlying device
> >    * @dma_dev: the actual device that is performing DMA
> >    * @config: the configuration ops for this device.
> > + * @cf_mutex: Protects get and set access to features and configuration
> layout.
> >    * @index: device index
> >    * @features_valid: were features initialized? for legacy guests
> >    * @nvqs: maximum number of supported virtqueues
> > @@ -52,6 +53,7 @@ struct vdpa_device {
> >   	struct device dev;
> >   	struct device *dma_dev;
> >   	const struct vdpa_config_ops *config;
> > +	struct mutex cf_mutex; /* Protects get/set config and features */
> >   	unsigned int index;
> >   	bool features_valid;
> >   	int nvqs;
> > diff --git a/include/uapi/linux/vdpa.h b/include/uapi/linux/vdpa.h
> > index 66a41e4ec163..5c31ecc3b956 100644
> > --- a/include/uapi/linux/vdpa.h
> > +++ b/include/uapi/linux/vdpa.h
> > @@ -17,6 +17,7 @@ enum vdpa_command {
> >   	VDPA_CMD_DEV_NEW,
> >   	VDPA_CMD_DEV_DEL,
> >   	VDPA_CMD_DEV_GET,		/* can dump */
> > +	VDPA_CMD_DEV_CONFIG_GET,	/* can dump */
> >   };
> >
> >   enum vdpa_attr {
> > @@ -33,6 +34,16 @@ enum vdpa_attr {
> >   	VDPA_ATTR_DEV_MAX_VQS,			/* u32 */
> >   	VDPA_ATTR_DEV_MAX_VQ_SIZE,		/* u16 */
> >
> > +	VDPA_ATTR_DEV_NET_CFG_MACADDR,		/* binary */
> > +	VDPA_ATTR_DEV_NET_STATUS,		/* u16 */
> > +	VDPA_ATTR_DEV_NET_CFG_MAX_VQP,		/* u16 */
> > +	VDPA_ATTR_DEV_NET_CFG_MTU,		/* u16 */
> > +	VDPA_ATTR_DEV_NET_CFG_SPEED,		/* u32 */
> > +	VDPA_ATTR_DEV_NET_CFG_DUPLEX,		/* u8 */
> > +	VDPA_ATTR_DEV_NET_CFG_RSS_MAX_KEY_LEN,	/* u8 */
> > +	VDPA_ATTR_DEV_NET_CFG_RSS_MAX_IT_LEN,	/* u16 */
> > +	VDPA_ATTR_DEV_NET_CFG_RSS_HASH_TYPES,	/* u32 */
> 
> 
> Is it better to use a separate enum for net specific attributes?
> 
Yes, because they are only net specific.
I guess it is related to your below question.

> Another question (sorry if it has been asked before). Can we simply
> return the config (binary) to the userspace, then usespace can use the
> existing uAPI like virtio_net_config plus the feature to explain the config?
> 
We did discuss this in v2.
Returning a whole blob for user space to parse is usually not desired via netlink.
Returning individual fields gives full flexibility to return only the valid fields.
Otherwise we would also need to implement another bitmask to tell which fields of the struct are valid, and share it with user space.
Returning individual fields is the widely used approach.
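
To illustrate the per-field argument above, here is a toy type/length/value encoder, loosely modelled on what nla_put() does. It is a sketch only: the demo_* names and attribute ids are hypothetical (not the vdpa uAPI values), and unlike real netlink attributes the records are not padded to 4 bytes. A field that is not valid is simply never emitted, so the reader needs no extra bitmask.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical attribute ids for this sketch only. */
enum { DEMO_ATTR_MAC = 1, DEMO_ATTR_MTU = 2 };

struct demo_msg {
	uint8_t buf[64];
	size_t  len;
};

/* Append one type/length/value record; fails with -EMSGSIZE when full. */
static int demo_put(struct demo_msg *msg, uint16_t type,
		    const void *data, uint16_t len)
{
	assert(msg && data);
	if (msg->len + 4 + len > sizeof(msg->buf))
		return -EMSGSIZE;
	memcpy(msg->buf + msg->len, &type, 2);
	memcpy(msg->buf + msg->len + 2, &len, 2);
	memcpy(msg->buf + msg->len + 4, data, len);
	msg->len += 4 + (size_t)len;
	return 0;
}

/* Return the value of the first record of the given type, or NULL. */
static const uint8_t *demo_find(const struct demo_msg *msg, uint16_t type,
				uint16_t *len)
{
	size_t pos = 0;

	assert(msg && len);
	while (pos + 4 <= msg->len) {
		uint16_t t, l;

		memcpy(&t, msg->buf + pos, 2);
		memcpy(&l, msg->buf + pos + 2, 2);
		if (t == type) {
			*len = l;
			return msg->buf + pos + 4;
		}
		pos += 4 + (size_t)l;
	}
	return NULL;
}
```

A sender that only has a valid MTU calls demo_put() once with DEMO_ATTR_MTU; the receiver's lookup for DEMO_ATTR_MAC then returns NULL, which is how "field not present" is expressed.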

^ permalink raw reply	[flat|nested] 62+ messages in thread

* RE: [PATCH linux-next v3 3/6] vdpa: Enable user to set mac and mtu of vdpa device
  2021-06-22  7:43   ` Jason Wang
@ 2021-06-22 14:09     ` Parav Pandit
  0 siblings, 0 replies; 62+ messages in thread
From: Parav Pandit @ 2021-06-22 14:09 UTC (permalink / raw)
  To: Jason Wang, virtualization; +Cc: Eli Cohen, mst



> From: Jason Wang <jasowang@redhat.com>
> Sent: Tuesday, June 22, 2021 1:13 PM
> 
> On 2021/6/17 3:11 AM, Parav Pandit wrote:
> > $ vdpa dev add name bar mgmtdev vdpasim_net
> >
> > $ vdpa dev config set bar mac 00:11:22:33:44:55 mtu 9000
> >
> > $ vdpa dev config show
> > bar: mac 00:11:22:33:44:55 link up link_announce false mtu 9000 speed
> > 0 duplex 0
> >
> > $ vdpa dev config show -jp
> > {
> >      "config": {
> >          "bar": {
> >              "mac": "00:11:22:33:44:55",
> >              "link ": "up",
> >              "link_announce ": false,
> >              "mtu": 9000,
> >              "speed": 0,
> >              "duplex": 0
> >          }
> >      }
> > }
> >
> > Signed-off-by: Parav Pandit <parav@nvidia.com>
> > Reviewed-by: Eli Cohen <elic@nvidia.com>
> > ---
> > changelog:
> > v2->v3:
> >   - using new setup_config callback to setup device params via mgmt tool
> >     to avoid mixing with existing set_config().
> > ---
> >   drivers/vdpa/vdpa.c       | 91
> ++++++++++++++++++++++++++++++++++++++-
> >   include/linux/vdpa.h      | 18 ++++++++
> >   include/uapi/linux/vdpa.h |  1 +
> >   3 files changed, 109 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c index
> > 1295528244c3..40874bd92126 100644
> > --- a/drivers/vdpa/vdpa.c
> > +++ b/drivers/vdpa/vdpa.c
> > @@ -14,7 +14,6 @@
> >   #include <uapi/linux/vdpa.h>
> >   #include <net/genetlink.h>
> >   #include <linux/mod_devicetable.h>
> > -#include <linux/virtio_net.h>
> >   #include <linux/virtio_ids.h>
> >
> >   static LIST_HEAD(mdev_head);
> > @@ -849,10 +848,94 @@ vdpa_nl_cmd_dev_config_get_dumpit(struct
> sk_buff *msg, struct netlink_callback *
> >   	return msg->len;
> >   }
> >
> > +static int vdpa_dev_net_config_set(struct vdpa_device *vdev,
> > +				   struct sk_buff *skb, struct genl_info *info) {
> > +	struct nlattr **nl_attrs = info->attrs;
> > +	struct vdpa_dev_set_config config = {};
> > +	const u8 *macaddr;
> > +	int err;
> > +
> > +	if (!netlink_capable(skb, CAP_NET_ADMIN))
> > +		return -EPERM;
> 
> 
> Interesting, I wonder how cap would be used for other type of devices (e.g
> block).
> 
> 
> > +
> > +	if (!vdev->config->setup_config)
> > +		return -EOPNOTSUPP;
> > +
> > +	if (nl_attrs[VDPA_ATTR_DEV_NET_CFG_MACADDR]) {
> > +		macaddr =
> nla_data(nl_attrs[VDPA_ATTR_DEV_NET_CFG_MACADDR]);
> > +		memcpy(config.net.mac, macaddr, sizeof(config.net.mac));
> > +		config.net_mask.mac_valid = true;
> > +	}
> > +	if (nl_attrs[VDPA_ATTR_DEV_NET_CFG_MTU]) {
> > +		config.net.mtu =
> > +
> 	nla_get_u16(nl_attrs[VDPA_ATTR_DEV_NET_CFG_MTU]);
> > +		config.net_mask.mtu_valid = true;
> > +	}
> 
> 
> Instead of doing memcpy and pass the whole config structure like this. I
> wonder if it's better to switch to use:
> 
> vdev->config->setup_config(vdev, offsetof(struct virtio_net_config,
> mtu), &mtu, sizeof(mtu));
> 
Well, we need a way to tell that the caller is the management tool and not the vhost path.

Instead of passing some flag identifying the caller to set_config(), an explicitly defined callback serves better.

And secondly, we need to return the error status, but the set_config() callback returns void. This is the minor reason.
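
A small userspace sketch of the structure-plus-mask callback defended above, showing the two properties mentioned: it is a separate entry point for the management path, and it can return an error. struct demo_set_config mirrors struct vdpa_dev_set_config from the patch; struct demo_dev, its mtu_supported flag and demo_dev_setup_config() are hypothetical stand-ins for a vendor driver.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

#define DEMO_ETH_ALEN 6

/* Userspace mirror of struct vdpa_dev_set_config from the patch. */
struct demo_set_config {
	struct {
		uint8_t  mac[DEMO_ETH_ALEN];
		uint16_t mtu;
	} net;
	struct {
		uint8_t mac_valid : 1;
		uint8_t mtu_valid : 1;
	} net_mask;
};

/* Hypothetical device state kept by a vendor driver. */
struct demo_dev {
	uint8_t  mac[DEMO_ETH_ALEN];
	uint16_t mtu;
	int      mtu_supported;
};

/*
 * Management-path callback: applies only the fields flagged in net_mask
 * and reports failure to the caller, which a void set_config() cannot do.
 */
static int demo_dev_setup_config(struct demo_dev *dev,
				 const struct demo_set_config *config)
{
	assert(dev && config);

	/* Validate first so that a rejected request changes nothing. */
	if (config->net_mask.mtu_valid && !dev->mtu_supported)
		return -EOPNOTSUPP;

	if (config->net_mask.mac_valid)
		memcpy(dev->mac, config->net.mac, DEMO_ETH_ALEN);
	if (config->net_mask.mtu_valid)
		dev->mtu = config->net.mtu;
	return 0;
}
```

Setting only the MTU leaves the MAC untouched, which matches the "only mac address or only mtu can also be set" behaviour described in the cover letter.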

> Then there's no need for the vdpa_dev_set_config structure which will
> became structure virtio_net_config gradually.
> 
> The setup_config() can fail if the offset is not at the boundary of a
> specific attribute.
> 

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH linux-next v3 2/6] vdpa: Introduce query of device config layout
  2021-06-22 14:03     ` Parav Pandit
@ 2021-06-23  4:08       ` Jason Wang
  2021-06-23  4:22         ` Parav Pandit
  0 siblings, 1 reply; 62+ messages in thread
From: Jason Wang @ 2021-06-23  4:08 UTC (permalink / raw)
  To: Parav Pandit, virtualization; +Cc: Eli Cohen, mst


On 2021/6/22 10:03 PM, Parav Pandit wrote:
>> Is it better to use a separate enum for net specific attributes?
>>
> Yes, because they are only net specific.
> I guess it is related to your below question.
>
>> Another question (sorry if it has been asked before). Can we simply
>> return the config (binary) to the userspace, then usespace can use the
>> existing uAPI like virtio_net_config plus the feature to explain the config?
>>
> We did discuss in v2.
> Usually returning the whole blob and parsing is not desired via netlink.
> Returning individual fields give the full flexibility to return only the valid fields.
> Otherwise we need to implement another bitmask too to tell which fields from the struct are valid and share with user space.
> Returning individual fields is the widely used approach.


The main concerns are:

1) The blob will be self-contained if it is passed along with the negotiated
features, so we don't need a bitmask.
2) Using individual fields means the config fields of every virtio device
must be duplicated.

And actually, it's not a binary blob, since the uAPI clearly defines the
format (e.g. struct virtio_net_config). Can we find a way to use that?
E.g. introduce a device/net specific command and pass the blob along with
its length and the negotiated features.

Thanks


>


^ permalink raw reply	[flat|nested] 62+ messages in thread

* RE: [PATCH linux-next v3 2/6] vdpa: Introduce query of device config layout
  2021-06-23  4:08       ` Jason Wang
@ 2021-06-23  4:22         ` Parav Pandit
  2021-06-24  5:43           ` Jason Wang
  0 siblings, 1 reply; 62+ messages in thread
From: Parav Pandit @ 2021-06-23  4:22 UTC (permalink / raw)
  To: Jason Wang, virtualization; +Cc: Eli Cohen, mst



> From: Jason Wang <jasowang@redhat.com>
> Sent: Wednesday, June 23, 2021 9:39 AM
> 
> On 2021/6/22 10:03 PM, Parav Pandit wrote:
> >> Is it better to use a separate enum for net specific attributes?
> >>
> > Yes, because they are only net specific.
> > I guess it is related to your below question.
> >
> >> Another question (sorry if it has been asked before). Can we simply
> >> return the config (binary) to the userspace, then usespace can use
> >> the existing uAPI like virtio_net_config plus the feature to explain the
> config?
> >>
> > We did discuss in v2.
> > Usually returning the whole blob and parsing is not desired via netlink.
> > Returning individual fields give the full flexibility to return only the valid
> fields.
> > Otherwise we need to implement another bitmask too to tell which fields
> from the struct are valid and share with user space.
> > Returning individual fields is the widely used approach.
> 
> 
> The main concerns are:
> 
> 1) The blob will be self contained if it was passed with the negotiated
> features, so we don't need bitmask.
Which fields of the struct are valid is told by additional fields.
> 2) Using individual fields means it must duplicate the config fields of every
> virtio devices
> 
Mostly no. If there are config fields common to two device types, they would be named
VDPA_ATTR_DEV_CFG_*
Net specific ones would be
VDPA_ATTR_DEV_NET_CFG_*
and block specific ones would be
VDPA_ATTR_DEV_BLK_CFG_*

> And actually, it's not the binary blob since uapi clearly define the format (e.g
> struct virtio_net_config), can we find a way to use that? E.g introduce
> device/net specific command and passing the blob with length and
> negotiated features.
The length may change in the future, most likely grow, and parsing based on length is not a clean approach.
Parsing the fields also requires knowledge of the features, so the application needs to make multiple netlink calls to parse the config space.
I prefer to follow the rest of the kernel's style and return self-contained individual fields.

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH linux-next v3 2/6] vdpa: Introduce query of device config layout
  2021-06-23  4:22         ` Parav Pandit
@ 2021-06-24  5:43           ` Jason Wang
  2021-06-24  6:29             ` Parav Pandit
  0 siblings, 1 reply; 62+ messages in thread
From: Jason Wang @ 2021-06-24  5:43 UTC (permalink / raw)
  To: Parav Pandit, virtualization; +Cc: Eli Cohen, mst


On 2021/6/23 12:22 PM, Parav Pandit wrote:
>
>> From: Jason Wang <jasowang@redhat.com>
>> Sent: Wednesday, June 23, 2021 9:39 AM
>>
>> On 2021/6/22 10:03 PM, Parav Pandit wrote:
>>>> Is it better to use a separate enum for net specific attributes?
>>>>
>>> Yes, because they are only net specific.
>>> I guess it is related to your below question.
>>>
>>>> Another question (sorry if it has been asked before). Can we simply
>>>> return the config (binary) to the userspace, then usespace can use
>>>> the existing uAPI like virtio_net_config plus the feature to explain the
>> config?
>>> We did discuss in v2.
>>> Usually returning the whole blob and parsing is not desired via netlink.
>>> Returning individual fields give the full flexibility to return only the valid
>> fields.
>>> Otherwise we need to implement another bitmask too to tell which fields
>> from the struct are valid and share with user space.
>>> Returning individual fields is the widely used approach.
>>
>> The main concerns are:
>>
>> 1) The blob will be self contained if it was passed with the negotiated
>> features, so we don't need bitmask.
> Which fields of the struct are valid is told by additional fields.
>> 2) Using individual fields means it must duplicate the config fields of every
>> virtio devices
>>
> Mostly no. if there are common config fields across two device types, they would be named as
> VDPA_ATTR_DEV_CFG_*
> Net specific will be,
> VDPA_ATTR_DEV_NET_CFG_*
> Block specific, will be,
> VDPA_ATTR_DEV_BLK_CFG_*


I meant it looks like VDPA_ATTR_DEV_NET will duplicate all the fields of:

struct virtio_net_config;

And VDPA_ATTR_DEV_BLOCK will duplicate all the fields of

struct virtio_blk_config; which has ~21 fields.

And we have plenty of other types of virtio devices.

Consider that we have a mature set of virtio specific uAPI for the config space.
It would be a burden if we needed an unnecessary netlink translation layer
in the middle:

[vDPA parent (virtio_net_config)] <-> [netlink (VDPA_ATTR_DEV_NET_XX)] 
<-> [userspace (VDPA_ATTR_DEV_NET_XX)] <-> [ user (virtio_net_config)]

If we make netlink simply a transport, it would be much easier. And we
would have the chance to unify the logic of build_config() and set_config()
in the driver.


>
>> And actually, it's not the binary blob since uapi clearly define the format (e.g
>> struct virtio_net_config), can we find a way to use that? E.g introduce
>> device/net specific command and passing the blob with length and
>> negotiated features.
> Length may change in future, mostly expand. And parsing based on length is not such a clean way.


Length is only for validity checking. The config is self-contained given:

1) device id
2) features


> Parsing fields require knowledge of features as well and application needs to make multiple netlink calls to parse the config space.


I think we don't care about the performance in this case. It's about 
three netlink calls:

1) get config
2) get device id
3) get features

For build config, it's only one

1) build config


> I prefer to follow rest of the kernel style to return self contained invidividual fields.


But I have seen a lot of kernel code that chooses to use e.g. nla_put()
directly with a module specific structure.

Thanks



^ permalink raw reply	[flat|nested] 62+ messages in thread

* RE: [PATCH linux-next v3 2/6] vdpa: Introduce query of device config layout
  2021-06-24  5:43           ` Jason Wang
@ 2021-06-24  6:29             ` Parav Pandit
  2021-06-24  7:05               ` Jason Wang
  0 siblings, 1 reply; 62+ messages in thread
From: Parav Pandit @ 2021-06-24  6:29 UTC (permalink / raw)
  To: Jason Wang, virtualization; +Cc: Eli Cohen, mst



> From: Jason Wang <jasowang@redhat.com>
> Sent: Thursday, June 24, 2021 11:13 AM
> 
> On 2021/6/23 12:22 PM, Parav Pandit wrote:
> >
> >> From: Jason Wang <jasowang@redhat.com>
> >> Sent: Wednesday, June 23, 2021 9:39 AM
> >>
> >> On 2021/6/22 10:03 PM, Parav Pandit wrote:
> >>>> Is it better to use a separate enum for net specific attributes?
> >>>>
> >>> Yes, because they are only net specific.
> >>> I guess it is related to your below question.
> >>>
> >>>> Another question (sorry if it has been asked before). Can we simply
> >>>> return the config (binary) to the userspace, then usespace can use
> >>>> the existing uAPI like virtio_net_config plus the feature to
> >>>> explain the
> >> config?
> >>> We did discuss in v2.
> >>> Usually returning the whole blob and parsing is not desired via netlink.
> >>> Returning individual fields give the full flexibility to return only
> >>> the valid
> >> fields.
> >>> Otherwise we need to implement another bitmask too to tell which
> >>> fields
> >> from the struct are valid and share with user space.
> >>> Returning individual fields is the widely used approach.
> >>
> >> The main concerns are:
> >>
> >> 1) The blob will be self contained if it was passed with the
> >> negotiated features, so we don't need bitmask.
> > Which fields of the struct are valid is told by additional fields.
> >> 2) Using individual fields means it must duplicate the config fields
> >> of every virtio devices
> >>
> > Mostly no. if there are common config fields across two device types,
> > they would be named as
> > VDPA_ATTR_DEV_CFG_*
> > Net specific will be,
> > VDPA_ATTR_DEV_NET_CFG_*
> > Block specific, will be,
> > VDPA_ATTR_DEV_BLK_CFG_*
> 
> 
> I meant it looks like VDPA_ATTR_DEV_NET will duplicate all the fields of:
> 
> struct virtio_net_config;
> 
> And VDPA_ATTR_DEV_BLOCK will duplicate all the fields of
> 
> struct virtio_blk_config; which has ~21 fields.
> 
> And we had a plenty of other types of virtio devices.
> 
> Consider we had a mature set of virtio specific uAPI for config space.
> It would be a burden if we need an unnecessary translation layer of netlink in
> the middle:
> 
> [vDPA parent (virtio_net_config)] <-> [netlink (VDPA_ATTR_DEV_NET_XX)]
> <-> [userspace (VDPA_ATTR_DEV_NET_XX)] 

> <-> [ user (virtio_net_config)]
This translation is not there. We show the relevant net config fields individually as VDPA_ATTR_DEV_NET attributes.
It is not a binary dump, which is harder for users to parse and make use of.

It is only one level of translation, from virtio_net_config (kernel) to netlink vdpa fields.
It is similar to 'struct net_device' -> rtnl info fields.

> 
> If we make netlink simply a transport, it would be much easier. And we had
> the chance to unify the logic of build_config() and set_config() in the driver.
How? We would need a bitmask to tell which of the 21 fields to update and which not.
And that is further mixed with offset and length.

> 
> 
> >
> >> And actually, it's not the binary blob since uapi clearly define the
> >> format (e.g struct virtio_net_config), can we find a way to use that?
> >> E.g introduce device/net specific command and passing the blob with
> >> length and negotiated features.
> > Length may change in future, mostly expand. And parsing based on length
> is not such a clean way.
> 
> 
> Length is only for legal checking. The config is self contained with:
> 
Unlikely. When the structure size increases later, the parsing will change based on the length.
Because an older kernel, used with an older iproute2 tool, would return a shorter length.
So user space always has to deal with nasty parsing/typecasting based on the length.
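
A sketch of the length-dependent parsing described above, as user space would have to write it if the config were returned as a blob. struct demo_cfg_blob is a hypothetical layout in which "speed" was appended in a later version; the demo_* names are illustrative and not part of any uAPI.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical config layout that grew over time. */
struct demo_cfg_blob {
	uint8_t  mac[6];
	uint16_t mtu;
	uint32_t speed;	/* absent in blobs produced by an older kernel */
};

/* True when a blob of "len" bytes is long enough to contain "field". */
#define DEMO_FIELD_PRESENT(len, field) \
	((len) >= offsetof(struct demo_cfg_blob, field) + \
		  sizeof(((struct demo_cfg_blob *)0)->field))

/*
 * Every access to a later-added field needs a length check like this
 * one; -ENOENT signals "this blob is too short to carry the field".
 */
static int demo_blob_get_speed(const void *blob, size_t len, uint32_t *speed)
{
	assert(blob && speed);
	if (!DEMO_FIELD_PRESENT(len, speed))
		return -ENOENT;
	memcpy(speed,
	       (const uint8_t *)blob + offsetof(struct demo_cfg_blob, speed),
	       sizeof(*speed));
	return 0;
}
```

With per-field attributes the same situation shows up as a missing attribute, so no offset arithmetic or casting is needed on the receiving side.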

> 1) device id
> 2) features
> 
> 
> > Parsing fields require knowledge of features as well and application needs
> to make multiple netlink calls to parse the config space.
> 
> 
> I think we don't care about the performance in this case. It's about three
> netlink calls:
> 
It's not about performance. By the time the 1st call has been made, the features may already have been updated, leaving them out of sync with the config.

> 1) get config
> 2) get device id
> 3) get features
> 
This requires using the features from the 3rd netlink call's output to decode the output of the 1st netlink call,
which is a bit odd for netlink.
Other netlink users of nla_put() that send a whole structure probably don't need to do that.

> For build config, it's only one
> 
> 1) build config
> 
> 
> > I prefer to follow rest of the kernel style to return self contained
> invidividual fields.
> 
> 
> But I saw a lot of kernel codes choose to use e.g nla_put() directly with
> module specific structure.
> 
Those are probably self-contained structures that have not needed to expand.

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH linux-next v3 2/6] vdpa: Introduce query of device config layout
  2021-06-24  6:29             ` Parav Pandit
@ 2021-06-24  7:05               ` Jason Wang
  2021-06-24  7:59                 ` Parav Pandit
  0 siblings, 1 reply; 62+ messages in thread
From: Jason Wang @ 2021-06-24  7:05 UTC (permalink / raw)
  To: Parav Pandit, virtualization; +Cc: Eli Cohen, mst


On 2021/6/24 2:29 PM, Parav Pandit wrote:
>
>> From: Jason Wang <jasowang@redhat.com>
>> Sent: Thursday, June 24, 2021 11:13 AM
>>
>> On 2021/6/23 12:22 PM, Parav Pandit wrote:
>>>> From: Jason Wang <jasowang@redhat.com>
>>>> Sent: Wednesday, June 23, 2021 9:39 AM
>>>>
>>>> On 2021/6/22 10:03 PM, Parav Pandit wrote:
>>>>>> Is it better to use a separate enum for net specific attributes?
>>>>>>
>>>>> Yes, because they are only net specific.
>>>>> I guess it is related to your below question.
>>>>>
>>>>>> Another question (sorry if it has been asked before). Can we simply
>>>>>> return the config (binary) to the userspace, then usespace can use
>>>>>> the existing uAPI like virtio_net_config plus the feature to
>>>>>> explain the
>>>> config?
>>>>> We did discuss in v2.
>>>>> Usually returning the whole blob and parsing is not desired via netlink.
>>>>> Returning individual fields give the full flexibility to return only
>>>>> the valid
>>>> fields.
>>>>> Otherwise we need to implement another bitmask too to tell which
>>>>> fields
>>>> from the struct are valid and share with user space.
>>>>> Returning individual fields is the widely used approach.
>>>> The main concerns are:
>>>>
>>>> 1) The blob will be self contained if it was passed with the
>>>> negotiated features, so we don't need bitmask.
>>> Which fields of the struct are valid is told by additional fields.
>>>> 2) Using individual fields means it must duplicate the config fields
>>>> of every virtio devices
>>>>
>>> Mostly no. if there are common config fields across two device types,
>>> they would be named as
>>> VDPA_ATTR_DEV_CFG_*
>>> Net specific will be,
>>> VDPA_ATTR_DEV_NET_CFG_*
>>> Block specific, will be,
>>> VDPA_ATTR_DEV_BLK_CFG_*
>>
>> I meant it looks like VDPA_ATTR_DEV_NET will duplicate all the fields of:
>>
>> struct virtio_net_config;
>>
>> And VDPA_ATTR_DEV_BLOCK will duplicate all the fields of
>>
>> struct virtio_blk_config; which has ~21 fields.
>>
>> And we had a plenty of other types of virtio devices.
>>
>> Consider we had a mature set of virtio specific uAPI for config space.
>> It would be a burden if we need an unnecessary translation layer of netlink in
>> the middle:
>>
>> [vDPA parent (virtio_net_config)] <-> [netlink (VDPA_ATTR_DEV_NET_XX)]
>> <-> [userspace (VDPA_ATTR_DEV_NET_XX)]
>>> <-> [ user (virtio_net_config)]
> This translation is not there. We show relevant net config fields as VDPA_ATTR_DEV_NET individually.
> It is not a binary dump which is harder for users to parse and make any use of it.


This is done implicitly: the user needs to understand the semantics of 
virtio_net_config and map the individual fields to the vdpa tool 
sub-command.


>
> It is only one level of translation from virtio_net_config (kernel) -> netlink vdpa fields.
> It is similar to 'struct netdevice' -> rtnl info fields.


I think not. The problem is that netdevice is not part of the uAPI, but 
virtio_net_config is.


>
>> If we make netlink simply a transport, it would be much easier. And we had
>> the chance to unify the logic of build_config() and set_config() in the driver.
> How? We need bit mask to tell that out of 21 fields which fields to update and which not.
> And that is further mixed with offset and length.


So set_config() can be called from userspace, and so can build_config(). 
The only differences are:

1) they use different transports, ioctl vs netlink
2) build_config() is only expected to be called by the management tool

If qemu works well via the set_config ioctl, netlink should work as well.

Btw, what happens if the management tool tries to modify the mac of the 
vDPA device when the device is already used by the driver?


>
>>
>>>> And actually, it's not the binary blob since uapi clearly define the
>>>> format (e.g struct virtio_net_config), can we find a way to use that?
>>>> E.g introduce device/net specific command and passing the blob with
>>>> length and negotiated features.
>>> Length may change in future, mostly expand. And parsing based on length
>> is not such a clean way.
>>
>>
>> Length is only for legal checking. The config is self contained with:
>>
> Unlikely. When structure size increases later, the parsing will change based on the length.
> Because older kernel would return shorter length with older iproute2 tool.


This is fine, since the older kernel only supports fewer features. The 
only possible issue is an old iproute2 running on a new kernel. With the 
current proposal, that may mean some config fields cannot be shown.

I think it might be useful to introduce a command to simply dump the 
config space.


> So user space always have to deal and have nasty parsing/typecasting based on the length.


That's how userspace (Qemu) is expected to work now. Userspace 
should determine the semantics of the fields based on the features.

Differentiating config fields doesn't help much, e.g. userspace still 
needs to distinguish LINK_UP and ANNOUNCE in the status field.
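
A minimal sketch of the feature-driven decoding described above, in plain C. The DEMO_* constants are local copies of the values for VIRTIO_NET_F_STATUS, VIRTIO_NET_S_LINK_UP and VIRTIO_NET_S_ANNOUNCE in the virtio-net uAPI header; struct demo_link_state and demo_decode_status() are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local copies of values defined in linux/virtio_net.h. */
#define DEMO_VIRTIO_NET_F_STATUS	16	/* feature bit: status field is valid */
#define DEMO_VIRTIO_NET_S_LINK_UP	1	/* status bit: link is up */
#define DEMO_VIRTIO_NET_S_ANNOUNCE	2	/* status bit: announcement needed */

struct demo_link_state {
	bool valid;	/* false when the status feature was not offered */
	bool link_up;
	bool announce;
};

/* Interpret the 16-bit status field only if the features say it is valid. */
static struct demo_link_state demo_decode_status(uint64_t features,
						 uint16_t status)
{
	struct demo_link_state st = { false, false, false };

	if (!(features & (1ULL << DEMO_VIRTIO_NET_F_STATUS)))
		return st;

	st.valid = true;
	st.link_up = (status & DEMO_VIRTIO_NET_S_LINK_UP) != 0;
	st.announce = (status & DEMO_VIRTIO_NET_S_ANNOUNCE) != 0;
	return st;
}
```

The same bit tests are needed whether the status value arrives inside a config blob or as an individual netlink attribute.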


>
>> 1) device id
>> 2) features
>>
>>
>>> Parsing fields require knowledge of features as well and application needs
>> to make multiple netlink calls to parse the config space.
>>
>>
>> I think we don't care about the performance in this case. It's about three
>> netlink calls:
>>
> Its not about performance. By the time 1st call is made, features got updated and it is out of sync with config.
>
>> 1) get config
>> 2) get device id
>> 3) get features
>>
> This requires using features from 3rd netlink output to decode output of 1st netlink output.
> Which is a bit odd of netlink.
> Other netlink nla_put() probably sending whole structure doesn’t need to do it.


Well, we can pack them all into a single skb, can't we? (Probably with a 
config len.)


>
>> For build config, it's only one
>>
>> 1) build config
>>
>>
>>> I prefer to follow rest of the kernel style to return self contained
>> invidividual fields.
>>
>>
>> But I saw a lot of kernel codes choose to use e.g nla_put() directly with
>> module specific structure.
>>
> It might be self-contained structure that probably has not found the need to expand.


I think it's just a matter of putting the config length alongside the 
config data. Note that we already have the .get_config_size() op for 
validating inputs through VHOST_SET_CONFIG/VHOST_GET_CONFIG.

Thanks


^ permalink raw reply	[flat|nested] 62+ messages in thread

* RE: [PATCH linux-next v3 2/6] vdpa: Introduce query of device config layout
  2021-06-24  7:05               ` Jason Wang
@ 2021-06-24  7:59                 ` Parav Pandit
  2021-06-25  3:28                   ` Jason Wang
  0 siblings, 1 reply; 62+ messages in thread
From: Parav Pandit @ 2021-06-24  7:59 UTC (permalink / raw)
  To: Jason Wang, virtualization; +Cc: Eli Cohen, mst



> From: Jason Wang <jasowang@redhat.com>
> Sent: Thursday, June 24, 2021 12:35 PM
> 
> >> Consider we had a mature set of virtio specific uAPI for config space.
> >> It would be a burden if we need an unnecessary translation layer of
> >> netlink in the middle:
> >>
> >> [vDPA parent (virtio_net_config)] <-> [netlink
> >> (VDPA_ATTR_DEV_NET_XX)] <-> [userspace
> (VDPA_ATTR_DEV_NET_XX)]
> >>> <-> [ user (virtio_net_config)]
> > This translation is not there. We show relevant net config fields as
> VDPA_ATTR_DEV_NET individually.
> > It is not a binary dump which is harder for users to parse and make any use
> of it.
> 
> 
> The is done implicitly, user needs to understand the semantic of
> virtio_net_config and map the individual fields to the vdpa tool sub-
> command.
Mostly not. virtio_net_config is for the producer and consumer sw entities.
Here the user doesn't know about such a layout or where it's located.
The user only sets config params, which get set in the config space
(without understanding what the config layout is or where it is located).

> 
> 
> >
> > It is only one level of translation from virtio_net_config (kernel) -> netlink
> vdpa fields.
> > It is similar to 'struct netdevice' -> rtnl info fields.
> 
> 
> I think not, the problem is that the netdevice is not a part of uAPI but
> virtio_net_config is.
virtio_net_config is a UAPI for sw consumption.
In that sense yes, netlink can also do it, however it requires a side channel to communicate what is valid.

> 
> 
> >
> >> If we make netlink simply a transport, it would be much easier. And we
> had
> >> the chance to unify the logic of build_config() and set_config() in the
> driver.
> > How? We need bit mask to tell that out of 21 fields which fields to update
> and which not.
> > And that is further mixed with offset and length.
> 
> 
> So set_config() could be called from userspace, so did build_config().
> The only difference is:
> 
> 1) they're using different transport, ioctl vs netlink
> 2) build_config() is only expected to be called by the management tool
> 
> If qemu works well via set_config ioctl, netlink should work as well.
> 
mlx5 set_config is a no-op.
vdpa_set_config() needs to return an error code. I don't 
vp_vdpa.c blindly writes the config as it is passthrough.
Parsing which fields to write and which not, using offset and length, makes for messy code with typecasts, comparing old values, etc.
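
A minimal sketch of the offset/length bookkeeping a driver would need in order to tell which field a raw write touches. struct demo_net_config mirrors only the first fields of virtio_net_config for illustration; demo_write_covers() and DEMO_COVERS() are hypothetical helpers, not part of any driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative mirror of the leading fields of struct virtio_net_config.
 * The packed attribute is a GCC/Clang extension. */
struct demo_net_config {
	uint8_t  mac[6];
	uint16_t status;
	uint16_t max_virtqueue_pairs;
	uint16_t mtu;
} __attribute__((packed));

/*
 * True if the write range [offset, offset + len) fully covers the field
 * at [field_off, field_off + field_len). Overflow of offset + len is not
 * checked in this sketch.
 */
static bool demo_write_covers(size_t offset, size_t len,
			      size_t field_off, size_t field_len)
{
	return offset <= field_off &&
	       offset + len >= field_off + field_len;
}

/* Convenience wrapper that derives the field range from the struct. */
#define DEMO_COVERS(off, len, field)					\
	demo_write_covers((off), (len),					\
			  offsetof(struct demo_net_config, field),	\
			  sizeof(((struct demo_net_config *)0)->field))
```

Even with such a check, a write covering the whole structure does not say which fields the caller actually meant to change, so the driver would still have to compare against the old values.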

> Btw, what happens if management tool tries to modify the mac of vDPA
> when the device is already used by the driver?
At present it allows modifying, but it should be improved in the future to fail if the device is in use.

> >>>> And actually, it's not the binary blob since uapi clearly define the
> >>>> format (e.g struct virtio_net_config), can we find a way to use that?
> >>>> E.g introduce device/net specific command and passing the blob with
> >>>> length and negotiated features.
> >>> Length may change in future, mostly expand. And parsing based on
> length
> >> is not such a clean way.
> >>
> >>
> >> Length is only for legal checking. The config is self contained with:
> >>
> > Unlikely. When structure size increases later, the parsing will change based
> on the length.
> > Because older kernel would return shorter length with older iproute2 tool.
> 
> 
> This is fine since the older kernel only support less features. The only
> possible issue if the old iproute 2 runs on new kernel. With the current
> proposal, it may cause some config fields can't not be showed.
> 
Not showing is ok.
But the code to typecast based on size is messy.

> I think it might be useful to introduce a command to simply dump the
> config space.
> 
> 
> > So user space always have to deal and have nasty parsing/typecasting
> based on the length.
Such nasty parsing is not required with the netlink interface.

> 
> 
> That's how userspace (Qemu) is expected to work now. The userspace
> should determine the semantic of the fields based on the features.
> 
> Differentiate config fields doesn't help much, e.g userspace still need
> to differ LINK_UP and ANNOUNCE for the status field.
Yes, but that parsing is of a constant-size u16 status.
> 
> 
[..]

>
> > Its not about performance. By the time 1st call is made, features got
> updated and it is out of sync with config.
> >
> >> 1) get config
> >> 2) get device id
> >> 3) get features
> >>
> > This requires using features from 3rd netlink output to decode output of
> 1st netlink output.
> > Which is a bit odd of netlink.
> > Other netlink nla_put() probably sending whole structure doesn’t need to
> do it.
> 
> 
> Well, we can pack them all into a single skb isn't it? (probably with a
> config len).
> 
You want to pack both features and config in a single nla_put()?
If so, it isn't necessary. There are more examples in the kernel that add individual fields instead of nla_put(blob).
I wouldn’t follow those nla_put() callers.

> 
> >
> >> For build config, it's only one
> >>
> >> 1) build config
> >>
> >>
> >>> I prefer to follow rest of the kernel style to return self contained
> >> invidividual fields.
> >>
> >>
> >> But I saw a lot of kernel codes choose to use e.g nla_put() directly with
> >> module specific structure.
> >>
> > It might be self-contained structure that probably has not found the need
> to expand.
> 
> 
> I think it's just a matter of putting the config length with the config
> data. Note that we've already had .get_config_size() ops for validating
> inputs through VHOST_SET_CONFIG/VHOST_GET_CONFIG.
This length already comes as part of the netlink interface; no extra length is needed.
The whole point is to avoid parsing based on length.
We cannot change the virtio_net_config UAPI in use, but netlink code doesn’t need to be bound to size-based typecasting and comparing fields during build_config().

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH linux-next v3 2/6] vdpa: Introduce query of device config layout
  2021-06-24  7:59                 ` Parav Pandit
@ 2021-06-25  3:28                   ` Jason Wang
  2021-06-25  6:45                     ` Parav Pandit
  0 siblings, 1 reply; 62+ messages in thread
From: Jason Wang @ 2021-06-25  3:28 UTC (permalink / raw)
  To: Parav Pandit, virtualization; +Cc: Eli Cohen, mst


On 2021/6/24 3:59 PM, Parav Pandit wrote:
>
>> From: Jason Wang <jasowang@redhat.com>
>> Sent: Thursday, June 24, 2021 12:35 PM
>>
>>>> Consider we had a mature set of virtio specific uAPI for config space.
>>>> It would be a burden if we need an unnecessary translation layer of
>>>> netlink in the middle:
>>>>
>>>> [vDPA parent (virtio_net_config)] <-> [netlink
>>>> (VDPA_ATTR_DEV_NET_XX)] <-> [userspace
>> (VDPA_ATTR_DEV_NET_XX)]
>>>>> <-> [ user (virtio_net_config)]
>>> This translation is not there. We show relevant net config fields as
>> VDPA_ATTR_DEV_NET individually.
>>> It is not a binary dump which is harder for users to parse and make any use
>> of it.
>>
>>
>> The is done implicitly, user needs to understand the semantic of
>> virtio_net_config and map the individual fields to the vdpa tool sub-
>> command.
> Mostly not virtio_net_config is for the producer and consumer sw entities.
> Here user doesn't know about such layout and where its located.
> User only sets config params that gets set in the config space.
> (without understanding what is config layout and its location).
>
>>
>>> It is only one level of translation from virtio_net_config (kernel) -> netlink
>> vdpa fields.
>>> It is similar to 'struct netdevice' -> rtnl info fields.
>>
>> I think not, the problem is that the netdevice is not a part of uAPI but
>> virtio_net_config is.
> Virtio_net_config is a UAPI for sw consumption.
> That way yes, netlink can also do it, however it requires side channel communicate what is valid.
>
>>
>>>> If we make netlink simply a transport, it would be much easier. And we
>> had
>>>> the chance to unify the logic of build_config() and set_config() in the
>> driver.
>>> How? We need bit mask to tell that out of 21 fields which fields to update
>> and which not.
>>> And that is further mixed with offset and length.
>>
>> So set_config() could be called from userspace, so did build_config().
>> The only difference is:
>>
>> 1) they're using different transport, ioctl vs netlink
>> 2) build_config() is only expected to be called by the management tool
>>
>> If qemu works well via set_config ioctl, netlink should work as well.
>>
> mlx5 set_config is noop.
> vdpa_set_config() need to return an error code. I don't
> vp_vdpa.c blindly writes the config as its passthrough.
> Parsing which fields to write and which not, using offset and length is a messy code with typecast and compare old values etc.


I don't see why it needs a typecast; virtio_net_config is also uABI, so 
you can dereference the fields directly.


>
>> Btw, what happens if management tool tries to modify the mac of vDPA
>> when the device is already used by the driver?
> At present it allows modifying, but it should be improved in future to fail if device is in use.


I think this is something we need to fix. Or we should ask whether it's 
really useful to allow the attributes to be modified after the device is 
created.

Why not simply allow the config to be built only at device creation?


>
>>>>>> And actually, it's not the binary blob since uapi clearly define the
>>>>>> format (e.g struct virtio_net_config), can we find a way to use that?
>>>>>> E.g introduce device/net specific command and passing the blob with
>>>>>> length and negotiated features.
>>>>> Length may change in future, mostly expand. And parsing based on
>> length
>>>> is not such a clean way.
>>>>
>>>>
>>>> Length is only for legal checking. The config is self contained with:
>>>>
>>> Unlikely. When structure size increases later, the parsing will change based
>> on the length.
>>> Because older kernel would return shorter length with older iproute2 tool.
>>
>> This is fine since the older kernel only support less features. The only
>> possible issue if the old iproute 2 runs on new kernel. With the current
>> proposal, it may cause some config fields can't not be showed.
>>
> Not showing is ok.
> But the code is messy to typecast on size.
>
>> I think it might be useful to introduce a command to simply dump the
>> config space.
>>
>>
>>> So user space always have to deal and have nasty parsing/typecasting
>> based on the length.
> Such nasty parsing is not required for netlink interface.
>
>>
>> That's how userspace (Qemu) is expected to work now. The userspace
>> should determine the semantic of the fields based on the features.
>>
>> Differentiate config fields doesn't help much, e.g userspace still need
>> to differ LINK_UP and ANNOUNCE for the status field.
> Yes, this parsing is from constant size u16 status.
>>
> [..]
>
>>> Its not about performance. By the time 1st call is made, features got
>> updated and it is out of sync with config.
>>>> 1) get config
>>>> 2) get device id
>>>> 3) get features
>>>>
>>> This requires using features from 3rd netlink output to decode output of
>> 1st netlink output.
>>> Which is a bit odd of netlink.
>>> Other netlink nla_put() probably sending whole structure doesn’t need to
>> do it.
>>
>>
>> Well, we can pack them all into a single skb isn't it? (probably with a
>> config len).
>>
> You want to pack features and config both in the single nla_put()?
> If so, it isn't necessary. There are more examples in kernel that adds individual fields instead of nla_put(blob).
> I wouldn’t follow those nla_put() callers.


No, a single skb, not a single nla_put().

Actually git grep showed me a very good example of carrying uABI via 
netlink, namely ndt_config:

1) we have the ndt_config definition in the uAPI
2) netlink simply carries the structure in neightbl_fill_info():

                 if (nla_put(skb, NDTA_CONFIG, sizeof(ndc), &ndc))


For virtio_net_config, why not simply:

len = ops->get_config_len();
config = kmalloc(len, GFP_KERNEL);
ops->get_config(vdev, 0, config, len);
nla_put(skb, VIRTIO_CONFIG, config, len);
nla_put_le64(skb, VIRTIO_FEATURES, features);

For build_config, we can simply do things in reverse. Then everything 
works via the existing virtio uAPI/ABI.
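
A userspace sketch of the "one message, two attributes" idea above, assuming a simplified netlink-style attribute layout (16-bit length, 16-bit type, payload padded to 4 bytes). The attribute types and the helpers demo_put() and demo_find() are hypothetical and do not use the kernel's nla_* API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical attribute types; not part of any real uAPI. */
enum { DEMO_ATTR_CONFIG = 1, DEMO_ATTR_FEATURES = 2 };

struct demo_attr_hdr {
	uint16_t len;	/* header + payload, excluding padding */
	uint16_t type;
};

#define DEMO_ALIGN4(x)	(((x) + 3u) & ~(size_t)3u)

/* Append one attribute at 'off'; returns the new offset, or 0 if it does not fit. */
static size_t demo_put(uint8_t *buf, size_t cap, size_t off,
		       uint16_t type, const void *data, size_t len)
{
	struct demo_attr_hdr hdr;
	size_t total = sizeof(hdr) + len;

	assert(buf && data);
	if (total > UINT16_MAX || off > cap || DEMO_ALIGN4(total) > cap - off)
		return 0;

	hdr.len = (uint16_t)total;
	hdr.type = type;
	memcpy(buf + off, &hdr, sizeof(hdr));
	memcpy(buf + off + sizeof(hdr), data, len);
	memset(buf + off + total, 0, DEMO_ALIGN4(total) - total);
	return off + DEMO_ALIGN4(total);
}

/* Find an attribute by type; returns its payload and sets *len, or NULL. */
static const uint8_t *demo_find(const uint8_t *buf, size_t used,
				uint16_t type, size_t *len)
{
	size_t off = 0;

	while (off <= used && used - off >= sizeof(struct demo_attr_hdr)) {
		struct demo_attr_hdr hdr;

		memcpy(&hdr, buf + off, sizeof(hdr));
		if (hdr.len < sizeof(hdr) || hdr.len > used - off)
			return NULL;	/* malformed attribute */
		if (hdr.type == type) {
			*len = hdr.len - sizeof(hdr);
			return buf + off + sizeof(hdr);
		}
		off += DEMO_ALIGN4(hdr.len);
	}
	return NULL;
}
```

Because the config blob and the features travel in the same buffer, the receiver decodes both from one snapshot instead of correlating separate calls.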


>
>>>> For build config, it's only one
>>>>
>>>> 1) build config
>>>>
>>>>
>>>>> I prefer to follow rest of the kernel style to return self contained
>>>> invidividual fields.
>>>>
>>>>
>>>> But I saw a lot of kernel codes choose to use e.g nla_put() directly with
>>>> module specific structure.
>>>>
>>> It might be self-contained structure that probably has not found the need
>> to expand.
>>
>>
>> I think it's just a matter of putting the config length with the config
>> data. Note that we've already had .get_config_size() ops for validating
>> inputs through VHOST_SET_CONFIG/VHOST_GET_CONFIG.
> This length comes as part of the netlink interface already, no need for extra length.
> The whole point is to avoid parsing based on length.


Well, it doesn't do anything different compared to xxx_is_valid, which 
just calculates the offset implicitly (via the compiler).


> We cannot change the virtio_net_config UAPI in use, but netlink code doesn’t need to be bound to size based typecasting and compare fields during build_config().


The points are:

1) Avoid duplicating the existing uAPIs
2) Avoid unnecessary parsing in netlink; netlink is just the 
transport, and it's the job of the vDPA parent to do that

Thanks



^ permalink raw reply	[flat|nested] 62+ messages in thread

* RE: [PATCH linux-next v3 2/6] vdpa: Introduce query of device config layout
  2021-06-25  3:28                   ` Jason Wang
@ 2021-06-25  6:45                     ` Parav Pandit
  2021-06-28  5:03                       ` Jason Wang
  0 siblings, 1 reply; 62+ messages in thread
From: Parav Pandit @ 2021-06-25  6:45 UTC (permalink / raw)
  To: Jason Wang, virtualization; +Cc: Eli Cohen, mst



> From: Jason Wang <jasowang@redhat.com>
> Sent: Friday, June 25, 2021 8:59 AM
> 
> On 2021/6/24 3:59 PM, Parav Pandit wrote:
> >
> >> From: Jason Wang <jasowang@redhat.com>
> >> Sent: Thursday, June 24, 2021 12:35 PM
> >>
> >>>> Consider we had a mature set of virtio specific uAPI for config space.
> >>>> It would be a burden if we need an unnecessary translation layer of
> >>>> netlink in the middle:
> >>>>
> >>>> [vDPA parent (virtio_net_config)] <-> [netlink
> >>>> (VDPA_ATTR_DEV_NET_XX)] <-> [userspace
> >> (VDPA_ATTR_DEV_NET_XX)]
> >>>>> <-> [ user (virtio_net_config)]
> >>> This translation is not there. We show relevant net config fields as
> >> VDPA_ATTR_DEV_NET individually.
> >>> It is not a binary dump which is harder for users to parse and make
> >>> any use
> >> of it.
> >>
> >>
> >> The is done implicitly, user needs to understand the semantic of
> >> virtio_net_config and map the individual fields to the vdpa tool sub-
> >> command.
> > Mostly not virtio_net_config is for the producer and consumer sw entities.
> > Here user doesn't know about such layout and where its located.
> > User only sets config params that gets set in the config space.
> > (without understanding what is config layout and its location).
> >
> >>
> >>> It is only one level of translation from virtio_net_config (kernel)
> >>> -> netlink
> >> vdpa fields.
> >>> It is similar to 'struct netdevice' -> rtnl info fields.
> >>
> >> I think not, the problem is that the netdevice is not a part of uAPI
> >> but virtio_net_config is.
> > Virtio_net_config is a UAPI for sw consumption.
> > That way yes, netlink can also do it, however it requires side channel
> communicate what is valid.
> >
> >>
> >>>> If we make netlink simply a transport, it would be much easier. And
> >>>> we
> >> had
> >>>> the chance to unify the logic of build_config() and set_config() in
> >>>> the
> >> driver.
> >>> How? We need bit mask to tell that out of 21 fields which fields to
> >>> update
> >> and which not.
> >>> And that is further mixed with offset and length.
> >>
> >> So set_config() could be called from userspace, so did build_config().
> >> The only difference is:
> >>
> >> 1) they're using different transport, ioctl vs netlink
> >> 2) build_config() is only expected to be called by the management
> >> tool
> >>
> >> If qemu works well via set_config ioctl, netlink should work as well.
> >>
> > mlx5 set_config is noop.
> > vdpa_set_config() need to return an error code. I don't vp_vdpa.c
> > blindly writes the config as its passthrough.
> > Parsing which fields to write and which not, using offset and length is a
> messy code with typecast and compare old values etc.
> 
> 
> I don't see why it needs typecast, virtio_net_config is also uABI, you can
> deference the fields directly.
>
The user wants to set only the mac address of the config space. How does user space tell this?
Pass the whole virtio_net_config and inform via a side channel?
Or is the vendor driver expected to compare which fields changed from the old config space?
 
> 
> >
> >> Btw, what happens if management tool tries to modify the mac of vDPA
> >> when the device is already used by the driver?
> > At present it allows modifying, but it should be improved in future to fail if
> device is in use.
> 
> 
> This is something we need to fix I think. Or if it's really useful to
> allowing the attributes to be modified after the device is created.
> 
> Why not simply allow the config to be built only at device creation?
>
That avoids the problem of modifying fields after binding to vhost.
But the UAPI issue still remains, so let's resolve that first.

> 
> >
> >>>>>> And actually, it's not the binary blob since uapi clearly define the
> >>>>>> format (e.g struct virtio_net_config), can we find a way to use that?
> >>>>>> E.g introduce device/net specific command and passing the blob with
> >>>>>> length and negotiated features.
> >>>>> Length may change in future, mostly expand. And parsing based on
> >> length
> >>>> is not such a clean way.
> >>>>
> >>>>
> >>>> Length is only for legal checking. The config is self contained with:
> >>>>
> >>> Unlikely. When structure size increases later, the parsing will change
> based
> >> on the length.
> >>> Because older kernel would return shorter length with older iproute2
> tool.
> >>
> >> This is fine since the older kernel only support less features. The only
> >> possible issue if the old iproute 2 runs on new kernel. With the current
> >> proposal, it may cause some config fields can't not be showed.
> >>
> > Not showing is ok.
> > But the code is messy to typecast on size.
> >
> >> I think it might be useful to introduce a command to simply dump the
> >> config space.
> >>
> >>
> >>> So user space always have to deal and have nasty parsing/typecasting
> >> based on the length.
> > Such nasty parsing is not required for netlink interface.
> >
> >>
> >> That's how userspace (Qemu) is expected to work now. The userspace
> >> should determine the semantic of the fields based on the features.
> >>
> >> Differentiate config fields doesn't help much, e.g userspace still need
> >> to differ LINK_UP and ANNOUNCE for the status field.
> > Yes, this parsing is from constant size u16 status.
> >>
> > [..]
> >
> >>> Its not about performance. By the time 1st call is made, features got
> >> updated and it is out of sync with config.
> >>>> 1) get config
> >>>> 2) get device id
> >>>> 3) get features
> >>>>
> >>> This requires using features from 3rd netlink output to decode output of
> >> 1st netlink output.
> >>> Which is a bit odd of netlink.
> >>> Other netlink nla_put() probably sending whole structure doesn’t need
> to
> >> do it.
> >>
> >>
> >> Well, we can pack them all into a single skb isn't it? (probably with a
> >> config len).
> >>
> > You want to pack features and config both in the single nla_put()?
> > If so, it isn't necessary. There are more examples in kernel that adds
> individual fields instead of nla_put(blob).
> > I wouldn’t follow those nla_put() callers.
> 
> 
> No, a single skb not single nla_put().
> 
> Actually git grep told me a very good example of carrying uABI via
> netlink, that is the ndt_config:
> 
> 1) we had ndt_config definition in the uAPI
> 2) netlink simply carries the structure in neightbl_fill_info():
> 
>                  if (nla_put(skb, NDTA_CONFIG, sizeof(ndc), &ndc))
> 
Sure. But the reverse path doesn’t have this; it requires a side-band mask.
My concern is not the existing virtio_net_config layout, but that any future growth of it requires size-based typecasting in both directions.

> 
> For virito_net_config, why not simply:
> 
> len = ops->get_config_len();
> config = kmalloc(len, GFP_KERNEL);
> ops->get_config(vdev, 0, config, len);
> nla_put(skb, VIRTIO_CONFIG, config, len);
User space needs to parse the content based on this length, as it can change in the future.
Length telling how to typecast is what I want to avoid here.

> nla_put_le64(skb, VIRTIO_FETURES, features);
>
 
> For build_config, we can simply do thing reversely. Then everything
> works via the existing virtio uAPI/ABI.
>
In the reverse path, how do you tell which fields of the config space to set and which to ignore?
Shall we use u64 features for it?
Will every type of device be able to describe its config space via a feature bit?
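
For comparison, a minimal sketch of the per-field approach, where each supplied field is marked as present and everything else is left untouched. The thread proposes attribute names of the form VDPA_ATTR_DEV_NET_CFG_*; the enum, structs and demo_apply() below are hypothetical stand-ins, not the code from the series.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical per-field ids standing in for individual netlink attributes. */
enum demo_field { DEMO_FIELD_MAC = 0, DEMO_FIELD_MTU = 1 };

/* What the user supplied: only fields flagged in 'present' are meaningful. */
struct demo_set_req {
	uint32_t present;	/* bit N set when field N was supplied */
	uint8_t  mac[6];
	uint16_t mtu;
};

/* Hypothetical device-side config state. */
struct demo_dev_config {
	uint8_t  mac[6];
	uint16_t mtu;
};

/* Update only the fields the request carries; leave the rest untouched. */
static void demo_apply(struct demo_dev_config *cfg,
		       const struct demo_set_req *req)
{
	assert(cfg && req);

	if (req->present & (1u << DEMO_FIELD_MAC))
		memcpy(cfg->mac, req->mac, sizeof(cfg->mac));
	if (req->present & (1u << DEMO_FIELD_MTU))
		cfg->mtu = req->mtu;
}
```

With netlink attributes the presence information comes for free, since an attribute that was not sent is simply absent from the parsed attribute table.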

> 
> >
> >>>> For build config, it's only one
> >>>>
> >>>> 1) build config
> >>>>
> >>>>
> >>>>> I prefer to follow rest of the kernel style to return self contained
> >>>> invidividual fields.
> >>>>
> >>>>
> >>>> But I saw a lot of kernel codes choose to use e.g nla_put() directly with
> >>>> module specific structure.
> >>>>
> >>> It might be self-contained structure that probably has not found the
> need
> >> to expand.
> >>
> >>
> >> I think it's just a matter of putting the config length with the config
> >> data. Note that we've already had .get_config_size() ops for validating
> >> inputs through VHOST_SET_CONFIG/VHOST_GET_CONFIG.
> > This length comes as part of the netlink interface already, no need for extra
> length.
> > The whole point is to avoid parsing based on length.
> 
> 
> Well, it doesn't do anything difference compared to xxx_is_valid which
> just calculating the offset implicitly (via the compiler).
> 
> 
> > We cannot change the virtio_net_config UAPI in use, but netlink code
> doesn’t need to be bound to size based typecasting and compare fields
> during build_config().
> 
> 
> The points are:
> 
> 1) Avoid duplicating the existing uAPIs
> 2) Avoid unnecessary parsing in the netlink, netlink is just the
> transport, it's the charge of the vDPA parent to do that
>
All that parsing will move to vendor drivers, which must validate offset/length to update only specific fields of the config space.
Netlink is a transport to carry fields, which is what we are using it for.
I agree there; these config fields are exposed individually in both directions to keep things safe from future growth of the structure layout.
 
> Thanks
> 


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH linux-next v3 2/6] vdpa: Introduce query of device config layout
  2021-06-25  6:45                     ` Parav Pandit
@ 2021-06-28  5:03                       ` Jason Wang
  2021-06-28 10:56                         ` Parav Pandit
  2021-06-28 22:39                         ` Michael S. Tsirkin
  0 siblings, 2 replies; 62+ messages in thread
From: Jason Wang @ 2021-06-28  5:03 UTC (permalink / raw)
  To: Parav Pandit, virtualization; +Cc: Eli Cohen, mst


On 2021/6/25 2:45 PM, Parav Pandit wrote:
>
>> From: Jason Wang <jasowang@redhat.com>
>> Sent: Friday, June 25, 2021 8:59 AM
>>
>> On 2021/6/24 3:59 PM, Parav Pandit wrote:
>>>> From: Jason Wang <jasowang@redhat.com>
>>>> Sent: Thursday, June 24, 2021 12:35 PM
>>>>
>>>>>> Consider we had a mature set of virtio specific uAPI for config space.
>>>>>> It would be a burden if we need an unnecessary translation layer of
>>>>>> netlink in the middle:
>>>>>>
>>>>>> [vDPA parent (virtio_net_config)] <-> [netlink
>>>>>> (VDPA_ATTR_DEV_NET_XX)] <-> [userspace
>>>> (VDPA_ATTR_DEV_NET_XX)]
>>>>>>> <-> [ user (virtio_net_config)]
>>>>> This translation is not there. We show relevant net config fields as
>>>> VDPA_ATTR_DEV_NET individually.
>>>>> It is not a binary dump which is harder for users to parse and make
>>>>> any use
>>>> of it.
>>>>
>>>>
>>>> The is done implicitly, user needs to understand the semantic of
>>>> virtio_net_config and map the individual fields to the vdpa tool sub-
>>>> command.
>>> Mostly not virtio_net_config is for the producer and consumer sw entities.
>>> Here user doesn't know about such layout and where its located.
>>> User only sets config params that gets set in the config space.
>>> (without understanding what is config layout and its location).
>>>
>>>>> It is only one level of translation from virtio_net_config (kernel)
>>>>> -> netlink
>>>> vdpa fields.
>>>>> It is similar to 'struct netdevice' -> rtnl info fields.
>>>> I think not, the problem is that the netdevice is not a part of uAPI
>>>> but virtio_net_config is.
>>> Virtio_net_config is a UAPI for sw consumption.
>>> That way yes, netlink can also do it, however it requires side channel
>> communicate what is valid.
>>>>>> If we make netlink simply a transport, it would be much easier. And
>>>>>> we
>>>> had
>>>>>> the chance to unify the logic of build_config() and set_config() in
>>>>>> the
>>>> driver.
>>>>> How? We need bit mask to tell that out of 21 fields which fields to
>>>>> update
>>>> and which not.
>>>>> And that is further mixed with offset and length.
>>>> So set_config() could be called from userspace, so did build_config().
>>>> The only difference is:
>>>>
>>>> 1) they're using different transport, ioctl vs netlink
>>>> 2) build_config() is only expected to be called by the management
>>>> tool
>>>>
>>>> If qemu works well via set_config ioctl, netlink should work as well.
>>>>
>>> mlx5 set_config is noop.
>>> vdpa_set_config() need to return an error code. I don't vp_vdpa.c
>>> blindly writes the config as its passthrough.
>>> Parsing which fields to write and which not, using offset and length is a
>> messy code with typecast and compare old values etc.
>>
>>
>> I don't see why it needs typecast, virtio_net_config is also uABI, you can
>> deference the fields directly.
>>
> User wants set only the mac address of the config space. How do user space tell this?


Good question, but first we need to answer:

"Do we allow userspace to modify one specific field out of the whole 
config?"


> Pass the whole virtio_net_config and inform via side channel?


That could be a method.


> Or vendor driver is expected to compare what fields changed from old config space?


So I think we need to solve them all, but netlink is probably the wrong 
layer; we need to solve them at the virtio level and let netlink be a 
transport for the virtio uAPI/ABI.

And we need to figure out whether we want to allow userspace to modify 
the config after the device is created. If not, simply build the 
virtio_net_config and pass it to the vDPA parent during device creation. 
Otherwise, invent a new uAPI at the virtio level for passing the config 
fields. The virtio or vDPA core can provide a library to compare the 
difference.

My feeling is that if we restrict ourselves to building the config only 
during creation, it would simplify a lot of things. And I haven't 
noticed a use case where we need to change the config fields after 
creation via the management API/tool.


>   
>>>> Btw, what happens if management tool tries to modify the mac of vDPA
>>>> when the device is already used by the driver?
>>> At present it allows modifying, but it should be improved in future to fail if
>> device is in use.
>>
>>
>> This is something we need to fix I think. Or if it's really useful to
>> allowing the attributes to be modified after the device is created.
>>
>> Why not simply allow the config to be built only at device creation?
>>
> That avoids the problem of modifying fields after bind to vhost.
> But UAPI issue still remains so lets resolve that first.
>
>>>>>>>> And actually, it's not the binary blob since uapi clearly define the
>>>>>>>> format (e.g struct virtio_net_config), can we find a way to use that?
>>>>>>>> E.g introduce device/net specific command and passing the blob with
>>>>>>>> length and negotiated features.
>>>>>>> Length may change in future, mostly expand. And parsing based on
>>>> length
>>>>>> is not such a clean way.
>>>>>>
>>>>>>
>>>>>> Length is only for legal checking. The config is self contained with:
>>>>>>
>>>>> Unlikely. When structure size increases later, the parsing will change
>> based
>>>> on the length.
>>>>> Because older kernel would return shorter length with older iproute2
>> tool.
>>>> This is fine since the older kernel only support less features. The only
>>>> possible issue if the old iproute 2 runs on new kernel. With the current
>>>> proposal, it may cause some config fields can't not be showed.
>>>>
>>> Not showing is ok.
>>> But the code is messy to typecast on size.
>>>
>>>> I think it might be useful to introduce a command to simply dump the
>>>> config space.
>>>>
>>>>
>>>>> So user space always have to deal and have nasty parsing/typecasting
>>>> based on the length.
>>> Such nasty parsing is not required for netlink interface.
>>>
>>>> That's how userspace (Qemu) is expected to work now. The userspace
>>>> should determine the semantic of the fields based on the features.
>>>>
>>>> Differentiate config fields doesn't help much, e.g userspace still need
>>>> to differ LINK_UP and ANNOUNCE for the status field.
>>> Yes, this parsing is from constant size u16 status.
>>> [..]
>>>
>>>>> Its not about performance. By the time 1st call is made, features got
>>>> updated and it is out of sync with config.
>>>>>> 1) get config
>>>>>> 2) get device id
>>>>>> 3) get features
>>>>>>
>>>>> This requires using features from 3rd netlink output to decode output of
>>>> 1st netlink output.
>>>>> Which is a bit odd of netlink.
>>>>> Other netlink nla_put() probably sending whole structure doesn’t need
>> to
>>>> do it.
>>>>
>>>>
>>>> Well, we can pack them all into a single skb isn't it? (probably with a
>>>> config len).
>>>>
>>> You want to pack features and config both in the single nla_put()?
>>> If so, it isn't necessary. There are more examples in kernel that adds
>> individual fields instead of nla_put(blob).
>>> I wouldn’t follow those nla_put() callers.
>>
>> No, a single skb not single nla_put().
>>
>> Actually git grep told me a very good example of carrying uABI via
>> netlink, that is the ndt_config:
>>
>> 1) we had ndt_config definition in the uAPI
>> 2) netlink simply carries the structure in neightbl_fill_info():
>>
>>                   if (nla_put(skb, NDTA_CONFIG, sizeof(ndc), &ndc))
>>
> Sure. But the reverse path doesn’t have this that requires side band mask.
> My concern is not for existing virtio_net_config layout, but the future increase of it requires size based typecasting on both directions.
>
>> For virito_net_config, why not simply:
>>
>> len = ops->get_config_len();
>> config = kmalloc(len, GFP_KERNEL);
>> ops->get_config(vdev, 0, config, len);
>> nla_put(skb, VIRTIO_CONFIG, config, len);
> User space need to parse content based on this length as it can change in future.
> Length telling how to typecast is want I want to avoid here.


So there's no real difference: using xxx_is_valid is just an implicit 
length check, the same as what is done via config_len:

if (a_is_valid) {
     /* dump a */
} else if (b_is_valid) {
     /* dump b */
}

vs.

if (length < offsetof(struct virtio_net_config, next field of a)) {
     /* dump a*/
}
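
A minimal sketch of the length-based check, assuming a trimmed,
hypothetical mirror of struct virtio_net_config (net_config_sketch,
FIELD_END and field_in_len are illustrative names only, not existing
kernel or QEMU symbols). A field can be dumped only when the reported
config length reaches the end of that field:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, trimmed mirror of struct virtio_net_config. */
struct net_config_sketch {
	uint8_t  mac[6];
	uint16_t status;
	uint16_t max_virtqueue_pairs;
	uint16_t mtu;
};

/* The sketch relies on there being no padding between the fields. */
static_assert(sizeof(struct net_config_sketch) == 12, "unexpected padding");

/* Offset of the first byte after a field, i.e. where the next field starts. */
#define FIELD_END(type, field) \
	(offsetof(type, field) + sizeof(((type *)0)->field))

/* A field is usable only if the config length covers it completely. */
static bool field_in_len(size_t field_end, size_t config_len)
{
	return config_len >= field_end;
}
```

With this layout, a 10-byte config covers mac, status and
max_virtqueue_pairs but not mtu, so an older, shorter config simply
reports fewer fields.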

Actually, QEMU has solved similar issues via the uAPI:

https://git.qemu.org/?p=qemu.git;a=blob;f=hw/net/virtio-net.c;h=bd7958b9f0eed2705e0d6a2feaeaefb5e63bd6a4;hb=HEAD#l92

If the current uAPI is not sufficient, let's tweak it.


>
>> nla_put_le64(skb, VIRTIO_FETURES, features);
>>
>   
>> For build_config, we can simply do thing reversely. Then everything
>> works via the existing virtio uAPI/ABI.
>>
> In reverse path how do you tell which fields of the config space to set and which to ignore?


See my above reply.


> Shall we use u64 features for it?
> Will type of device able to describe their config space via a feature bit?


I think not. There are a lot of fields that cannot be deduced from the 
features (mtu, queue pairs, mac, etc.).

But I agree that the config fields cannot work without the feature bits.


>
>>>>>> For build config, it's only one
>>>>>>
>>>>>> 1) build config
>>>>>>
>>>>>>
>>>>>>> I prefer to follow rest of the kernel style to return self contained
>>>>>> invidividual fields.
>>>>>>
>>>>>>
>>>>>> But I saw a lot of kernel codes choose to use e.g nla_put() directly with
>>>>>> module specific structure.
>>>>>>
>>>>> It might be self-contained structure that probably has not found the
>> need
>>>> to expand.
>>>>
>>>>
>>>> I think it's just a matter of putting the config length with the config
>>>> data. Note that we've already had .get_config_size() ops for validating
>>>> inputs through VHOST_SET_CONFIG/VHOST_GET_CONFIG.
>>> This length comes as part of the netlink interface already, no need for extra
>> length.
>>> The whole point is to avoid parsing based on length.
>>
>> Well, it doesn't do anything difference compared to xxx_is_valid which
>> just calculating the offset implicitly (via the compiler).
>>
>>
>>> We cannot change the virtio_net_config UAPI in use, but netlink code
>> doesn’t need to be bound to size based typecasting and compare fields
>> during build_config().
>>
>>
>> The points are:
>>
>> 1) Avoid duplicating the existing uAPIs
>> 2) Avoid unnecessary parsing in the netlink, netlink is just the
>> transport, it's the charge of the vDPA parent to do that
>>
> All those parsing will move to vendor drivers to validate offset/length to update only specific fields of config space.


Or the vDPA or virtio core can provide helpers to compare the difference 
if it's necessary.

Thanks


> It is a transport to carry fields which is what we are using for.
> I agree there that these config fields are exposed individually in both directions to keep it safe from structure layout increments.
>   
>> Thanks
>>


^ permalink raw reply	[flat|nested] 62+ messages in thread

* RE: [PATCH linux-next v3 2/6] vdpa: Introduce query of device config layout
  2021-06-28  5:03                       ` Jason Wang
@ 2021-06-28 10:56                         ` Parav Pandit
  2021-06-29  3:52                           ` Jason Wang
  2021-06-28 22:39                         ` Michael S. Tsirkin
  1 sibling, 1 reply; 62+ messages in thread
From: Parav Pandit @ 2021-06-28 10:56 UTC (permalink / raw)
  To: Jason Wang, virtualization; +Cc: Eli Cohen, mst


> From: Jason Wang <jasowang@redhat.com>
> Sent: Monday, June 28, 2021 10:33 AM
> 
[..]

> >>
> >> I don't see why it needs typecast, virtio_net_config is also uABI,
> >> you can deference the fields directly.
> >>
> > User wants set only the mac address of the config space. How do user
> space tell this?
> 
> 
> Good question, but we need first answer:
> 
> "Do we allow userspace space to modify one specific field of all the config?"
> 
Even if we restrict config params to being specified at creation time, the question of how to pass them remains open: either as a whole struct + side-band info, or as individual fields.
More below.

> 
> > Pass the whole virtio_net_config and inform via side channel?
> 
> 
> That could be a method.
I prefer passing individual fields, which gives clean code and full flexibility.
Clean code = 
1. no typecasting based on length
2. self-describing fields that do not depend on feature-bit parsing
3. robust against structure size increases, with full backward/forward compatibility and no code changes

> 
> 
> > Or vendor driver is expected to compare what fields changed from old
> config space?
> 
> 
> So I think we need solve them all, but netlink is probably the wrong
> layer, we need to solve them at virtio level and let netlink a transport
> for them virtio uAPI/ABI.
In the spirit of using the virtio UAPI structure, we end up creating other side-band fields, which results in code that is not common to the netlink method.
The ioctl() interface of QEMU/vhost didn't have any other choice.

> 
> And we need to figure out if we want to allow the userspace to modify
> the config after the device is created. If not, simply build the
> virtio_net_config and pass it to the vDPA parent during device creation.
I like this idea to pass fields at creation time.

> If not, invent new uAPI at virtio level to passing the config fields.
> Virtio or vDPA core can provide the library to compare the difference.
> 

> My feeling is that, if we restrict to only support build the config
> during the creation, it would simply a lot of things. And I didn't
> notice a use case that we need to change the config fields in the middle
> via the management API/tool.
> 
Sure, yes. Whichever config fields the user wants to pass, user space passes them.

> >> For virito_net_config, why not simply:
> >>
> >> len = ops->get_config_len();
> >> config = kmalloc(len, GFP_KERNEL);
> >> ops->get_config(vdev, 0, config, len);
> >> nla_put(skb, VIRTIO_CONFIG, config, len);
> > User space need to parse content based on this length as it can change in
> future.
> > Length telling how to typecast is want I want to avoid here.
> 
> 
> So there's no real difference, using xxx_is_valid, is just a implicit
> length checking as what is done via config_len:
> 
> if (a_is_valid) {
>      /* dump a */
> } else if (b_is_valid) {
>      /* dump b */
> }
> 
> vs.
> 
> if (length < offsetof(struct virtio_net_config, next field of a)) {
>      /* dump a*/
+ the feature parsing code, for each field.

> }
> 
> Actually, Qemu has solved the similar issues via the uAPI:
> 
> https://git.qemu.org/?p=qemu.git;a=blob;f=hw/net/virtio-net.c;h=bd7958b9f0eed2705e0d6a2feaeaefb5e63bd6a4;hb=HEAD#l92
> 
> If the current uAPI is not sufficient, let's tweak it.
I am unable to convince myself to build a side bitmask for config fields, plus typecasting code, in the spirit of using the existing UAPI structure.
This creates messy code for the future.

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH linux-next v3 2/6] vdpa: Introduce query of device config layout
  2021-06-28  5:03                       ` Jason Wang
  2021-06-28 10:56                         ` Parav Pandit
@ 2021-06-28 22:39                         ` Michael S. Tsirkin
  2021-06-29  3:41                           ` Jason Wang
  1 sibling, 1 reply; 62+ messages in thread
From: Michael S. Tsirkin @ 2021-06-28 22:39 UTC (permalink / raw)
  To: Jason Wang; +Cc: Eli Cohen, virtualization

On Mon, Jun 28, 2021 at 01:03:20PM +0800, Jason Wang wrote:
> So I think we need solve them all, but netlink is probably the wrong layer,
> we need to solve them at virtio level and let netlink a transport for them
> virtio uAPI/ABI.

I'm not sure I follow. virtio defines VF to driver communication.
This is PF to hypervisor. virtio simply does not cover it ATM.

-- 
MST


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH linux-next v3 2/6] vdpa: Introduce query of device config layout
  2021-06-28 22:39                         ` Michael S. Tsirkin
@ 2021-06-29  3:41                           ` Jason Wang
  2021-06-29 20:01                             ` Michael S. Tsirkin
  0 siblings, 1 reply; 62+ messages in thread
From: Jason Wang @ 2021-06-29  3:41 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: Eli Cohen, virtualization


On 2021/6/29 6:39 AM, Michael S. Tsirkin wrote:
> On Mon, Jun 28, 2021 at 01:03:20PM +0800, Jason Wang wrote:
>> So I think we need solve them all, but netlink is probably the wrong layer,
>> we need to solve them at virtio level and let netlink a transport for them
>> virtio uAPI/ABI.
> I'm not sure I follow. virtio defines VF to driver communication.
> This is PF to hypervisor. virtio simply does not cover it ATM.


Note that this is not PF to hypervisor but the uAPI from userspace (vDPA 
tool) to vDPA core.

We had two choices.

1) tweak virtio uAPIs
2) invent virtio specific uAPI in netlink

1) seems better.

Thanks


>


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH linux-next v3 2/6] vdpa: Introduce query of device config layout
  2021-06-28 10:56                         ` Parav Pandit
@ 2021-06-29  3:52                           ` Jason Wang
  2021-06-29  9:49                             ` Parav Pandit
  0 siblings, 1 reply; 62+ messages in thread
From: Jason Wang @ 2021-06-29  3:52 UTC (permalink / raw)
  To: Parav Pandit, virtualization; +Cc: Eli Cohen, mst


On 2021/6/28 6:56 PM, Parav Pandit wrote:
>> From: Jason Wang <jasowang@redhat.com>
>> Sent: Monday, June 28, 2021 10:33 AM
>>
> [..]
>
>>>> I don't see why it needs typecast, virtio_net_config is also uABI,
>>>> you can deference the fields directly.
>>>>
>>> User wants set only the mac address of the config space. How do user
>> space tell this?
>>
>>
>> Good question, but we need first answer:
>>
>> "Do we allow userspace space to modify one specific field of all the config?"
>>
> Even if we restrict to specify config params at creation time, question still remains open how to pass, either as whole struct + side_based info or as individual fields.
> More below.


Right.


>
>>> Pass the whole virtio_net_config and inform via side channel?
>>
>> That could be a method.
> I prefer the method to pass individual fields which has the clean code approach and full flexibility.
> Clean code =
> 1. no typecasting based on length
> 2. self-describing fields, do not depends on feature bits parsing
> 3. proof against structure size increases in fully backward/forward compatibility without code changes


So I think I agree. But I think we'd better do that in the virtio uAPI 
(include/uapi/linux/virtio_xxx.h).


>>
>>> Or vendor driver is expected to compare what fields changed from old
>> config space?
>>
>>
>> So I think we need solve them all, but netlink is probably the wrong
>> layer, we need to solve them at virtio level and let netlink a transport
>> for them virtio uAPI/ABI.
> In spirit of using the virtio UAPI structure, we creating other side band fields, that results into code that’s not common to netlink method.


I think maybe we can start by inventing a new virtio uAPI and see whether 
it conflicts with netlink. Or maybe you can give me an example?


> Ioctl() interface of QEMU/vhost didn't have any other choice with ioctl().
>
>> And we need to figure out if we want to allow the userspace to modify
>> the config after the device is created. If not, simply build the
>> virtio_net_config and pass it to the vDPA parent during device creation.
> I like this idea to pass fields at creation time.
>
>> If not, invent new uAPI at virtio level to passing the config fields.
>> Virtio or vDPA core can provide the library to compare the difference.
>>
>> My feeling is that, if we restrict to only support build the config
>> during the creation, it would simply a lot of things. And I didn't
>> notice a use case that we need to change the config fields in the middle
>> via the management API/tool.
>>
> Sure yes. Whichever config fields user wants to pass, user space passes it.
>
>>>> For virito_net_config, why not simply:
>>>>
>>>> len = ops->get_config_len();
>>>> config = kmalloc(len, GFP_KERNEL);
>>>> ops->get_config(vdev, 0, config, len);
>>>> nla_put(skb, VIRTIO_CONFIG, config, len);
>>> User space need to parse content based on this length as it can change in
>> future.
>>> Length telling how to typecast is want I want to avoid here.
>>
>> So there's no real difference, using xxx_is_valid, is just a implicit
>> length checking as what is done via config_len:
>>
>> if (a_is_valid) {
>>       /* dump a */
>> } else if (b_is_valid) {
>>       /* dump b */
>> }
>>
>> vs.
>>
>> if (length < offsetof(struct virtio_net_config, next field of a)) {
>>       /* dump a*/
> + the feature parsing code, for each field.
>
>> }
>>
>> Actually, Qemu has solved the similar issues via the uAPI:
>>
>> https://git.qemu.org/?p=qemu.git;a=blob;f=hw/net/virtio-net.c;h=bd7958b9f0eed2705e0d6a2feaeaefb5e63bd6a4;hb=HEAD#l92
>>
>> If the current uAPI is not sufficient, let's tweak it.
> I am unable to convince my self to build side bitmask for config fields, type casting code in spirit of using existing structure UAPI.
> This creates messy code for future.


Just a quick thought, how about simply something like:

struct virtio_net_config_build {
         __u8 mac[ETH_ALEN];
         __virtio16 max_virtqueue_pairs;
         __virtio16 reserved[3];
};

It looks like we don't need the rest of the fields in virtio_net_config 
to build the config, since they are all hardware attributes.

So it looks self-contained and can be transported via netlink.
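
As a rough userspace sketch of how such a fixed-layout structure could
be filled: uint16_t stands in for __virtio16 and is stored little-endian,
which is an assumption matching modern virtio devices, and
sketch_config_build, sketch_to_le16 and sketch_build_init are
hypothetical names, not existing uAPI.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_ETH_ALEN 6

/* Hypothetical userspace mirror of the proposed structure. */
struct sketch_config_build {
	uint8_t  mac[SKETCH_ETH_ALEN];
	uint16_t max_virtqueue_pairs;	/* assumed little-endian */
	uint16_t reserved[3];
};

/* 6 + 2 + 6 bytes, with no padding expected between the members. */
static_assert(sizeof(struct sketch_config_build) == 14, "unexpected padding");

/* Produce the little-endian representation regardless of host endianness. */
static uint16_t sketch_to_le16(uint16_t v)
{
	const uint8_t b[2] = { (uint8_t)(v & 0xff), (uint8_t)(v >> 8) };
	uint16_t out;

	memcpy(&out, b, sizeof(out));
	return out;
}

/* Zero the whole structure (including reserved), then set both fields. */
static void sketch_build_init(struct sketch_config_build *cfg,
			      const uint8_t mac[SKETCH_ETH_ALEN],
			      uint16_t pairs)
{
	memset(cfg, 0, sizeof(*cfg));
	memcpy(cfg->mac, mac, SKETCH_ETH_ALEN);
	cfg->max_virtqueue_pairs = sketch_to_le16(pairs);
}
```

Every member is always present in this layout, so a consumer cannot
tell "not provided" apart from "zero" without some extra information.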

Thanks



^ permalink raw reply	[flat|nested] 62+ messages in thread

* RE: [PATCH linux-next v3 2/6] vdpa: Introduce query of device config layout
  2021-06-29  3:52                           ` Jason Wang
@ 2021-06-29  9:49                             ` Parav Pandit
  2021-06-30  4:31                               ` Jason Wang
  0 siblings, 1 reply; 62+ messages in thread
From: Parav Pandit @ 2021-06-29  9:49 UTC (permalink / raw)
  To: Jason Wang, virtualization; +Cc: Eli Cohen, mst

Hi Jason,

> From: Jason Wang <jasowang@redhat.com>
> Sent: Tuesday, June 29, 2021 9:22 AM


> >>> Pass the whole virtio_net_config and inform via side channel?
> >>
> >> That could be a method.
> > I prefer the method to pass individual fields which has the clean code
> approach and full flexibility.
> > Clean code =
> > 1. no typecasting based on length
> > 2. self-describing fields, do not depends on feature bits parsing
> > 3. proof against structure size increases in fully backward/forward
> > compatibility without code changes
> 
> 
> So I think I agree. But I think we'd better to that in the virito uAPI
> (include/uapi/linux/virito_xxx.h)
> 

[..]

> 
> I think maybe we can start from inventing new virtio uAPI and see if it
> has some contradict with netlink. Or maybe you can give me some example?
> 
> 

> > I am unable to convince my self to build side bitmask for config fields, type
> casting code in spirit of using existing structure UAPI.
> > This creates messy code for future.
> 
> 
> Just a quick thought, how about simply something like:
> 
> struct virtio_net_config_build {
>          __u8 mac[ETH_ALEN];
>          __virtio16 max_virtqueue_pairs;
>          __virtio16 reserved[3];
> };
In this structure we would need to add bit-field flags to indicate which entries are valid.
And when the structure layout changes, we end up with similar typecast issues, length checks and more.
Most of this is built into netlink.

So I propose,
(a) we pass config parameters during vdpa device create
$ vdpa dev add name foo mgmtdev pci/0000:03:00.4 mac 00:11:22:33:44:55 maxq 10

This results in adding one new optional netlink attribute, VDPA_DEV_NET_MAC.
VDPA_ATTR_DEV_MAX_VQ_SIZE is already defined for max queues.
NLA_POLICY_ETH_ADDR takes care of validating the length when passed.

> It looks to we don't need the rest of fields in the virtio_net_config to
> build the config since they are all hardware attributes.
Today it is only mac and max queues. Later on we may need to define RSS hashing as hw/devices advance.
And the structure size will change.
Hence, I propose having each as an individual attribute that doesn’t need to be cast into a struct.
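
A minimal userspace sketch of the individual-attribute idea, assuming a
netlink-like TLV layout (16-bit length that includes the header, 16-bit
type, 4-byte alignment). All names below (sketch_attr_hdr,
SKETCH_ATTR_MAC, SKETCH_ATTR_MTU, sketch_put, sketch_parse) are
hypothetical and are not the real vdpa netlink attributes:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { SKETCH_ATTR_MAC = 1, SKETCH_ATTR_MTU = 2 };

struct sketch_attr_hdr { uint16_t len; uint16_t type; };

struct sketch_cfg {
	bool has_mac, has_mtu;
	uint8_t mac[6];
	uint16_t mtu;
};

#define SKETCH_ALIGN(n) (((size_t)(n) + 3u) & ~(size_t)3u)

/* Append one attribute at offset off; the caller sizes and zeroes buf. */
static size_t sketch_put(uint8_t *buf, size_t off, uint16_t type,
			 const void *data, uint16_t dlen)
{
	struct sketch_attr_hdr h = { (uint16_t)(sizeof(h) + dlen), type };

	memcpy(buf + off, &h, sizeof(h));
	memcpy(buf + off + sizeof(h), data, dlen);
	return off + SKETCH_ALIGN(h.len);
}

/* Walk the attributes; a field is "set" only if its attribute is present. */
static int sketch_parse(const uint8_t *buf, size_t len, struct sketch_cfg *cfg)
{
	size_t off = 0;

	assert(buf && cfg);
	memset(cfg, 0, sizeof(*cfg));
	while (len - off >= sizeof(struct sketch_attr_hdr)) {
		struct sketch_attr_hdr h;
		const uint8_t *payload = buf + off + sizeof(h);
		size_t plen, step;

		memcpy(&h, buf + off, sizeof(h));
		if (h.len < sizeof(h) || h.len > len - off)
			return -1;	/* malformed attribute */
		plen = h.len - sizeof(h);
		if (h.type == SKETCH_ATTR_MAC) {
			if (plen != sizeof(cfg->mac))
				return -1;
			memcpy(cfg->mac, payload, plen);
			cfg->has_mac = true;
		} else if (h.type == SKETCH_ATTR_MTU) {
			if (plen != sizeof(cfg->mtu))
				return -1;
			memcpy(&cfg->mtu, payload, plen);
			cfg->has_mtu = true;
		}			/* unknown types are skipped */
		step = SKETCH_ALIGN(h.len);
		off += step > len - off ? len - off : step;
	}
	return 0;
}
```

Adding a new attribute type later does not change how existing ones are
parsed, and an older parser simply skips types it does not know.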

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH linux-next v3 2/6] vdpa: Introduce query of device config layout
  2021-06-29  3:41                           ` Jason Wang
@ 2021-06-29 20:01                             ` Michael S. Tsirkin
  2021-06-30  3:46                               ` Jason Wang
  0 siblings, 1 reply; 62+ messages in thread
From: Michael S. Tsirkin @ 2021-06-29 20:01 UTC (permalink / raw)
  To: Jason Wang; +Cc: Eli Cohen, virtualization

On Tue, Jun 29, 2021 at 11:41:54AM +0800, Jason Wang wrote:
> 
> On 2021/6/29 6:39 AM, Michael S. Tsirkin wrote:
> > On Mon, Jun 28, 2021 at 01:03:20PM +0800, Jason Wang wrote:
> > > So I think we need solve them all, but netlink is probably the wrong layer,
> > > we need to solve them at virtio level and let netlink a transport for them
> > > virtio uAPI/ABI.
> > I'm not sure I follow. virtio defines VF to driver communication.
> > This is PF to hypervisor. virtio simply does not cover it ATM.
> 
> 
> Note that this is not PF to hypervisor but the uAPI from userspace (vDPA
> tool) to vDPA core.
> 
> We had two choices.
> 
> 1) tweak virtio uAPIs
> 2) invent virtio specific uAPI in netlink
> 
> 1) seems better.
> 
> Thanks
> 

Well things like setting mac aren't virtio specific.
What are the virtio specific things you have in mind?

> > 


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH linux-next v3 2/6] vdpa: Introduce query of device config layout
  2021-06-29 20:01                             ` Michael S. Tsirkin
@ 2021-06-30  3:46                               ` Jason Wang
  0 siblings, 0 replies; 62+ messages in thread
From: Jason Wang @ 2021-06-30  3:46 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: Eli Cohen, virtualization


On 2021/6/30 4:01 AM, Michael S. Tsirkin wrote:
> On Tue, Jun 29, 2021 at 11:41:54AM +0800, Jason Wang wrote:
>> On 2021/6/29 6:39 AM, Michael S. Tsirkin wrote:
>>> On Mon, Jun 28, 2021 at 01:03:20PM +0800, Jason Wang wrote:
>>>> So I think we need solve them all, but netlink is probably the wrong layer,
>>>> we need to solve them at virtio level and let netlink a transport for them
>>>> virtio uAPI/ABI.
>>> I'm not sure I follow. virtio defines VF to driver communication.
>>> This is PF to hypervisor. virtio simply does not cover it ATM.
>>
>> Note that this is not PF to hypervisor but the uAPI from userspace (vDPA
>> tool) to vDPA core.
>>
>> We had two choices.
>>
>> 1) tweak virtio uAPIs
>> 2) invent virtio specific uAPI in netlink
>>
>> 1) seems better.
>>
>> Thanks
>>
> Well things like setting mac aren't virtio specific.
> What are the virtio specific things you have in mind?


E.g. max_virtqueue_pairs? Even if the mac isn't virtio specific, the idea 
is to build the device config, which is virtio specific.

And we may have other devices like virtio-blk.

Thanks


>


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH linux-next v3 2/6] vdpa: Introduce query of device config layout
  2021-06-29  9:49                             ` Parav Pandit
@ 2021-06-30  4:31                               ` Jason Wang
  2021-06-30  6:03                                 ` Parav Pandit
  0 siblings, 1 reply; 62+ messages in thread
From: Jason Wang @ 2021-06-30  4:31 UTC (permalink / raw)
  To: Parav Pandit, virtualization; +Cc: Eli Cohen, mst


On 2021/6/29 5:49 PM, Parav Pandit wrote:
> Hi Jason,
>
>> From: Jason Wang <jasowang@redhat.com>
>> Sent: Tuesday, June 29, 2021 9:22 AM
>
>>>>> Pass the whole virtio_net_config and inform via side channel?
>>>> That could be a method.
>>> I prefer the method to pass individual fields which has the clean code
>> approach and full flexibility.
>>> Clean code =
>>> 1. no typecasting based on length
>>> 2. self-describing fields, do not depends on feature bits parsing
>>> 3. proof against structure size increases in fully backward/forward
>>> compatibility without code changes
>>
>> So I think I agree. But I think we'd better to that in the virito uAPI
>> (include/uapi/linux/virito_xxx.h)
>>
> [..]
>
>> I think maybe we can start from inventing new virtio uAPI and see if it
>> has some contradict with netlink. Or maybe you can give me some example?
>>
>>
>>> I am unable to convince my self to build side bitmask for config fields, type
>> casting code in spirit of using existing structure UAPI.
>>> This creates messy code for future.
>>
>> Just a quick thought, how about simply something like:
>>
>> struct virtio_net_config_build {
>>           __u8 mac[ETH_ALEN];
>>           __virtio16 max_virtqueue_pairs;
>>           __virtio16 reserved[3];
>> };
> In this structure we need to add bi field flags to indicate which entry is valid.
> And when structure layout changes, we end up with similar typecast issues, length checks and more.
> Most of it is inbuild to the netlink.
>
> So I propose,
> (a) we pass config parameters during vdpa device create
> $ vdpa dev add name foo mgmtdev pci/0000:03:00.4 mac 00:11:22:33:44:55 maxq 10
>
> This results in adding two onenew netlink optional attributes as VDPA_DEV_NET_MAC.
> VDPA_ATTR_DEV_MAX_VQ_SIZE is already dfined for max queues.
> NLA_POLICY_ETH_ADDR takes care to validate length size when passed.
>
>> It looks to we don't need the rest of fields in the virtio_net_config to
>> build the config since they are all hardware attributes.
> Today it is only mac and max queues. Later on we may need to define rss hashing as hw/device advances.
> And structure size will change.
> Hence, I propose to have each as individual attribute that doesn’t need to cast in struct.


Ok, that should work. If Michael is fine with this, I'm also fine.

Just to clarify, if I understand this correctly, with individual 
attributes there's no need for bits like xxx_is_valid?

Thanks



^ permalink raw reply	[flat|nested] 62+ messages in thread

* RE: [PATCH linux-next v3 2/6] vdpa: Introduce query of device config layout
  2021-06-30  4:31                               ` Jason Wang
@ 2021-06-30  6:03                                 ` Parav Pandit
  2021-07-01  3:34                                   ` Jason Wang
  0 siblings, 1 reply; 62+ messages in thread
From: Parav Pandit @ 2021-06-30  6:03 UTC (permalink / raw)
  To: Jason Wang, virtualization; +Cc: Eli Cohen, mst

Hi Jason,

> From: Jason Wang <jasowang@redhat.com>
> Sent: Wednesday, June 30, 2021 10:02 AM

> >> It looks to we don't need the rest of fields in the virtio_net_config
> >> to build the config since they are all hardware attributes.
> > Today it is only mac and max queues. Later on we may need to define rss
> hashing as hw/device advances.
> > And structure size will change.
> > Hence, I propose to have each as individual attribute that doesn’t need to
> cast in struct.
> 
> 
> Ok, that should work. If Michael are fine with this, I'm also fine.
> 
> Just to clarify, if I understand this correctly, with the individual attribute,
> there's no need for the bit like xxx_is_valid?

xxx_is_valid is not present in the get calls.
It is also not present in the UAPI set calls.
It is not a UAPI.
It is internal between vdpa.c and the vendor driver, to tell which fields to use, as they are optional.
If we want to get rid of those valid flags, the code below will move to the vendor driver, where we pass nl_attrs during the device add callback.


+	if (nl_attrs[VDPA_ATTR_DEV_NET_CFG_MACADDR]) {
+		macaddr = nla_data(nl_attrs[VDPA_ATTR_DEV_NET_CFG_MACADDR]);
+		memcpy(config.net.mac, macaddr, sizeof(config.net.mac));
+		config.net_mask.mac_valid = true;
+	}
+	if (nl_attrs[VDPA_ATTR_DEV_NET_CFG_MTU]) {
+		config.net.mtu =
+			nla_get_u16(nl_attrs[VDPA_ATTR_DEV_NET_CFG_MTU]);
+		config.net_mask.mtu_valid = true;
+	}
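
To illustrate how the vendor driver side might consume such valid flags, here is a minimal userspace sketch. It assumes a hypothetical `struct demo_dev_cfg` that mirrors the `config.net` / `config.net_mask` fields used above, and a hypothetical `struct demo_dev`; none of these names come from the posted patches.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define DEMO_ETH_ALEN 6

/* Hypothetical stand-in for the internal vdpa.c -> vendor driver struct. */
struct demo_dev_cfg {
	struct {
		uint8_t mac[DEMO_ETH_ALEN];
		uint16_t mtu;
	} net;
	struct {
		bool mac_valid;
		bool mtu_valid;
	} net_mask;
};

/* Hypothetical device state holding driver-chosen defaults. */
struct demo_dev {
	uint8_t mac[DEMO_ETH_ALEN];
	uint16_t mtu;
};

/* Apply only the fields the user supplied; keep the defaults otherwise. */
static void demo_dev_apply(struct demo_dev *dev, const struct demo_dev_cfg *cfg)
{
	assert(dev && cfg);

	if (cfg->net_mask.mac_valid)
		memcpy(dev->mac, cfg->net.mac, sizeof(dev->mac));
	if (cfg->net_mask.mtu_valid)
		dev->mtu = cfg->net.mtu;
}
```

With only `mtu_valid` set, the device keeps its default mac and picks up the new mtu, which is the "optional field" behaviour described above.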

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH linux-next v3 2/6] vdpa: Introduce query of device config layout
  2021-06-30  6:03                                 ` Parav Pandit
@ 2021-07-01  3:34                                   ` Jason Wang
  2021-07-01  7:00                                     ` Parav Pandit
  0 siblings, 1 reply; 62+ messages in thread
From: Jason Wang @ 2021-07-01  3:34 UTC (permalink / raw)
  To: Parav Pandit, virtualization; +Cc: Eli Cohen, mst


On 2021/6/30 2:03 PM, Parav Pandit wrote:
> Hi Jason,
>
>> From: Jason Wang <jasowang@redhat.com>
>> Sent: Wednesday, June 30, 2021 10:02 AM
>>>> It looks to we don't need the rest of fields in the virtio_net_config
>>>> to build the config since they are all hardware attributes.
>>> Today it is only mac and max queues. Later on we may need to define rss
>> hashing as hw/device advances.
>>> And structure size will change.
>>> Hence, I propose to have each as individual attribute that doesn’t need to
>> cast in struct.
>>
>>
>> Ok, that should work. If Michael are fine with this, I'm also fine.
>>
>> Just to clarify, if I understand this correctly, with the individual attribute,
>> there's no need for the bit like xxx_is_valid?
> xxx_is_valid is not present in the get calls.
> It is also not present in UAPI set calls.
> It is not a UAPI.
> It is an internal between vdpa.c and vendor driver to tell which fields to use as there are optional.
> If we want to get rid of those valid flags below code will move to vendor driver where we pass nl_attr, during device add callback.
>
>
> +	if (nl_attrs[VDPA_ATTR_DEV_NET_CFG_MACADDR]) {
> +		macaddr = nla_data(nl_attrs[VDPA_ATTR_DEV_NET_CFG_MACADDR]);
> +		memcpy(config.net.mac, macaddr, sizeof(config.net.mac));
> +		config.net_mask.mac_valid = true;
> +	}
> +	if (nl_attrs[VDPA_ATTR_DEV_NET_CFG_MTU]) {
> +		config.net.mtu =
> +			nla_get_u16(nl_attrs[VDPA_ATTR_DEV_NET_CFG_MTU]);
> +		config.net_mask.mtu_valid = true;
> +	}


I have thought hard about this. I still think re-inventing (duplicating)
the virtio-net config fields is not a good choice (e.g. for block we
would need to duplicate more than 20 attributes).

We may hit a similar issue when provisioning a VF/SF instance at the
hardware level. So I think we may need something in the virtio spec in
the near future.

So assume we don't want single attributes to be modified, and we want
to let the user specify all the attributes at once during creation.

Maybe we can tweak virtio_net_config_set a little bit:

struct virtio_net_config_set {
         __virtio64 features;
         __u8 mac[ETH_ALEN];
         __virtio16 max_virtqueue_pairs;
         __virtio16 mtu;
         __virtio16 reserved[62];
};

So we have:

- both features and config fields, so we're self-contained
- reserved fields which should be sufficient for the next 10 years, so
we don't need to worry about growth.

Or actually, it also allows per-field modification.

E.g. if we don't specify VIRTIO_NET_F_MAC, it means the mac field is invalid.
The same goes for qps and mtu.
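
A rough userspace sketch of this feature-gated validity follows. The struct mirrors the proposed layout with plain integer types (endianness conversion is left out), and all `demo_`/`DEMO_` names are made up for illustration. Note that the VIRTIO_NET_F_* constants are bit numbers, so the check shifts by the bit rather than masking with the constant directly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Feature bit numbers as defined for virtio-net. */
#define DEMO_VIRTIO_NET_F_MTU 3
#define DEMO_VIRTIO_NET_F_MAC 5
#define DEMO_VIRTIO_NET_F_MQ  22

#define DEMO_ETH_ALEN 6

/* Userspace mirror of the proposed fixed-length layout. */
struct demo_net_config_set {
	uint64_t features;
	uint8_t  mac[DEMO_ETH_ALEN];
	uint16_t max_virtqueue_pairs;
	uint16_t mtu;
	uint16_t reserved[62];
};

/* A config field is meaningful only when its feature bit is set. */
static bool demo_has_feature(const struct demo_net_config_set *cfg,
			     unsigned int bit)
{
	assert(cfg && bit < 64);
	return (cfg->features >> bit) & 1;
}

static bool demo_mac_valid(const struct demo_net_config_set *cfg)
{
	return demo_has_feature(cfg, DEMO_VIRTIO_NET_F_MAC);
}

static bool demo_mtu_valid(const struct demo_net_config_set *cfg)
{
	return demo_has_feature(cfg, DEMO_VIRTIO_NET_F_MTU);
}
```

Here the feature word plays the role that the xxx_valid flags play in the netlink-attribute variant.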

The advantage is that we can standardize this in the virtio spec which 
could be used for SF/VF provisioning.

For get, we probably need more work:

struct virtio_net_config_get {
         __virtio64 features;
         union {
                 struct virtio_net_config config;
                 __virtio64 reserved[16];
         };
};

Or just follow how it works today: simply pass the config plus the
device_features.

Thanks



^ permalink raw reply	[flat|nested] 62+ messages in thread

* RE: [PATCH linux-next v3 2/6] vdpa: Introduce query of device config layout
  2021-07-01  3:34                                   ` Jason Wang
@ 2021-07-01  7:00                                     ` Parav Pandit
  2021-07-01  7:43                                       ` Jason Wang
  0 siblings, 1 reply; 62+ messages in thread
From: Parav Pandit @ 2021-07-01  7:00 UTC (permalink / raw)
  To: Jason Wang, virtualization; +Cc: Eli Cohen, mst


> From: Jason Wang <jasowang@redhat.com>
> Sent: Thursday, July 1, 2021 9:04 AM


> >> Just to clarify, if I understand this correctly, with the individual
> >> attribute, there's no need for the bit like xxx_is_valid?
> > xxx_is_valid is not present in the get calls.
> > It is also not present in UAPI set calls.
> > It is not a UAPI.
> > It is an internal between vdpa.c and vendor driver to tell which fields to use as there are optional.
> > If we want to get rid of those valid flags below code will move to vendor driver where we pass nl_attr, during device add callback.
> >
> >
> > +	if (nl_attrs[VDPA_ATTR_DEV_NET_CFG_MACADDR]) {
> > +		macaddr = nla_data(nl_attrs[VDPA_ATTR_DEV_NET_CFG_MACADDR]);
> > +		memcpy(config.net.mac, macaddr, sizeof(config.net.mac));
> > +		config.net_mask.mac_valid = true;
> > +	}
> > +	if (nl_attrs[VDPA_ATTR_DEV_NET_CFG_MTU]) {
> > +		config.net.mtu =
> > +			nla_get_u16(nl_attrs[VDPA_ATTR_DEV_NET_CFG_MTU]);
> > +		config.net_mask.mtu_valid = true;
> > +	}
> 
> 
> Have a hard thought on this. I still think re-invent (duplicate) the virtio-net
> config filed is not a good choice (e.g for block we need to duplicate more
> than 20 attributes).
We are re-inventing by defining a new structure below.
Instead of doing them as individual netlink attributes, it's lumped together in a struct of arbitrary length. :-)

I notice several fields of the vduse device are set up via ioctl, which I think should be set up via this vdpa device add interface.

Also, we can always wrap the above nl_attr code in a helper API so that drivers do not hand-code it.

> 
> We may meet similar issue when provision VF/SF instance at the hardware
> level. So I think we may need something in the virtio spec in the near future.
Do you mean in a virtio vf and virtio sf?
If so, probably yes.
Given that we have the ability to transport individual fields, we don't need to attach the U->K UAPI to an undefined and evolving structure.

> 
> So assuming we don't want a single attributes to be modified and we want to
> let user to specify all the attributes at one time during creation.
> 
> Maybe we can tweak virtio_net_config_set a little bit:
> 
> struct virtio_net_config_set {
>          __virtio64 features;
>          __u8 mac[ETH_ALEN];
>          __virtio16 max_virtqueue_pairs;
>          __virtio16 mtu;
>          __virtio16 reserved[62];
> }
> 
> So we have:
> 
> - both features and config fields, we're self contained
> - reserved fields which should be sufficient for the next 10 years, so we don't
> need to care about the growing.
This is the reverse of netlink, which lets us avoid reserving any arbitrarily sized structure. Though I agree that it may not grow.

> 
> Or actually it also allows per field modification.
> 
> E.g if we don't specify VIRTIO_NET_F_MAC, it means mac field is invalid.
> So did for qps and mtu.
> 
> The advantage is that we can standardize this in the virtio spec which could
> be used for SF/VF provisioning.
The virtio spec can still standardize which fields of config space should be set up.
To do so, we don't need to lump them into one structure.

> 
> For get, we probably need more work:
> 
> struct virtio_net_config_get {
>          __virtio64 features;
>          union {
>                  struct virtio_net_config;
>                  __virtio64 reserved[16];
>          }
> }
> 
> Or just follow how it is work today, simply pass the config plus the
> device_features.

If we go with individual attributes, both get and add are sorted out neatly and remain expandable.

You already explained that there isn't a one-to-one mapping of features to config fields for other device types either.
Netlink already enables us to avoid the non-symmetric u64 reserved[16] in get and u16 reserved[62] in set.

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH linux-next v3 2/6] vdpa: Introduce query of device config layout
  2021-07-01  7:00                                     ` Parav Pandit
@ 2021-07-01  7:43                                       ` Jason Wang
  2021-07-02  6:04                                         ` Parav Pandit
  0 siblings, 1 reply; 62+ messages in thread
From: Jason Wang @ 2021-07-01  7:43 UTC (permalink / raw)
  To: Parav Pandit, virtualization; +Cc: Eli Cohen, mst


On 2021/7/1 3:00 PM, Parav Pandit wrote:
>> From: Jason Wang <jasowang@redhat.com>
>> Sent: Thursday, July 1, 2021 9:04 AM
>
>>>> Just to clarify, if I understand this correctly, with the individual
>>>> attribute, there's no need for the bit like xxx_is_valid?
>>> xxx_is_valid is not present in the get calls.
>>> It is also not present in UAPI set calls.
>>> It is not a UAPI.
>>> It is an internal between vdpa.c and vendor driver to tell which fields to use as there are optional.
>>> If we want to get rid of those valid flags below code will move to vendor driver where we pass nl_attr, during device add callback.
>>>
>>> +	if (nl_attrs[VDPA_ATTR_DEV_NET_CFG_MACADDR]) {
>>> +		macaddr = nla_data(nl_attrs[VDPA_ATTR_DEV_NET_CFG_MACADDR]);
>>> +		memcpy(config.net.mac, macaddr, sizeof(config.net.mac));
>>> +		config.net_mask.mac_valid = true;
>>> +	}
>>> +	if (nl_attrs[VDPA_ATTR_DEV_NET_CFG_MTU]) {
>>> +		config.net.mtu =
>>> +			nla_get_u16(nl_attrs[VDPA_ATTR_DEV_NET_CFG_MTU]);
>>> +		config.net_mask.mtu_valid = true;
>>> +	}
>>
>> Have a hard thought on this. I still think re-invent (duplicate) the virtio-net
>> config filed is not a good choice (e.g for block we need to duplicate more
>> than 20 attributes).
> We are re-inventing by defining a new structure below.


Actually, it depends on which attributes are required for building the config.

We can simply reuse the existing virtio_net_config, if most of the
fields are required.

struct virtio_net_config_set {
         __virtio64 features;
         union {
             struct virtio_net_config config;
             __virtio64 reserved[64];
         };
};

If only a few of them are required, we can just pick them and use another
structure.

Actually, I think just passing the whole config with the device_features
during device creation is a good choice that can simplify a lot of things.

We can define what is needed and ignore the others in the virtio spec.
Then there's no need to worry about anything else. vDPA core can just
do sanity tests like checking size vs features.
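
One way such a size-vs-features check might look, as a userspace sketch only: the helper names are hypothetical, and the struct mirrors just the leading fields of the virtio-net config space (mac, status, max_virtqueue_pairs, mtu).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Feature bit numbers as defined for virtio-net. */
#define DEMO_VIRTIO_NET_F_MTU 3
#define DEMO_VIRTIO_NET_F_MAC 5
#define DEMO_VIRTIO_NET_F_MQ  22

/* Leading fields of the virtio-net config space, mirrored for userspace. */
struct demo_virtio_net_config {
	uint8_t  mac[6];
	uint16_t status;
	uint16_t max_virtqueue_pairs;
	uint16_t mtu;
};

/* Offset of the first byte past a given field. */
#define DEMO_END_OF(field) \
	(offsetof(struct demo_virtio_net_config, field) + \
	 sizeof(((struct demo_virtio_net_config *)0)->field))

static_assert(DEMO_END_OF(mtu) == 12, "unexpected padding in mirror struct");

/*
 * Smallest config length that covers every field the features imply.
 * The checks run in increasing offset order, so the last match wins.
 */
static size_t demo_min_config_len(uint64_t features)
{
	size_t len = 0;

	if (features & (1ULL << DEMO_VIRTIO_NET_F_MAC))
		len = DEMO_END_OF(mac);
	if (features & (1ULL << DEMO_VIRTIO_NET_F_MQ))
		len = DEMO_END_OF(max_virtqueue_pairs);
	if (features & (1ULL << DEMO_VIRTIO_NET_F_MTU))
		len = DEMO_END_OF(mtu);
	return len;
}

/* Reject a user-supplied config that is too short for its features. */
static bool demo_config_len_ok(uint64_t features, size_t len)
{
	return len >= demo_min_config_len(features);
}
```

This is the kind of length-based validation that the fixed-structure approach needs and that the per-attribute approach tries to avoid.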


> Instead of doing them as individual netlink attributes, its lumped together in a struct of arbitrary length. :-)


I think not? We want to have a fixed-length structure which never
grows.

So the difference is:

1) using netlink dedicated fields

if (nl_attrs[VDPA_ATTR_DEV_NET_CFG_MACADDR])

2) using netlink as transport

if (features & VIRTIO_NET_F_MAC)


>
> I notice several fields of the vduse device is setup via ioctl, which I think should be setup via this vdpa device add interface.
>
> Also we can always wrap above nl_attr code in a helper API so that drivers to not hand-code it.


Then it would be still more like 2) above (wrap netlink back to 
something like virtio_net_config)?


>
>> We may meet similar issue when provision VF/SF instance at the hardware
>> level. So I think we may need something in the virtio spec in the near future.
> Do you mean in a virtio vf and virtio sf?


Yes.


> If so, probably yes.
> Given that we have the ability to transport individual fields, we don't need to attach the U->K UAPI to a undefined and evolving structure.


I don't object but it needs to be done in virtio uAPI instead of 
netlink, since it's the device ABI.


>
>> So assuming we don't want a single attributes to be modified and we want to
>> let user to specify all the attributes at one time during creation.
>>
>> Maybe we can tweak virtio_net_config_set a little bit:
>>
>> struct virtio_net_config_set {
>>           __virtio64 features;
>>           __u8 mac[ETH_ALEN];
>>           __virtio16 max_virtqueue_pairs;
>>           __virtio16 mtu;
>>           __virtio16 reserved[62];
>> }
>>
>> So we have:
>>
>> - both features and config fields, we're self contained
>> - reserved fields which should be sufficient for the next 10 years, so we don't
>> need to care about the growing.
> This is the reverse of netlink which offers to not reserve any arbitrary size structure.


It's not arbitrary; it has a fixed length.


>   Though I agree that it may not grow.
>
>> Or actually it also allows per field modification.
>>
>> E.g if we don't specify VIRTIO_NET_F_MAC, it means mac field is invalid.
>> So did for qps and mtu.
>>
>> The advantage is that we can standardize this in the virtio spec which could
>> be used for SF/VF provisioning.
> Virtio spec can be still standardized about which fields of config space should be setup.
> To do so, we don't need to lump them in one structure.


Yes, agree.


>
>> For get, we probably need more work:
>>
>> struct virtio_net_config_get {
>>           __virtio64 features;
>>           union {
>>                   struct virtio_net_config;
>>                   __virtio64 reserved[16];
>>           }
>> }
>>
>> Or just follow how it is work today, simply pass the config plus the
>> device_features.
> If we go with individual attribute get and add both sorted out neatly, expandable.


It may only work for netlink (with some duplication of the existing
virtio uAPI). If we can solve it at the general virtio layer, it would be
better. Otherwise we need to invent them again in the virtio spec.

E.g. virtio is expected to support something similar to SF, which requires
the SF to be created/provisioned via the admin virtqueue in the PF.

In this case, we still need to define what is required to create a
virtio "SF". Netlink can't be used in this context.

I think even for the current mlx5e vDPA it would be better, otherwise we 
may have:

vDPA tool -> [netlink specific vDPA attributes(1)] -> vDPA core -> [vDPA 
core specific VDPA attributes(2)] -> mlx5e_vDPA -> [mlx5e specific vDPA 
attributes(3)] -> mlx5e_core

We need to use a single and unified virtio structure in all the (1), (2) 
and (3).


>
> You already explained that there isn't one to one mapping of features to config fields for other device types too.


Yes, but features + config is self-contained. That is to say, it's
sufficient to explain a specific field if we have the device features.

Thanks


> Netlink already enables us to avoid non symmetric u64 reserved[16] in get and u16 reserved[16] in set.


^ permalink raw reply	[flat|nested] 62+ messages in thread

* RE: [PATCH linux-next v3 2/6] vdpa: Introduce query of device config layout
  2021-07-01  7:43                                       ` Jason Wang
@ 2021-07-02  6:04                                         ` Parav Pandit
  2021-07-05  4:35                                           ` Jason Wang
  0 siblings, 1 reply; 62+ messages in thread
From: Parav Pandit @ 2021-07-02  6:04 UTC (permalink / raw)
  To: Jason Wang, virtualization; +Cc: Eli Cohen, mst



> From: Jason Wang <jasowang@redhat.com>
> Sent: Thursday, July 1, 2021 1:13 PM
> 
> 
> Actually it depends on what attributes is required for building the config.
> 
> We can simply reuse the existing virtio_net_config, if most of the fields are
> required.
> 
> struct virtio_net_config_set {
>          __virtio64 features;
>          union {
>              struct virtio_net_config;
>              __virtio64 reserved[64];
>          }
> };
> 
> If only few of the is required, we can just pick them and use another
> structure.
The point is that we define the structure based on current fields. Tomorrow a new RSS or rx scaling scheme appears, and the structure size might need to change.
That forces us to go back to length-based typecasting code,
and to avoid some length checks we pick some arbitrarily sized reserved words.
And I do not know what a network research group will come up with for a new RSS algorithm and the plumbing it needs.

> 
> Actually, I think just pass the whole config with the device_features during
> device creation is a good choice that can simplify a lot of things.
Yes. I totally agree to this.

> 
> We can define what is needed and ignore the others in the virtio spec.
> Then there's no need to worry about any other things. vDPA core can just do
> santiy test like checking size vs features.
Yes, we are trying to have code that avoids such sanity checks based on structure size, length etc fields. :-)

> 
> 
> > Instead of doing them as individual netlink attributes, its lumped together
> in a struct of arbitrary length. :-)
> 
> 
> I think not? We want to have a fixed length of the structure which never
> grow.
> 
I am not sure defining that future now is the right choice, at least for me.

> So the different is:
> 
> 1) using netlink dedicated fields
> 
> if (nl_attrs[VDPA_ATTR_DEV_NET_CFG_MACADDR])
> 
> 2) using netlink as transport
> 
> if (features & VIRTIO_NET_F_MAC)
> 
> 
> >
> > I notice several fields of the vduse device is setup via ioctl, which I think
> should be setup via this vdpa device add interface.
> >
> > Also we can always wrap above nl_attr code in a helper API so that drivers
> to not hand-code it.
> 
> 
> Then it would be still more like 2) above (wrap netlink back to
> something like virtio_net_config)?
> 
> 
> >
> >> We may meet similar issue when provision VF/SF instance at the
> hardware
> >> level. So I think we may need something in the virtio spec in the near
> future.

Given that the device config is not spelled out in the virtio spec, maybe we can wait for it to define a virtio management interface.

> 
> I don't object but it needs to be done in virtio uAPI instead of
> netlink, since it's the device ABI.
Device config can surely be part of the virtio uAPI.
We need not have put that in UAPI.
More below.

> > This is the reverse of netlink which offers to not reserve any arbitrary size
> structure.
> 
> 
> It's not arbitrary but with fixed length.
It's fixed, but chosen arbitrarily large in anticipation that we will likely need to grow.
And sometimes that falls short when the next round of research comes up with more creative ideas.

> 
> It may only work for netlink (with some duplication with the existing
> virtio uAPI). If we can solve it at general virtio layer, it would be
> better. Otherwise we need to invent them again in the virtio spec.
> 
The virtio spec will likely define which config fields should be programmed, and their layout.
The kernel can always fill in the format that the virtio spec demands.

> I think even for the current mlx5e vDPA it would be better, otherwise we
> may have:
> 
> vDPA tool -> [netlink specific vDPA attributes(1)] -> vDPA core -> [vDPA
> core specific VDPA attributes(2)] -> mlx5e_vDPA -> [mlx5e specific vDPA
> attributes(3)] -> mlx5e_core
> 
> We need to use a single and unified virtio structure in all the (1), (2)
> and (3).
This is where I differ.
It's only vdpa tool -> vdpa core -> vendor_driver

Vdpa tool -> vdpa core = netlink attribute
Vdpa core -> vendor driver = struct_foo. (internal inside the linux kernel)

If tomorrow the virtio spec defines struct_foo to be something else, the kernel can always upgrade to struct_bar without changing the UAPI netlink attributes.
New netlink attributes will be needed only when struct_foo gains newer fields.
This will still be forward/backward compatible.

An exact example of this is drivers/net/vxlan.c
vxlan_nl2conf().
A vxlan device needs VNI, src ip, dst ip, tos, and more.
Instead of putting them all in a single vxlan_config structure as UAPI, those optional fields are netlink attributes.
And the vxlan driver internally fills in the config structure.

I am very much convinced by the above vxlan approach, which enables all the functionality needed without typecasting code and without defining arbitrary-length structs.
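
A minimal userspace sketch of that pattern, in the spirit of vxlan_nl2conf() but not taken from it: optional attributes arrive in a table indexed by type, and the driver fills its own internal config from whichever ones are present. The `demo_` types below are hypothetical stand-ins for the netlink attribute array and for the kernel-internal struct_foo.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical attribute types, standing in for VDPA_ATTR_DEV_NET_CFG_*. */
enum demo_attr_type {
	DEMO_ATTR_MACADDR = 1,
	DEMO_ATTR_MTU,
	DEMO_ATTR_MAX
};

/* One parsed attribute: payload pointer (NULL if absent) and length. */
struct demo_attr {
	const void *data;
	size_t len;
};

/* Internal config; free to change without touching the attribute set. */
struct demo_net_conf {
	uint8_t mac[6];
	uint16_t mtu;
	bool mac_set;
	bool mtu_set;
};

/* Fill conf from the attributes present. Returns 0, or -1 on a bad length. */
static int demo_nl2conf(const struct demo_attr attrs[DEMO_ATTR_MAX],
			struct demo_net_conf *conf)
{
	const struct demo_attr *a;

	assert(attrs && conf);

	a = &attrs[DEMO_ATTR_MACADDR];
	if (a->data) {
		if (a->len != sizeof(conf->mac))
			return -1;
		memcpy(conf->mac, a->data, sizeof(conf->mac));
		conf->mac_set = true;
	}

	a = &attrs[DEMO_ATTR_MTU];
	if (a->data) {
		if (a->len != sizeof(conf->mtu))
			return -1;
		memcpy(&conf->mtu, a->data, sizeof(conf->mtu));
		conf->mtu_set = true;
	}
	return 0;
}
```

Adding a new optional field then means one new attribute type and one new branch, with no change to the size of anything userspace already sends.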

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH linux-next v3 2/6] vdpa: Introduce query of device config layout
  2021-07-02  6:04                                         ` Parav Pandit
@ 2021-07-05  4:35                                           ` Jason Wang
  2021-07-06 17:07                                             ` Parav Pandit
  0 siblings, 1 reply; 62+ messages in thread
From: Jason Wang @ 2021-07-05  4:35 UTC (permalink / raw)
  To: Parav Pandit, virtualization; +Cc: Eli Cohen, mst


On 2021/7/2 2:04 PM, Parav Pandit wrote:
>
>> From: Jason Wang <jasowang@redhat.com>
>> Sent: Thursday, July 1, 2021 1:13 PM
>>
>>
>> Actually it depends on what attributes is required for building the config.
>>
>> We can simply reuse the existing virtio_net_config, if most of the fields are
>> required.
>>
>> struct virtio_net_config_set {
>>           __virtio64 features;
>>           union {
>>               struct virtio_net_config;
>>               __virtio64 reserved[64];
>>           }
>> };
>>
>> If only few of the is required, we can just pick them and use another
>> structure.
> The point is we define structure based on current fields. Tomorrow a new RSS or rx scaling scheme appears, and structure size might need change.
> And it demands us to go back to length based typecasting code.
> and to avoid some length check we pick some arbitrary size reserved words.
> And I do not know what network research group will come up for new rss algorithm and needed plumbing.


Yes, but as discussed, we may suffer a similar issue at the device
level. E.g. we need a command to let the PF "build" the config for a VF or SF.


>
>> Actually, I think just pass the whole config with the device_features during
>> device creation is a good choice that can simplify a lot of things.
> Yes. I totally agree to this.
>
>> We can define what is needed and ignore the others in the virtio spec.
>> Then there's no need to worry about any other things. vDPA core can just do
>> santiy test like checking size vs features.
> Yes, we are trying to have code that avoids such sanity checks based on structure size, length etc fields. :-)
>
>>
>>> Instead of doing them as individual netlink attributes, its lumped together
>> in a struct of arbitrary length. :-)
>>
>>
>> I think not? We want to have a fixed length of the structure which never
>> grow.
>>
> I am not sure defining that future now is right choice, at least for me.
>
>> So the different is:
>>
>> 1) using netlink dedicated fields
>>
>> if (nl_attrs[VDPA_ATTR_DEV_NET_CFG_MACADDR])
>>
>> 2) using netlink as transport
>>
>> if (features & VIRTIO_NET_F_MAC)
>>
>>
>>> I notice several fields of the vduse device is setup via ioctl, which I think
>> should be setup via this vdpa device add interface.
>>> Also we can always wrap above nl_attr code in a helper API so that drivers
>> to not hand-code it.
>>
>>
>> Then it would be still more like 2) above (wrap netlink back to
>> something like virtio_net_config)?
>>
>>
>>>> We may meet similar issue when provision VF/SF instance at the
>> hardware
>>>> level. So I think we may need something in the virtio spec in the near
>> future.
> Given the device config is not spelled out in the virtio spec, may be we can wait for it to define virtio management interface.


Yes.


>
>> I don't object but it needs to be done in virtio uAPI instead of
>> netlink, since it's the device ABI.
> Device config can surely be part of the virtio uAPI.
> We need not have put that in UAPI.
> More below.
>
>>> This is the reverse of netlink which offers to not reserve any arbitrary size
>> structure.
>>
>>
>> It's not arbitrary but with fixed length.
> Its fixed, but decided arbitrarily large in anticipation that we likely need to grow.
> And sometimes that fall short when next research comes up with more creative thoughts.


How about something like TLVs in the virtio spec then?


>
>> It may only work for netlink (with some duplication with the existing
>> virtio uAPI). If we can solve it at general virtio layer, it would be
>> better. Otherwise we need to invent them again in the virtio spec.
>>
> Virtio spec will likely define what should be config fields to program and its layout.
> Kernel can always fill up the format that virtio spec demands.


Yes, I wonder if you would be interested in working on the spec to support this.


>
>> I think even for the current mlx5e vDPA it would be better, otherwise we
>> may have:
>>
>> vDPA tool -> [netlink specific vDPA attributes(1)] -> vDPA core -> [vDPA
>> core specific VDPA attributes(2)] -> mlx5e_vDPA -> [mlx5e specific vDPA
>> attributes(3)] -> mlx5e_core
>>
>> We need to use a single and unified virtio structure in all the (1), (2)
>> and (3).
> This is where I differ.
> Its only vdpa tool -> vdpa core -> vendor_driver
>
> Vdpa tool -> vdpa core = netlink attribute
> Vdpa core -> vendor driver = struct_foo. (internal inside the linux kernel)
>
> If tomorrow virtio spec defines struct_foo to be something else, kernel can always upgrade to struct_bar without upgrading UAPI netlink attributes.


That's fine. Note that we actually have an extra level if the vendor_driver
is the virtio-pci vDPA driver (vp_vdpa).

Then we have

vdpa tool -> vdpa core -> vp_vdpa -> virtio-pci device

So we still need to invent commands to configure/build the VF/SF config
space between vp_vdpa and the virtio-pci device. And I think we may suffer
an issue similar to the one we met here (vdpa tool -> vdpa core).


> Netlink attributes addition will be needed only when struct_foo has newer fields.
> This will be still forward/backward compatible.
>
> An exact example of this is drivers/net/vxlan.c
> vxlan_nl2conf().
> A vxlan device needs VNI, src ip, dst ip, tos, and more.
> Instead of putting all in single structure vxlan_config as UAPI, those optional fields are netlink attributes.
> And vxlan driver internally fills up the config structure.
>
> I am very much convinced with the above vxlan approach that enables all functionality needed without typecasting code and without defining arbitrary length structs.


Right, but we have some small differences here:

1) vxlan doesn't have an existing uAPI
2) vxlan configuration is not used for hardware

Basically, I'm not against this approach, I just wonder if it's
better/simpler to solve it at the virtio layer, because the semantics are
defined by the spec, not netlink.

Thanks



^ permalink raw reply	[flat|nested] 62+ messages in thread

* RE: [PATCH linux-next v3 2/6] vdpa: Introduce query of device config layout
  2021-07-05  4:35                                           ` Jason Wang
@ 2021-07-06 17:07                                             ` Parav Pandit
  2021-07-07  4:03                                               ` Jason Wang
  0 siblings, 1 reply; 62+ messages in thread
From: Parav Pandit @ 2021-07-06 17:07 UTC (permalink / raw)
  To: Jason Wang, virtualization; +Cc: Eli Cohen, mst



> From: Jason Wang <jasowang@redhat.com>
> Sent: Monday, July 5, 2021 10:05 AM
> 
> On 2021/7/2 2:04 PM, Parav Pandit wrote:
> >
> >> From: Jason Wang <jasowang@redhat.com>
> >> Sent: Thursday, July 1, 2021 1:13 PM
> >>
> >>
> >> Actually it depends on what attributes is required for building the config.
> >>
> >> We can simply reuse the existing virtio_net_config, if most of the
> >> fields are required.
> >>
> >> struct virtio_net_config_set {
> >>           __virtio64 features;
> >>           union {
> >>               struct virtio_net_config;
> >>               __virtio64 reserved[64];
> >>           }
> >> };
> >>
> >> If only few of the is required, we can just pick them and use another
> >> structure.
> > The point is we define structure based on current fields. Tomorrow a new
> RSS or rx scaling scheme appears, and structure size might need change.
> > And it demands us to go back to length based typecasting code.
> > and to avoid some length check we pick some arbitrary size reserved
> words.
> > And I do not know what network research group will come up for new rss
> algorithm and needed plumbing.
> 
> 
> Yes, but as discussed, we may suffer the similar issue at the device level. E.g
> we need a command to let PF to "build" the config for a VF or SF.
I am not sure.
The current scope of vDPA is that once there is a PF, VF, or SF, you configure or create a vdpa device out of it.

> > Given the device config is not spelled out in the virtio spec, may be we can
> wait for it to define virtio management interface.
> 
> Yes.
Waiting is needed only if we want to cast the U->K UAPI into a structure which is bound to evolve.
Hence I just want to exchange them as individual fields.

> >> It's not arbitrary but with fixed length.
> > Its fixed, but decided arbitrarily large in anticipation that we likely need to
> grow.
> > And sometimes that fall short when next research comes up with more
> creative thoughts.
> 
> 
> How about something like TLVs in the virtio spec then?
Possibly yes.
> 
> 
> >
> >> It may only work for netlink (with some duplication with the existing
> >> virtio uAPI). If we can solve it at general virtio layer, it would be
> >> better. Otherwise we need to invent them again in the virtio spec.
> >>
> > Virtio spec will likely define what should be config fields to program and its
> layout.
> > Kernel can always fill up the format that virtio spec demands.
> 
> 
> Yes, I wonder if you have the interest to work on the spec to support this.
> 
I am happy to contribute; I need to ask my supervisor about spending some time in this area.
Let me figure out the logistics.

> 
> >
> >> I think even for the current mlx5e vDPA it would be better, otherwise we
> >> may have:
> >>
> >> vDPA tool -> [netlink specific vDPA attributes(1)] -> vDPA core -> [vDPA
> >> core specific VDPA attributes(2)] -> mlx5e_vDPA -> [mlx5e specific vDPA
> >> attributes(3)] -> mlx5e_core
> >>
> >> We need to use a single and unified virtio structure in all the (1), (2)
> >> and (3).
> > This is where I differ.
> > Its only vdpa tool -> vdpa core -> vendor_driver
> >
> > Vdpa tool -> vdpa core = netlink attribute
> > Vdpa core -> vendor driver = struct_foo. (internal inside the linux kernel)
> >
> > If tomorrow virtio spec defines struct_foo to be something else, kernel can
> always upgrade to struct_bar without upgrading UAPI netlink attributes.
> 
> 
> That's fine. Note that actually have an extra level if vendor_driver is
> virtio-pci vDPA driver (vp_vdpa).
> 
> Then we have
> 
> vdpa tool -> vdpa core -> vp_vdpa -> virtio-pci device
> 
> So we still need invent commands to configure/build VF/SF config space
> between vp_vdpa and virtio-pci device. 
Yes. This is needed, but again let's keep the two layers separate.
In the example I provided, we will be able to fill the structure and pass it internally between vp_vdpa -> virtio pci driver.


> And I think we may suffer the
> similar issue as we met here (vdpa tool -> vdpa core).
> 
> 
> > Netlink attributes addition will be needed only when struct_foo has newer fields.
> > This will be still forward/backward compatible.
> >
> > An exact example of this is drivers/net/vxlan.c
> > vxlan_nl2conf().
> > A vxlan device needs VNI, src ip, dst ip, tos, and more.
> > Instead of putting all in single structure vxlan_config as UAPI, those optional fields are netlink attributes.
> > And vxlan driver internally fills up the config structure.
> >
> > I am very much convinced with the above vxlan approach that enables all functionality needed without typecasting code and without defining arbitrary length structs.
> 
> 
> Right, but we had some small differences here:
> 
> 1) vxlan doesn't have a existing uAPI
> 2) vxlan configuration is not used for hardware
> 
True, but the vxlan example doesn’t prevent doing #2.

> Basically, I'm not against this approach, I just wonder if it's
> better/simpler to solve it at virtio layer because the semantic is
> defined by the spec not netlink.

vdpa core will be able to use the config defined by the virtio spec whenever that happens.
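
To make the vxlan-style approach discussed above concrete, here is a minimal sketch, assuming a simplified attribute array in place of real netlink parsing. All names (demo_attr, demo_attr_val, demo_dev_config, demo_attrs_to_config) are hypothetical and are not part of the vdpa or vxlan code. Each optional field arrives as its own attribute, and the kernel side fills an internal struct that is never exposed as UAPI.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical attribute IDs; they stand in for netlink attributes. */
enum demo_attr {
	DEMO_ATTR_UNSPEC,
	DEMO_ATTR_MAC,		/* 6 bytes */
	DEMO_ATTR_MTU,		/* u16 */
	DEMO_ATTR_MAX,
};

/* One received attribute: type, payload length, payload. */
struct demo_attr_val {
	uint16_t type;
	uint16_t len;
	const void *data;
};

/* Internal config ("struct_foo" in the discussion); not UAPI. */
struct demo_dev_config {
	bool has_mac;
	bool has_mtu;
	uint8_t mac[6];
	uint16_t mtu;
};

/*
 * Fill the internal config from whichever attributes are present.
 * In this sketch, unknown attribute types are skipped.
 * Returns 0 on success, -1 on a malformed attribute.
 */
static int demo_attrs_to_config(const struct demo_attr_val *attrs, size_t n,
				struct demo_dev_config *cfg)
{
	size_t i;

	memset(cfg, 0, sizeof(*cfg));
	for (i = 0; i < n; i++) {
		switch (attrs[i].type) {
		case DEMO_ATTR_MAC:
			if (attrs[i].len != sizeof(cfg->mac))
				return -1;
			memcpy(cfg->mac, attrs[i].data, sizeof(cfg->mac));
			cfg->has_mac = true;
			break;
		case DEMO_ATTR_MTU:
			if (attrs[i].len != sizeof(cfg->mtu))
				return -1;
			memcpy(&cfg->mtu, attrs[i].data, sizeof(cfg->mtu));
			cfg->has_mtu = true;
			break;
		default:
			break;	/* attribute this parser does not know */
		}
	}
	return 0;
}
```

Adding a new field in this scheme means adding one attribute ID and one case; the layout of the internal struct can change freely because userspace never sees it.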
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


* Re: [PATCH linux-next v3 2/6] vdpa: Introduce query of device config layout
  2021-07-06 17:07                                             ` Parav Pandit
@ 2021-07-07  4:03                                               ` Jason Wang
  0 siblings, 0 replies; 62+ messages in thread
From: Jason Wang @ 2021-07-07  4:03 UTC (permalink / raw)
  To: Parav Pandit, virtualization, mst; +Cc: Eli Cohen


On 2021/7/7 at 1:07 AM, Parav Pandit wrote:
>
>> From: Jason Wang <jasowang@redhat.com>
>> Sent: Monday, July 5, 2021 10:05 AM
>>
>> On 2021/7/2 at 2:04 PM, Parav Pandit wrote:
>>>> From: Jason Wang <jasowang@redhat.com>
>>>> Sent: Thursday, July 1, 2021 1:13 PM
>>>>
>>>>
>>>> Actually it depends on what attributes is required for building the config.
>>>>
>>>> We can simply reuse the existing virtio_net_config, if most of the
>>>> fields are required.
>>>>
>>>> struct virtio_net_config_set {
>>>>            __virtio64 features;
>>>>            union {
>>>>                struct virtio_net_config;
>>>>                __virtio64 reserved[64];
>>>>            }
>>>> };
>>>>
>>>> If only few of the is required, we can just pick them and use another
>>>> structure.
>>> The point is we define structure based on current fields. Tomorrow a new RSS or rx scaling scheme appears, and structure size might need change.
>>> And it demands us to go back to length based typecasting code.
>>> and to avoid some length check we pick some arbitrary size reserved words.
>>> And I do not know what network research group will come up for new rss algorithm and needed plumbing.
>>
>>
>> Yes, but as discussed, we may suffer the similar issue at the device level. E.g
>> we need a command to let PF to "build" the config for a VF or SF.
> I am not sure.
> Current scope of a VDPA is, once there is a has PF,VF,SF and you configure or create a vdpa device out of it.
>
>>> Given the device config is not spelled out in the virtio spec, may be we can wait for it to define virtio management interface.
>>
>> Yes.
> Wait is needed only if we want to cast U->K UAPI in a structure which is bound to evolve.
> And hence I just want to exchange as individual fields.
>
>>>> It's not arbitrary but with fixed length.
>>> Its fixed, but decided arbitrarily large in anticipation that we likely need to grow.
>>> And sometimes that fall short when next research comes up with more creative thoughts.
>>
>>
>> How about something like TLVs in the virtio spec then?
> Possibly yes.
>>
>>>> It may only work for netlink (with some duplication with the existing
>>>> virtio uAPI). If we can solve it at general virtio layer, it would be
>>>> better. Otherwise we need to invent them again in the virtio spec.
>>>>
>>> Virtio spec will likely define what should be config fields to program and its layout.
>>> Kernel can always fill up the format that virtio spec demands.
>>
>> Yes, I wonder if you have the interest to work on the spec to support this.
>>
> I am happy to contribute, I need to ask my supervisor to spend some time in this area.
> Let me figure out the logistics.


Good to know that.


>
>>>> I think even for the current mlx5e vDPA it would be better, otherwise we
>>>> may have:
>>>>
>>>> vDPA tool -> [netlink specific vDPA attributes(1)] -> vDPA core -> [vDPA
>>>> core specific VDPA attributes(2)] -> mlx5e_vDPA -> [mlx5e specific vDPA
>>>> attributes(3)] -> mlx5e_core
>>>>
>>>> We need to use a single and unified virtio structure in all the (1), (2)
>>>> and (3).
>>> This is where I differ.
>>> Its only vdpa tool -> vdpa core -> vendor_driver
>>>
>>> Vdpa tool -> vdpa core = netlink attribute
>>> Vdpa core -> vendor driver = struct_foo. (internal inside the linux kernel)
>>>
>>> If tomorrow virtio spec defines struct_foo to be something else, kernel can always upgrade to struct_bar without upgrading UAPI netlink attributes.
>>
>>
>> That's fine. Note that actually have an extra level if vendor_driver is
>> virtio-pci vDPA driver (vp_vdpa).
>>
>> Then we have
>>
>> vdpa tool -> vdpa core -> vp_vdpa -> virtio-pci device
>>
>> So we still need invent commands to configure/build VF/SF config space
>> between vp_vdpa and virtio-pci device.
> Yes. This is needed, but again lets keep the two layers separate.
> In the example I provided, we will be able to fill the structure and pass this internally between vp_vdpa->virtio pci driver.
>
>
>> And I think we may suffer the
>> similar issue as we met here (vdpa tool -> vdpa core).
>>
>>
>>> Netlink attributes addition will be needed only when struct_foo has newer fields.
>>> This will be still forward/backward compatible.
>>>
>>> An exact example of this is drivers/net/vxlan.c
>>> vxlan_nl2conf().
>>> A vxlan device needs VNI, src ip, dst ip, tos, and more.
>>> Instead of putting all in single structure vxlan_config as UAPI, those optional fields are netlink attributes.
>>> And vxlan driver internally fills up the config structure.
>>>
>>> I am very much convinced with the above vxlan approach that enables all functionality needed without typecasting code and without defining arbitrary length structs.
>>
>>
>> Right, but we had some small differences here:
>>
>> 1) vxlan doesn't have a existing uAPI
>> 2) vxlan configuration is not used for hardware
>>
> True but vxlan example doesn’t prevent to do #2.
>
>> Basically, I'm not against this approach, I just wonder if it's
>> better/simpler to solve it at virtio layer because the semantic is
>> defined by the spec not netlink.
> vdpa core will be able to use the virtio spec defined config whenever it occurs.


So I think both of us have strong points. Maybe it's time for
Michael to decide how it will go.

Michael, please share your thoughts here.

Thanks
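
For reference, the TLV idea raised earlier in this exchange could look roughly like the sketch below. This is only an illustration under stated assumptions: the record layout (2-byte type, 2-byte length, little-endian) and every name (demo_tlv_type, demo_tlv_put, demo_tlv_find) are hypothetical, and the virtio spec defines no such format at this point.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical field types; not defined by the virtio spec. */
enum demo_tlv_type {
	DEMO_TLV_MAC = 1,
	DEMO_TLV_MTU = 2,
};

/*
 * Append one record: 2-byte type, 2-byte length, then the value.
 * Returns the new write offset, or 0 if the record does not fit.
 */
static size_t demo_tlv_put(uint8_t *buf, size_t cap, size_t off,
			   uint16_t type, const void *val, uint16_t len)
{
	if (off > cap || cap - off < 4u + len)
		return 0;
	buf[off + 0] = (uint8_t)(type & 0xff);	/* little-endian header */
	buf[off + 1] = (uint8_t)(type >> 8);
	buf[off + 2] = (uint8_t)(len & 0xff);
	buf[off + 3] = (uint8_t)(len >> 8);
	memcpy(buf + off + 4, val, len);
	return off + 4 + len;
}

/*
 * Find the first record of the given type. Returns a pointer to its
 * value and stores the length in *len, or NULL if absent or truncated.
 */
static const uint8_t *demo_tlv_find(const uint8_t *buf, size_t size,
				    uint16_t type, uint16_t *len)
{
	size_t off = 0;

	while (size - off >= 4) {
		uint16_t t = (uint16_t)(buf[off] | (buf[off + 1] << 8));
		uint16_t l = (uint16_t)(buf[off + 2] | (buf[off + 3] << 8));

		if (size - off - 4 < l)
			return NULL;	/* truncated record */
		if (t == type) {
			*len = l;
			return buf + off + 4;
		}
		off += 4u + l;	/* skip a record we are not looking for */
	}
	return NULL;
}
```

A reader that does not know a record type can skip it by its length, which is what would let such a format grow without reserving arbitrary space up front.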



* Re: [PATCH linux-next v3 0/6] vdpa: enable user to set mac, mtu
  2021-06-16 19:11 [PATCH linux-next v3 0/6] vdpa: enable user to set mac, mtu Parav Pandit
                   ` (5 preceding siblings ...)
  2021-06-16 19:11 ` [PATCH linux-next v3 6/6] vdpa/mlx5: Forward only packets with allowed MAC address Parav Pandit
@ 2021-08-05  9:57 ` Michael S. Tsirkin
  2021-08-05 10:13   ` Parav Pandit via Virtualization
  2021-08-06  2:50   ` Jason Wang
  6 siblings, 2 replies; 62+ messages in thread
From: Michael S. Tsirkin @ 2021-08-05  9:57 UTC (permalink / raw)
  To: Parav Pandit; +Cc: elic, virtualization

On Wed, Jun 16, 2021 at 10:11:49PM +0300, Parav Pandit wrote:
> Currently user cannot set the mac address and mtu of the vdpa device.
> This patchset enables users to set the mac address and mtu of the vdpa
> device once the device is created.
> If a vendor driver supports such configuration user can set it otherwise
> user gets unsupported error.

This makes sense to me overall. People are used to
using netlink to set these parameters, and virtio does
not necessarily have a way to set all device
parameters - they can be RO in the config space.


> vdpa mac address and mtu are device configuration layout fields.
> To keep interface generic enough for multiple types of vdpa devices, mac
> address and mtu setting is implemented as configuration layout config
> knobs.
> This enables to use similar config layout for other virtio devices.
> 
> An example of query & set of config layout fields for vdpa_sim_net
> driver:
> 
> Configuration layout fields are set after device is created.
> This enables user to change such fields at later point without destroying and
> recreating the device for new config.
> 
> $ vdpa mgmtdev show
> vdpasim_net:
>   supported_classes net
> 
> Add the device:
> $ vdpa dev add name bar mgmtdev vdpasim_net
> 
> Configure mac address and mtu:
> $ vdpa dev config set bar mac 00:11:22:33:44:55 mtu 9000
> 
> In above command only mac address or only mtu can also be set.
> 
> View the config after setting:
> $ vdpa dev config show
> bar: mac 00:11:22:33:44:55 link up link_announce false mtu 9000 speed 0 duplex 0
> 
> Patch summary:
> Patch-1 introduced and use helpers for get/set config area
> Patch-2 implement query device config layout
> Patch-3 enanble user to set mac and mtu in config space
> Patch-4 vdpa_sim_net implements get and set of config layout
> Patch-5 mlx5 vdpa driver supports user provided mac config
> Patch-6 mlx5 vdpa driver uses user provided mac during rx flow steering
> 
> changelog:
> v2->v3:
>  - dropped patches which are merged
>  - simplified code to handle non transitional devices
> 
> v1->v2:
>  - new patches to fix kdoc comment to add new kdoc section
>  - new patch to have synchronized access to features and config space
>  - read whole net config layout instead of individual fields
>  - added error extack for unmanaged vdpa device
>  - fixed several endianness issues
>  - introduced vdpa device ops for get config which is synchronized
>    with other get/set features ops and config ops
>  - fixed mtu range checking for max
>  - using NLA_POLICY_ETH_ADDR
>  - set config moved to device ops instead of mgmtdev ops
>  - merged build and set to single routine
>  - ensuring that user has NET_ADMIN capability for configuring network
>    attributes
>  - using updated interface and callbacks for get/set config
>  - following new api for config get/set for mgmt tool in mlx5 vdpa
>    driver
>  - fixes for accessing right SF dma device and bar address
>  - fix for mtu calculation
>  - fix for bit access in features
>  - fix for index restore with suspend/resume operation
> 
> 
> Eli Cohen (2):
>   vdpa/mlx5: Support configuration of MAC
>   vdpa/mlx5: Forward only packets with allowed MAC address
> 
> Parav Pandit (4):
>   vdpa: Introduce and use vdpa device get, set config helpers
>   vdpa: Introduce query of device config layout
>   vdpa: Enable user to set mac and mtu of vdpa device
>   vdpa_sim_net: Enable user to set mac address and mtu
> 
>  drivers/vdpa/mlx5/net/mlx5_vnet.c    | 101 ++++++--
>  drivers/vdpa/vdpa.c                  | 337 +++++++++++++++++++++++++++
>  drivers/vdpa/vdpa_sim/vdpa_sim.c     |  13 ++
>  drivers/vdpa/vdpa_sim/vdpa_sim.h     |   2 +
>  drivers/vdpa/vdpa_sim/vdpa_sim_net.c |  34 +--
>  drivers/vhost/vdpa.c                 |   3 +-
>  include/linux/vdpa.h                 |  38 +--
>  include/uapi/linux/vdpa.h            |  12 +
>  8 files changed, 490 insertions(+), 50 deletions(-)
> 
> -- 
> 2.26.2


* RE: [PATCH linux-next v3 0/6] vdpa: enable user to set mac, mtu
  2021-08-05  9:57 ` [PATCH linux-next v3 0/6] vdpa: enable user to set mac, mtu Michael S. Tsirkin
@ 2021-08-05 10:13   ` Parav Pandit via Virtualization
  2021-08-05 12:05     ` Michael S. Tsirkin
  2021-08-06  2:50   ` Jason Wang
  1 sibling, 1 reply; 62+ messages in thread
From: Parav Pandit via Virtualization @ 2021-08-05 10:13 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: Eli Cohen, virtualization


> From: Michael S. Tsirkin <mst@redhat.com>
> Sent: Thursday, August 5, 2021 3:28 PM
> 
> On Wed, Jun 16, 2021 at 10:11:49PM +0300, Parav Pandit wrote:
> > Currently user cannot set the mac address and mtu of the vdpa device.
> > This patchset enables users to set the mac address and mtu of the vdpa
> > device once the device is created.
> > If a vendor driver supports such configuration user can set it
> > otherwise user gets unsupported error.
> 
> This makes sense to me overall. People are used to use netlink to set these
> parameters, and virtio does not necessarily have a way to set all device
> parameters - they can be RO in the config space.

Yes. With this series, the fields stay RO when the driver doesn't support setting them.
When the driver supports it, they are RW.

Do I need to rebase + resend?
Please let me know.
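
A minimal sketch of that RO/RW behaviour, assuming a simplified ops table: the names here (demo_vdpa_ops, demo_vdpa_dev, demo_dev_set_mtu, demo_sim_set_mtu) are hypothetical stand-ins and not the actual vdpa callbacks from the series. The core returns an unsupported error unless the vendor driver provides the optional callback.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the vdpa device and its ops. */
struct demo_vdpa_dev;

struct demo_vdpa_ops {
	/* Optional: left NULL by a driver that cannot change the MTU. */
	int (*set_mtu)(struct demo_vdpa_dev *dev, uint16_t mtu);
};

struct demo_vdpa_dev {
	const struct demo_vdpa_ops *ops;
	uint16_t mtu;
};

/* Core-side entry point: read-only unless the driver has the callback. */
static int demo_dev_set_mtu(struct demo_vdpa_dev *dev, uint16_t mtu)
{
	if (!dev->ops || !dev->ops->set_mtu)
		return -EOPNOTSUPP;
	return dev->ops->set_mtu(dev, mtu);
}

/* Example driver callback that simply accepts the value. */
static int demo_sim_set_mtu(struct demo_vdpa_dev *dev, uint16_t mtu)
{
	dev->mtu = mtu;
	return 0;
}
```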

* Re: [PATCH linux-next v3 0/6] vdpa: enable user to set mac, mtu
  2021-08-05 10:13   ` Parav Pandit via Virtualization
@ 2021-08-05 12:05     ` Michael S. Tsirkin
  0 siblings, 0 replies; 62+ messages in thread
From: Michael S. Tsirkin @ 2021-08-05 12:05 UTC (permalink / raw)
  To: Parav Pandit; +Cc: Eli Cohen, virtualization

On Thu, Aug 05, 2021 at 10:13:11AM +0000, Parav Pandit wrote:
> 
> > From: Michael S. Tsirkin <mst@redhat.com>
> > Sent: Thursday, August 5, 2021 3:28 PM
> > 
> > On Wed, Jun 16, 2021 at 10:11:49PM +0300, Parav Pandit wrote:
> > > Currently user cannot set the mac address and mtu of the vdpa device.
> > > This patchset enables users to set the mac address and mtu of the vdpa
> > > device once the device is created.
> > > If a vendor driver supports such configuration user can set it
> > > otherwise user gets unsupported error.
> > 
> > This makes sense to me overall. People are used to use netlink to set these
> > parameters, and virtio does not necessarily have a way to set all device
> > parameters - they can be RO in the config space.
> 
> Yes. this series enables it to RO when driver doesn't support setting it.
> When driver supports it, it is RW.
> 
> Do I need to rebase + resend?
> Please let me know.

Can't hurt.

-- 
MST


* Re: [PATCH linux-next v3 0/6] vdpa: enable user to set mac, mtu
  2021-08-05  9:57 ` [PATCH linux-next v3 0/6] vdpa: enable user to set mac, mtu Michael S. Tsirkin
  2021-08-05 10:13   ` Parav Pandit via Virtualization
@ 2021-08-06  2:50   ` Jason Wang
  2021-08-06  8:42     ` Michael S. Tsirkin
  1 sibling, 1 reply; 62+ messages in thread
From: Jason Wang @ 2021-08-06  2:50 UTC (permalink / raw)
  To: Michael S. Tsirkin, Parav Pandit; +Cc: elic, virtualization


On 2021/8/5 at 5:57 PM, Michael S. Tsirkin wrote:
> On Wed, Jun 16, 2021 at 10:11:49PM +0300, Parav Pandit wrote:
>> Currently user cannot set the mac address and mtu of the vdpa device.
>> This patchset enables users to set the mac address and mtu of the vdpa
>> device once the device is created.
>> If a vendor driver supports such configuration user can set it otherwise
>> user gets unsupported error.
> This makes sense to me overall. People are used to
> use netlink to set these parameters, and virtio does
> not necessarily have a way to set all device
> parameters - they can be RO in the config space.


I don't get this; we need to care about RO fields as well (e.g. max_virtqueue_pairs).

And do we really want netlink uAPI for virtio like:

  enum vdpa_attr {
@@ -33,6 +34,16 @@ enum vdpa_attr {
  	VDPA_ATTR_DEV_MAX_VQS,			/* u32 */
  	VDPA_ATTR_DEV_MAX_VQ_SIZE,		/* u16 */
  
+	VDPA_ATTR_DEV_NET_CFG_MACADDR,		/* binary */
+	VDPA_ATTR_DEV_NET_STATUS,		/* u8 */
+	VDPA_ATTR_DEV_NET_CFG_MAX_VQP,		/* u16 */
+	VDPA_ATTR_DEV_NET_CFG_MTU,		/* u16 */
+	VDPA_ATTR_DEV_NET_CFG_SPEED,		/* u16 */
+	VDPA_ATTR_DEV_NET_CFG_DUPLEX,		/* u16 */
+	VDPA_ATTR_DEV_NET_CFG_RSS_MAX_KEY_LEN,	/* u8 */
+	VDPA_ATTR_DEV_NET_CFG_RSS_MAX_IT_LEN,	/* u16 */
+	VDPA_ATTR_DEV_NET_CFG_RSS_HASH_TYPES,	/* u32 */
+
  	/* new attributes must be added above here */
  	VDPA_ATTR_MAX,
  };

Or virtio uAPI and make netlink a transport?

I prefer the latter since we will meet a similar issue at the hardware
level when we want to create and provision virtio devices dynamically.

Thanks


>
>


* Re: [PATCH linux-next v3 0/6] vdpa: enable user to set mac, mtu
  2021-08-06  2:50   ` Jason Wang
@ 2021-08-06  8:42     ` Michael S. Tsirkin
  2021-08-06  8:55       ` Parav Pandit via Virtualization
  0 siblings, 1 reply; 62+ messages in thread
From: Michael S. Tsirkin @ 2021-08-06  8:42 UTC (permalink / raw)
  To: Jason Wang; +Cc: elic, virtualization

On Fri, Aug 06, 2021 at 10:50:27AM +0800, Jason Wang wrote:
> 
> On 2021/8/5 at 5:57 PM, Michael S. Tsirkin wrote:
> > On Wed, Jun 16, 2021 at 10:11:49PM +0300, Parav Pandit wrote:
> > > Currently user cannot set the mac address and mtu of the vdpa device.
> > > This patchset enables users to set the mac address and mtu of the vdpa
> > > device once the device is created.
> > > If a vendor driver supports such configuration user can set it otherwise
> > > user gets unsupported error.
> > This makes sense to me overall. People are used to
> > use netlink to set these parameters, and virtio does
> > not necessarily have a way to set all device
> > parameters - they can be RO in the config space.
> 
> 
> I don't get here, we need to care RO as well (e.g the max_virtqueue_pairs).

The point I tried to make is that a virtio transport will not allow writing
max_virtqueue_pairs, but when managing virtio VFs from a PF we do need
to set it.  Thus virtio devices need a new set of interfaces for
managing them; it is not just a virtio transport.

> And do we really want netlink uAPI for virtio like:
> 
>  enum vdpa_attr {
> @@ -33,6 +34,16 @@ enum vdpa_attr {
>  	VDPA_ATTR_DEV_MAX_VQS,			/* u32 */
>  	VDPA_ATTR_DEV_MAX_VQ_SIZE,		/* u16 */
> +	VDPA_ATTR_DEV_NET_CFG_MACADDR,		/* binary */
> +	VDPA_ATTR_DEV_NET_STATUS,		/* u8 */
> +	VDPA_ATTR_DEV_NET_CFG_MAX_VQP,		/* u16 */
> +	VDPA_ATTR_DEV_NET_CFG_MTU,		/* u16 */
> +	VDPA_ATTR_DEV_NET_CFG_SPEED,		/* u16 */
> +	VDPA_ATTR_DEV_NET_CFG_DUPLEX,		/* u16 */
> +	VDPA_ATTR_DEV_NET_CFG_RSS_MAX_KEY_LEN,	/* u8 */
> +	VDPA_ATTR_DEV_NET_CFG_RSS_MAX_IT_LEN,	/* u16 */
> +	VDPA_ATTR_DEV_NET_CFG_RSS_HASH_TYPES,	/* u32 */
> +
>  	/* new attributes must be added above here */
>  	VDPA_ATTR_MAX,
>  };

The point is to try not to reinvent a dedicated vdpa interface
where a generic one exists.
E.g. for phy things such as mac speed etc, I think most people are using
ethtool, right?

> Or virtio uAPI and make netlink a transport?
> 
> I prefer the latter since we will meet the similar issue at the hardware
> level when we want to create and provision virtio device dynamically.
> 
> Thanks

Creating devices dynamically already exists with e.g. vxlan.
That uses IFLA_MTU, IFLA_ADDRESS, etc.


> 
> > 
> > 


* RE: [PATCH linux-next v3 0/6] vdpa: enable user to set mac, mtu
  2021-08-06  8:42     ` Michael S. Tsirkin
@ 2021-08-06  8:55       ` Parav Pandit via Virtualization
  2021-08-09  3:07         ` Jason Wang
  2021-08-09  9:40         ` Michael S. Tsirkin
  0 siblings, 2 replies; 62+ messages in thread
From: Parav Pandit via Virtualization @ 2021-08-06  8:55 UTC (permalink / raw)
  To: Michael S. Tsirkin, Jason Wang; +Cc: Eli Cohen, virtualization



> From: Michael S. Tsirkin <mst@redhat.com>
> Sent: Friday, August 6, 2021 2:12 PM


> >  enum vdpa_attr {
> > @@ -33,6 +34,16 @@ enum vdpa_attr {
> >  	VDPA_ATTR_DEV_MAX_VQS,			/* u32 */
> >  	VDPA_ATTR_DEV_MAX_VQ_SIZE,		/* u16 */
> > +	VDPA_ATTR_DEV_NET_CFG_MACADDR,		/* binary */
> > +	VDPA_ATTR_DEV_NET_STATUS,		/* u8 */
> > +	VDPA_ATTR_DEV_NET_CFG_MAX_VQP,		/* u16 */
> > +	VDPA_ATTR_DEV_NET_CFG_MTU,		/* u16 */
> > +	VDPA_ATTR_DEV_NET_CFG_SPEED,		/* u16 */
> > +	VDPA_ATTR_DEV_NET_CFG_DUPLEX,		/* u16 */
> > +	VDPA_ATTR_DEV_NET_CFG_RSS_MAX_KEY_LEN,	/* u8 */
> > +	VDPA_ATTR_DEV_NET_CFG_RSS_MAX_IT_LEN,	/* u16 */
> > +	VDPA_ATTR_DEV_NET_CFG_RSS_HASH_TYPES,	/* u32 */
> > +
> >  	/* new attributes must be added above here */
> >  	VDPA_ATTR_MAX,
> >  };
> 
> The point is to try and not reinvent a dedicated vpda interface where a
> generic one exits.
> E.g. for phy things such as mac speed etc, I think most people are using
> ethtool things right?

As you know, vdpa is the backend device for the front-end netdevice accessed by ethtool.
The vdpa management tool here is composing the vdpa device.

For example, the creator (hypervisor) of the vdpa device knows that a guest VM is given 4 vcpus,
so the hypervisor creates a vdpa device with a config space layout of
max_virtqueue_pairs = 4,
and with the MAC address chosen by the hypervisor in mac[6].

Guest VM ethtool can still choose to use fewer channels.

Typically,
ethtool is for the guest VM.
The vdpa device is in the hypervisor.

How can the hypervisor compose a vdpa device without any tool?
How can it tell ethtool what is supported and what the defaults are?

I must be misunderstanding your comment about ethtool.
Can you please explain?
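
To illustrate the split described above, here is a small sketch with hypothetical names (demo_provisioned_cfg, demo_guest_pick_pairs); it is not code from the series. The hypervisor fixes the upper bound when it composes the device, and the guest may later choose any channel count within that bound.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical subset of what the hypervisor provisions at creation. */
struct demo_provisioned_cfg {
	uint8_t mac[6];
	uint16_t max_virtqueue_pairs;	/* upper bound set by the hypervisor */
};

/*
 * Guest-side choice (for example through ethtool channels): any value
 * from 1 up to the provisioned maximum is acceptable.
 * Returns the pair count to use, or 0 if the request is invalid.
 */
static uint16_t demo_guest_pick_pairs(const struct demo_provisioned_cfg *cfg,
				      uint16_t requested)
{
	if (requested == 0 || requested > cfg->max_virtqueue_pairs)
		return 0;
	return requested;
}
```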


* Re: [PATCH linux-next v3 0/6] vdpa: enable user to set mac, mtu
  2021-08-06  8:55       ` Parav Pandit via Virtualization
@ 2021-08-09  3:07         ` Jason Wang
  2021-08-09  3:13           ` Parav Pandit via Virtualization
       [not found]           ` <20210809052121.GA209158@mtl-vdi-166.wap.labs.mlnx>
  2021-08-09  9:40         ` Michael S. Tsirkin
  1 sibling, 2 replies; 62+ messages in thread
From: Jason Wang @ 2021-08-09  3:07 UTC (permalink / raw)
  To: Parav Pandit, Michael S. Tsirkin; +Cc: Eli Cohen, virtualization


On 2021/8/6 at 4:55 PM, Parav Pandit wrote:
>
>> From: Michael S. Tsirkin <mst@redhat.com>
>> Sent: Friday, August 6, 2021 2:12 PM
>
>>>   enum vdpa_attr {
>>> @@ -33,6 +34,16 @@ enum vdpa_attr {
>>>   	VDPA_ATTR_DEV_MAX_VQS,			/* u32 */
>>>   	VDPA_ATTR_DEV_MAX_VQ_SIZE,		/* u16 */
>>> +	VDPA_ATTR_DEV_NET_CFG_MACADDR,		/* binary */
>>> +	VDPA_ATTR_DEV_NET_STATUS,		/* u8 */
>>> +	VDPA_ATTR_DEV_NET_CFG_MAX_VQP,		/* u16 */
>>> +	VDPA_ATTR_DEV_NET_CFG_MTU,		/* u16 */
>>> +	VDPA_ATTR_DEV_NET_CFG_SPEED,		/* u16 */
>>> +	VDPA_ATTR_DEV_NET_CFG_DUPLEX,		/* u16 */
>>> +	VDPA_ATTR_DEV_NET_CFG_RSS_MAX_KEY_LEN,	/* u8 */
>>> +	VDPA_ATTR_DEV_NET_CFG_RSS_MAX_IT_LEN,	/* u16 */
>>> +	VDPA_ATTR_DEV_NET_CFG_RSS_HASH_TYPES,	/* u32 */
>>> +
>>>   	/* new attributes must be added above here */
>>>   	VDPA_ATTR_MAX,
>>>   };
>> The point is to try and not reinvent a dedicated vpda interface where a
>> generic one exits.
>> E.g. for phy things such as mac speed etc, I think most people are using
>> ethtool things right?
> As you know vdpa is the backend device for the front-end netdevice accessed by the ethtool.
> vdpa management tool here is composing the vdpa device.
>
> For example creator (hypervisor) of the vdpa devices knows that a guest VM is given 4 vcpus,
> So hypervisor creates a vdpa devices with config space layout as,
> max_virtqueue_pairs = 4.
> And the MAC address chosen by hypervisor in mac[6].
>
> Guest VM ethtool can still chose to use less number of channels.
>
> Typically,
> ethtool is for guest VM.
> vdpa device is in hypevisor.
>
> How can hypervisor compose a vdpa device without any tool?
> How can it tell ethtool, what is supported and what are the defaults?


Reread the cover letter:

"

This patchset enables users to set the mac address and mtu of the vdpa
device once the device is created.

"

It looks to me like the mechanism introduced in this series is not for
provisioning but for post-creation configuration?


>
> I must be misunderstanding your comment about ethtool.
> Can you please explain?


I guess the meaning is that, if the vDPA device is assigned to a guest, it's the
guest's responsibility to configure the MTU/MAC/RSS via existing management
interfaces like ethtool or the iproute2 netlink protocol. The control virtqueue
is designed for this.

But if it is used for provisioning, that looks like another topic, which
should be done during device creation.

Thanks


>


* RE: [PATCH linux-next v3 0/6] vdpa: enable user to set mac, mtu
  2021-08-09  3:07         ` Jason Wang
@ 2021-08-09  3:13           ` Parav Pandit via Virtualization
  2021-08-09  3:29             ` Jason Wang
       [not found]           ` <20210809052121.GA209158@mtl-vdi-166.wap.labs.mlnx>
  1 sibling, 1 reply; 62+ messages in thread
From: Parav Pandit via Virtualization @ 2021-08-09  3:13 UTC (permalink / raw)
  To: Jason Wang, Michael S. Tsirkin; +Cc: Eli Cohen, virtualization

Hi Jason,

> From: Jason Wang <jasowang@redhat.com>
> Sent: Monday, August 9, 2021 8:38 AM
> > For example creator (hypervisor) of the vdpa devices knows that a
> > guest VM is given 4 vcpus, So hypervisor creates a vdpa devices with
> > config space layout as, max_virtqueue_pairs = 4.
> > And the MAC address chosen by hypervisor in mac[6].
> >
> > Guest VM ethtool can still chose to use less number of channels.
> >
> > Typically,
> > ethtool is for guest VM.
> > vdpa device is in hypevisor.
> >
> > How can hypervisor compose a vdpa device without any tool?
> > How can it tell ethtool, what is supported and what are the defaults?
> 
> 
> Reread the cover letter:
> 
> "
> 
> This patchset enables users to set the mac address and mtu of the vdpa
> device once the device is created.
> 
> "
> 
> It looks to me the mechanism that introduced in the series is not for
> provisioning but for post-creation configuration?
> 
> 
> >
> > I must be misunderstanding your comment about ethtool.
> > Can you please explain?
> 
> 
> I guess the meaning is that, if the vDPA is assigned to guest, it's the
> charge of guest to configure the MTU/MAC/RSS via the existing management
> interface like ethtool/iproute2 netlink protocol. The control virtqueue
> is designed for this.
> 
> But if it was used for provisioning, it looks like another topic which
> should be done during the device creation.

We already discussed and agreed that I should change these params to creation-time params instead of post-creation ones.
We were waiting for Michael to say whether he
(a) is OK with extensible individual params, or
(b) would prefer to see a typecast-based structure blob coming through netlink.

* Re: [PATCH linux-next v3 0/6] vdpa: enable user to set mac, mtu
  2021-08-09  3:13           ` Parav Pandit via Virtualization
@ 2021-08-09  3:29             ` Jason Wang
  0 siblings, 0 replies; 62+ messages in thread
From: Jason Wang @ 2021-08-09  3:29 UTC (permalink / raw)
  To: Parav Pandit, Michael S. Tsirkin; +Cc: Eli Cohen, virtualization


On 2021/8/9 at 11:13 AM, Parav Pandit wrote:
> Hi Jason,
>
>> From: Jason Wang <jasowang@redhat.com>
>> Sent: Monday, August 9, 2021 8:38 AM
>>> For example creator (hypervisor) of the vdpa devices knows that a
>>> guest VM is given 4 vcpus, So hypervisor creates a vdpa devices with
>>> config space layout as, max_virtqueue_pairs = 4.
>>> And the MAC address chosen by hypervisor in mac[6].
>>>
>>> Guest VM ethtool can still chose to use less number of channels.
>>>
>>> Typically,
>>> ethtool is for guest VM.
>>> vdpa device is in hypevisor.
>>>
>>> How can hypervisor compose a vdpa device without any tool?
>>> How can it tell ethtool, what is supported and what are the defaults?
>>
>> Reread the cover letter:
>>
>> "
>>
>> This patchset enables users to set the mac address and mtu of the vdpa
>> device once the device is created.
>>
>> "
>>
>> It looks to me the mechanism that introduced in the series is not for
>> provisioning but for post-creation configuration?
>>
>>
>>> I must be misunderstanding your comment about ethtool.
>>> Can you please explain?
>>
>> I guess the meaning is that, if the vDPA is assigned to guest, it's the
>> charge of guest to configure the MTU/MAC/RSS via the existing management
>> interface like ethtool/iproute2 netlink protocol. The control virtqueue
>> is designed for this.
>>
>> But if it was used for provisioning, it looks like another topic which
>> should be done during the device creation.
> We already discussed and agreed, that I should change these params as creation time params instead post-creation.
> We were waiting for Michael to respond if he is ok with either
> (a) extendible individual param or
> (b) should prefer to see a typecast based structure blob coming through netlink.


Right. So it's for creation time params.

Michael, please advise which way is better.

Thanks



* RE: [PATCH linux-next v3 0/6] vdpa: enable user to set mac, mtu
       [not found]           ` <20210809052121.GA209158@mtl-vdi-166.wap.labs.mlnx>
@ 2021-08-09  5:42             ` Parav Pandit via Virtualization
       [not found]               ` <20210809055748.GA210406@mtl-vdi-166.wap.labs.mlnx>
  0 siblings, 1 reply; 62+ messages in thread
From: Parav Pandit via Virtualization @ 2021-08-09  5:42 UTC (permalink / raw)
  To: Eli Cohen, Jason Wang; +Cc: virtualization, Michael S. Tsirkin



> From: Eli Cohen <elic@nvidia.com>
> Sent: Monday, August 9, 2021 10:51 AM
> 
> On Mon, Aug 09, 2021 at 11:07:50AM +0800, Jason Wang wrote:
> >
> > On 2021/8/6 at 4:55 PM, Parav Pandit wrote:
> > >
> > > > From: Michael S. Tsirkin <mst@redhat.com>
> > > > Sent: Friday, August 6, 2021 2:12 PM
> > >
> > > > >   enum vdpa_attr {
> > > > > @@ -33,6 +34,16 @@ enum vdpa_attr {
> > > > >   	VDPA_ATTR_DEV_MAX_VQS,			/* u32 */
> > > > >   	VDPA_ATTR_DEV_MAX_VQ_SIZE,		/* u16 */
> > > > > +	VDPA_ATTR_DEV_NET_CFG_MACADDR,		/*
> binary */
> > > > > +	VDPA_ATTR_DEV_NET_STATUS,		/* u8 */
> > > > > +	VDPA_ATTR_DEV_NET_CFG_MAX_VQP,		/*
> u16 */
> > > > > +	VDPA_ATTR_DEV_NET_CFG_MTU,		/* u16 */
> > > > > +	VDPA_ATTR_DEV_NET_CFG_SPEED,		/* u16 */
> > > > > +	VDPA_ATTR_DEV_NET_CFG_DUPLEX,		/* u16 */
> > > > > +	VDPA_ATTR_DEV_NET_CFG_RSS_MAX_KEY_LEN,	/* u8
> */
> > > > > +	VDPA_ATTR_DEV_NET_CFG_RSS_MAX_IT_LEN,	/*
> u16 */
> > > > > +	VDPA_ATTR_DEV_NET_CFG_RSS_HASH_TYPES,	/*
> u32 */
> > > > > +
> > > > >   	/* new attributes must be added above here */
> > > > >   	VDPA_ATTR_MAX,
> > > > >   };
> > > > The point is to try and not reinvent a dedicated vpda interface
> > > > where a generic one exits.
> > > > E.g. for phy things such as mac speed etc, I think most people are
> > > > using ethtool things right?
> > > As you know vdpa is the backend device for the front-end netdevice
> accessed by the ethtool.
> > > vdpa management tool here is composing the vdpa device.
> > >
> > > For example creator (hypervisor) of the vdpa devices knows that a
> > > guest VM is given 4 vcpus, So hypervisor creates a vdpa devices with
> > > config space layout as, max_virtqueue_pairs = 4.
> > > And the MAC address chosen by hypervisor in mac[6].
> > >
> > > Guest VM ethtool can still chose to use less number of channels.
> > >
> > > Typically,
> > > ethtool is for guest VM.
> > > vdpa device is in hypevisor.
> > >
> > > How can hypervisor compose a vdpa device without any tool?
> > > How can it tell ethtool, what is supported and what are the defaults?
> >
> >
> > Reread the cover letter:
> >
> > "
> >
> > This patchset enables users to set the mac address and mtu of the vdpa
> > device once the device is created.
> >
> > "
> >
> > It looks to me the mechanism that introduced in the series is not for
> > provisioning but for post-creation configuration?
> >
> 
> The difference is subtle. In both cases you provide configuration.
> 
> 
> >
> > >
> > > I must be misunderstanding your comment about ethtool.
> > > Can you please explain?
> >
> >
> > I guess the meaning is that, if the vDPA is assigned to guest, it's
> > the charge of guest to configure the MTU/MAC/RSS via the existing
> > management interface like ethtool/iproute2 netlink protocol. The
> > control virtqueue is designed for this.
> >
> 
> I was under the impression that we want somehow to control the capablity if
> the guest to use arbitrary MAC addresses.
> If this is is not required than I think control VQ is the mechanism to use.
How does the guest VM know which unique MAC address to set on this virtio-net device when it is the only device in the VM?
Usually the hypervisor knows which MAC to set for a VM.

How do you set up the config space of a vdpa device that is migrating from the source to the destination hypervisor?
Is the config space set up through the QEMU vhost framework?

^ permalink raw reply	[flat|nested] 62+ messages in thread

* RE: [PATCH linux-next v3 0/6] vdpa: enable user to set mac, mtu
       [not found]               ` <20210809055748.GA210406@mtl-vdi-166.wap.labs.mlnx>
@ 2021-08-09  6:01                 ` Parav Pandit via Virtualization
       [not found]                   ` <20210809060746.GA210718@mtl-vdi-166.wap.labs.mlnx>
  0 siblings, 1 reply; 62+ messages in thread
From: Parav Pandit via Virtualization @ 2021-08-09  6:01 UTC (permalink / raw)
  To: Eli Cohen; +Cc: virtualization, Michael S. Tsirkin



> From: Eli Cohen <elic@nvidia.com>
> Sent: Monday, August 9, 2021 11:28 AM
> 
> On Mon, Aug 09, 2021 at 08:42:58AM +0300, Parav Pandit wrote:
> >
> >
> > > From: Eli Cohen <elic@nvidia.com>
> > > Sent: Monday, August 9, 2021 10:51 AM
> > >
> > > On Mon, Aug 09, 2021 at 11:07:50AM +0800, Jason Wang wrote:
> > > >
> > > > On 2021/8/6 4:55 PM, Parav Pandit wrote:
> > > > >
> > > > > > From: Michael S. Tsirkin <mst@redhat.com>
> > > > > > Sent: Friday, August 6, 2021 2:12 PM
> > > > >
> > > > > > >   enum vdpa_attr {
> > > > > > > @@ -33,6 +34,16 @@ enum vdpa_attr {
> > > > > > >   	VDPA_ATTR_DEV_MAX_VQS,			/* u32 */
> > > > > > >   	VDPA_ATTR_DEV_MAX_VQ_SIZE,		/* u16 */
> > > > > > > +	VDPA_ATTR_DEV_NET_CFG_MACADDR,		/*
> > > binary */
> > > > > > > +	VDPA_ATTR_DEV_NET_STATUS,		/* u8 */
> > > > > > > +	VDPA_ATTR_DEV_NET_CFG_MAX_VQP,		/*
> > > u16 */
> > > > > > > +	VDPA_ATTR_DEV_NET_CFG_MTU,		/* u16 */
> > > > > > > +	VDPA_ATTR_DEV_NET_CFG_SPEED,		/* u16 */
> > > > > > > +	VDPA_ATTR_DEV_NET_CFG_DUPLEX,		/* u16 */
> > > > > > > +	VDPA_ATTR_DEV_NET_CFG_RSS_MAX_KEY_LEN,	/* u8
> > > */
> > > > > > > +	VDPA_ATTR_DEV_NET_CFG_RSS_MAX_IT_LEN,	/*
> > > u16 */
> > > > > > > +	VDPA_ATTR_DEV_NET_CFG_RSS_HASH_TYPES,	/*
> > > u32 */
> > > > > > > +
> > > > > > >   	/* new attributes must be added above here */
> > > > > > >   	VDPA_ATTR_MAX,
> > > > > > >   };
> > > > > > The point is to try and not reinvent a dedicated vpda
> > > > > > interface where a generic one exits.
> > > > > > E.g. for phy things such as mac speed etc, I think most people
> > > > > > are using ethtool things right?
> > > > > As you know vdpa is the backend device for the front-end
> > > > > netdevice
> > > accessed by the ethtool.
> > > > > vdpa management tool here is composing the vdpa device.
> > > > >
> > > > > For example creator (hypervisor) of the vdpa devices knows that
> > > > > a guest VM is given 4 vcpus, So hypervisor creates a vdpa
> > > > > devices with config space layout as, max_virtqueue_pairs = 4.
> > > > > And the MAC address chosen by hypervisor in mac[6].
> > > > >
> > > > > Guest VM ethtool can still chose to use less number of channels.
> > > > >
> > > > > Typically,
> > > > > ethtool is for guest VM.
> > > > > vdpa device is in hypevisor.
> > > > >
> > > > > How can hypervisor compose a vdpa device without any tool?
> > > > > How can it tell ethtool, what is supported and what are the defaults?
> > > >
> > > >
> > > > Reread the cover letter:
> > > >
> > > > "
> > > >
> > > > This patchset enables users to set the mac address and mtu of the
> > > > vdpa device once the device is created.
> > > >
> > > > "
> > > >
> > > > It looks to me the mechanism that introduced in the series is not
> > > > for provisioning but for post-creation configuration?
> > > >
> > >
> > > The difference is subtle. In both cases you provide configuration.
> > >
> > >
> > > >
> > > > >
> > > > > I must be misunderstanding your comment about ethtool.
> > > > > Can you please explain?
> > > >
> > > >
> > > > I guess the meaning is that, if the vDPA is assigned to guest,
> > > > it's the charge of guest to configure the MTU/MAC/RSS via the
> > > > existing management interface like ethtool/iproute2 netlink
> > > > protocol. The control virtqueue is designed for this.
> > > >
> > >
> > > I was under the impression that we want somehow to control the
> > > capablity if the guest to use arbitrary MAC addresses.
> > > If this is is not required than I think control VQ is the mechanism to use.
> > How does the guest VM identify which unique mac address to set on this
> virtio net device when this is the only device in the VM?
> > Usually hypervisor knows what mac to set for a VM.
> 
> You don't need to know. You could use any MAC you want, if no one else is
> using it in your subnet, and everything will work fine. The point is do you
> want to allow the guest to choose its MAC. This has implications of security.
> 
Let's assume for a moment that a guest VM is able to program the MAC of the netdevice of a virtio_device of net type.
How does the VM know that a randomly chosen MAC is not already used in the network when the VM doesn’t have any external connectivity?

^ permalink raw reply	[flat|nested] 62+ messages in thread

* RE: [PATCH linux-next v3 0/6] vdpa: enable user to set mac, mtu
       [not found]                   ` <20210809060746.GA210718@mtl-vdi-166.wap.labs.mlnx>
@ 2021-08-09  6:10                     ` Parav Pandit via Virtualization
  2021-08-09  7:05                       ` Jason Wang
  0 siblings, 1 reply; 62+ messages in thread
From: Parav Pandit via Virtualization @ 2021-08-09  6:10 UTC (permalink / raw)
  To: Eli Cohen; +Cc: virtualization, Michael S. Tsirkin



> From: Eli Cohen <elic@nvidia.com>
> Sent: Monday, August 9, 2021 11:38 AM
> 
> On Mon, Aug 09, 2021 at 09:01:48AM +0300, Parav Pandit wrote:
> >
> >
> > > From: Eli Cohen <elic@nvidia.com>
> > > Sent: Monday, August 9, 2021 11:28 AM
> > >
> > > On Mon, Aug 09, 2021 at 08:42:58AM +0300, Parav Pandit wrote:
> > > >
> > > >
> > > > > From: Eli Cohen <elic@nvidia.com>
> > > > > Sent: Monday, August 9, 2021 10:51 AM
> > > > >
> > > > > On Mon, Aug 09, 2021 at 11:07:50AM +0800, Jason Wang wrote:
> > > > > >
> > > > > > On 2021/8/6 4:55 PM, Parav Pandit wrote:
> > > > > > >
> > > > > > > > From: Michael S. Tsirkin <mst@redhat.com>
> > > > > > > > Sent: Friday, August 6, 2021 2:12 PM
> > > > > > >
> > > > > > > > >   enum vdpa_attr {
> > > > > > > > > @@ -33,6 +34,16 @@ enum vdpa_attr {
> > > > > > > > >   	VDPA_ATTR_DEV_MAX_VQS,			/* u32 */
> > > > > > > > >   	VDPA_ATTR_DEV_MAX_VQ_SIZE,		/* u16 */
> > > > > > > > > +	VDPA_ATTR_DEV_NET_CFG_MACADDR,
> 	/*
> > > > > binary */
> > > > > > > > > +	VDPA_ATTR_DEV_NET_STATUS,		/* u8
> */
> > > > > > > > > +	VDPA_ATTR_DEV_NET_CFG_MAX_VQP,
> 	/*
> > > > > u16 */
> > > > > > > > > +	VDPA_ATTR_DEV_NET_CFG_MTU,		/*
> u16 */
> > > > > > > > > +	VDPA_ATTR_DEV_NET_CFG_SPEED,		/*
> u16 */
> > > > > > > > > +	VDPA_ATTR_DEV_NET_CFG_DUPLEX,		/*
> u16 */
> > > > > > > > > +	VDPA_ATTR_DEV_NET_CFG_RSS_MAX_KEY_LEN,
> 	/* u8
> > > > > */
> > > > > > > > > +	VDPA_ATTR_DEV_NET_CFG_RSS_MAX_IT_LEN,
> 	/*
> > > > > u16 */
> > > > > > > > > +	VDPA_ATTR_DEV_NET_CFG_RSS_HASH_TYPES,
> 	/*
> > > > > u32 */
> > > > > > > > > +
> > > > > > > > >   	/* new attributes must be added above here */
> > > > > > > > >   	VDPA_ATTR_MAX,
> > > > > > > > >   };
> > > > > > > > The point is to try and not reinvent a dedicated vpda
> > > > > > > > interface where a generic one exits.
> > > > > > > > E.g. for phy things such as mac speed etc, I think most
> > > > > > > > people are using ethtool things right?
> > > > > > > As you know vdpa is the backend device for the front-end
> > > > > > > netdevice
> > > > > accessed by the ethtool.
> > > > > > > vdpa management tool here is composing the vdpa device.
> > > > > > >
> > > > > > > For example creator (hypervisor) of the vdpa devices knows
> > > > > > > that a guest VM is given 4 vcpus, So hypervisor creates a
> > > > > > > vdpa devices with config space layout as, max_virtqueue_pairs =
> 4.
> > > > > > > And the MAC address chosen by hypervisor in mac[6].
> > > > > > >
> > > > > > > Guest VM ethtool can still chose to use less number of channels.
> > > > > > >
> > > > > > > Typically,
> > > > > > > ethtool is for guest VM.
> > > > > > > vdpa device is in hypevisor.
> > > > > > >
> > > > > > > How can hypervisor compose a vdpa device without any tool?
> > > > > > > How can it tell ethtool, what is supported and what are the
> defaults?
> > > > > >
> > > > > >
> > > > > > Reread the cover letter:
> > > > > >
> > > > > > "
> > > > > >
> > > > > > This patchset enables users to set the mac address and mtu of
> > > > > > the vdpa device once the device is created.
> > > > > >
> > > > > > "
> > > > > >
> > > > > > It looks to me the mechanism that introduced in the series is
> > > > > > not for provisioning but for post-creation configuration?
> > > > > >
> > > > >
> > > > > The difference is subtle. In both cases you provide configuration.
> > > > >
> > > > >
> > > > > >
> > > > > > >
> > > > > > > I must be misunderstanding your comment about ethtool.
> > > > > > > Can you please explain?
> > > > > >
> > > > > >
> > > > > > I guess the meaning is that, if the vDPA is assigned to guest,
> > > > > > it's the charge of guest to configure the MTU/MAC/RSS via the
> > > > > > existing management interface like ethtool/iproute2 netlink
> > > > > > protocol. The control virtqueue is designed for this.
> > > > > >
> > > > >
> > > > > I was under the impression that we want somehow to control the
> > > > > capablity if the guest to use arbitrary MAC addresses.
> > > > > If this is is not required than I think control VQ is the mechanism to
> use.
> > > > How does the guest VM identify which unique mac address to set on
> > > > this
> > > virtio net device when this is the only device in the VM?
> > > > Usually hypervisor knows what mac to set for a VM.
> > >
> > > You don't need to know. You could use any MAC you want, if no one
> > > else is using it in your subnet, and everything will work fine. The
> > > point is do you want to allow the guest to choose its MAC. This has
> implications of security.
> > >
> > Lets assume for a moment that a guest VM is able to program a MAC of
> netdevice of virtio_device of net type.
> > How does a VM know that a randomly chosen mac is not used in network
> when this VM doesn’t have any external connectivity?
> 
> There's no gurantee 
Hence a VF user relies on the hypervisor to set up a unique MAC in the network.

An advanced guest VM whose device is capable of modifying the MAC (for bonding and other uses) can override that MAC.

So I don’t see them as mutually exclusive capabilities.

> but it's being used all over and Linux has a specific API to
> generate random MAC addresses: eth_hw_addr_random().

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH linux-next v3 0/6] vdpa: enable user to set mac, mtu
  2021-08-09  6:10                     ` Parav Pandit via Virtualization
@ 2021-08-09  7:05                       ` Jason Wang
  2021-08-16 20:51                         ` Michael S. Tsirkin
  0 siblings, 1 reply; 62+ messages in thread
From: Jason Wang @ 2021-08-09  7:05 UTC (permalink / raw)
  To: Parav Pandit; +Cc: Eli Cohen, virtualization, Michael S. Tsirkin

On Mon, Aug 9, 2021 at 2:10 PM Parav Pandit <parav@nvidia.com> wrote:
>
>
>
> > From: Eli Cohen <elic@nvidia.com>
> > Sent: Monday, August 9, 2021 11:38 AM
> >
> > On Mon, Aug 09, 2021 at 09:01:48AM +0300, Parav Pandit wrote:
> > >
> > >
> > > > From: Eli Cohen <elic@nvidia.com>
> > > > Sent: Monday, August 9, 2021 11:28 AM
> > > >
> > > > On Mon, Aug 09, 2021 at 08:42:58AM +0300, Parav Pandit wrote:
> > > > >
> > > > >
> > > > > > From: Eli Cohen <elic@nvidia.com>
> > > > > > Sent: Monday, August 9, 2021 10:51 AM
> > > > > >
> > > > > > On Mon, Aug 09, 2021 at 11:07:50AM +0800, Jason Wang wrote:
> > > > > > >
> > > > > > > > On 2021/8/6 4:55 PM, Parav Pandit wrote:
> > > > > > > >
> > > > > > > > > From: Michael S. Tsirkin <mst@redhat.com>
> > > > > > > > > Sent: Friday, August 6, 2021 2:12 PM
> > > > > > > >
> > > > > > > > > >   enum vdpa_attr {
> > > > > > > > > > @@ -33,6 +34,16 @@ enum vdpa_attr {
> > > > > > > > > >       VDPA_ATTR_DEV_MAX_VQS,                  /* u32 */
> > > > > > > > > >       VDPA_ATTR_DEV_MAX_VQ_SIZE,              /* u16 */
> > > > > > > > > > +     VDPA_ATTR_DEV_NET_CFG_MACADDR,
> >       /*
> > > > > > binary */
> > > > > > > > > > +     VDPA_ATTR_DEV_NET_STATUS,               /* u8
> > */
> > > > > > > > > > +     VDPA_ATTR_DEV_NET_CFG_MAX_VQP,
> >       /*
> > > > > > u16 */
> > > > > > > > > > +     VDPA_ATTR_DEV_NET_CFG_MTU,              /*
> > u16 */
> > > > > > > > > > +     VDPA_ATTR_DEV_NET_CFG_SPEED,            /*
> > u16 */
> > > > > > > > > > +     VDPA_ATTR_DEV_NET_CFG_DUPLEX,           /*
> > u16 */
> > > > > > > > > > +     VDPA_ATTR_DEV_NET_CFG_RSS_MAX_KEY_LEN,
> >       /* u8
> > > > > > */
> > > > > > > > > > +     VDPA_ATTR_DEV_NET_CFG_RSS_MAX_IT_LEN,
> >       /*
> > > > > > u16 */
> > > > > > > > > > +     VDPA_ATTR_DEV_NET_CFG_RSS_HASH_TYPES,
> >       /*
> > > > > > u32 */
> > > > > > > > > > +
> > > > > > > > > >       /* new attributes must be added above here */
> > > > > > > > > >       VDPA_ATTR_MAX,
> > > > > > > > > >   };
> > > > > > > > > The point is to try and not reinvent a dedicated vpda
> > > > > > > > > interface where a generic one exits.
> > > > > > > > > E.g. for phy things such as mac speed etc, I think most
> > > > > > > > > people are using ethtool things right?
> > > > > > > > As you know vdpa is the backend device for the front-end
> > > > > > > > netdevice
> > > > > > accessed by the ethtool.
> > > > > > > > vdpa management tool here is composing the vdpa device.
> > > > > > > >
> > > > > > > > For example creator (hypervisor) of the vdpa devices knows
> > > > > > > > that a guest VM is given 4 vcpus, So hypervisor creates a
> > > > > > > > vdpa devices with config space layout as, max_virtqueue_pairs =
> > 4.
> > > > > > > > And the MAC address chosen by hypervisor in mac[6].
> > > > > > > >
> > > > > > > > Guest VM ethtool can still chose to use less number of channels.
> > > > > > > >
> > > > > > > > Typically,
> > > > > > > > ethtool is for guest VM.
> > > > > > > > vdpa device is in hypevisor.
> > > > > > > >
> > > > > > > > How can hypervisor compose a vdpa device without any tool?
> > > > > > > > How can it tell ethtool, what is supported and what are the
> > defaults?
> > > > > > >
> > > > > > >
> > > > > > > Reread the cover letter:
> > > > > > >
> > > > > > > "
> > > > > > >
> > > > > > > This patchset enables users to set the mac address and mtu of
> > > > > > > the vdpa device once the device is created.
> > > > > > >
> > > > > > > "
> > > > > > >
> > > > > > > It looks to me the mechanism that introduced in the series is
> > > > > > > not for provisioning but for post-creation configuration?
> > > > > > >
> > > > > >
> > > > > > The difference is subtle. In both cases you provide configuration.
> > > > > >
> > > > > >
> > > > > > >
> > > > > > > >
> > > > > > > > I must be misunderstanding your comment about ethtool.
> > > > > > > > Can you please explain?
> > > > > > >
> > > > > > >
> > > > > > > I guess the meaning is that, if the vDPA is assigned to guest,
> > > > > > > it's the charge of guest to configure the MTU/MAC/RSS via the
> > > > > > > existing management interface like ethtool/iproute2 netlink
> > > > > > > protocol. The control virtqueue is designed for this.
> > > > > > >
> > > > > >
> > > > > > I was under the impression that we want somehow to control the
> > > > > > capablity if the guest to use arbitrary MAC addresses.
> > > > > > If this is is not required than I think control VQ is the mechanism to
> > use.
> > > > > How does the guest VM identify which unique mac address to set on
> > > > > this
> > > > virtio net device when this is the only device in the VM?
> > > > > Usually hypervisor knows what mac to set for a VM.
> > > >
> > > > You don't need to know. You could use any MAC you want, if no one
> > > > else is using it in your subnet, and everything will work fine. The
> > > > point is do you want to allow the guest to choose its MAC. This has
> > implications of security.
> > > >
> > > Lets assume for a moment that a guest VM is able to program a MAC of
> > netdevice of virtio_device of net type.
> > > How does a VM know that a randomly chosen mac is not used in network
> > when this VM doesn’t have any external connectivity?
> >
> > There's no gurantee
> Hence a VF users, relies on the hypervisor to setup a unique MAC in the network.
>
> And advance guest VM who has a device capable to modify the MAC (for bonding and other use) can override the mac.
>
> So I don’t see them as mutually exclusive capability.

Yes, we probably need both.

CVQ for post creation configuration and netlink API for provisioning.

>
> > but it's being used all over and Linux has a specific API to
> > generate random MAC addresses: eth_hw_addr_random().

Yes, but it sets the locally administered bit; management may want to use other addresses.
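
To make the bit in question concrete, here is a small userspace sketch. As far as I know, the kernel helper fills the address with random bytes, clears the multicast bit and sets the locally administered bit of the first octet. The helper names below are made up for illustration, and rand() only stands in for the kernel's random source.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

#define MAC_LEN 6

/* Hypothetical helper: mimic the bit handling of a kernel-generated random MAC. */
static void random_local_mac(uint8_t mac[MAC_LEN])
{
	for (int i = 0; i < MAC_LEN; i++)
		mac[i] = (uint8_t)(rand() & 0xff);	/* demo only, not a secure RNG */
	mac[0] &= 0xfe;	/* clear bit 0: unicast, not multicast */
	mac[0] |= 0x02;	/* set bit 1: locally administered */
}

/* True if the address has the locally administered bit set. */
static bool mac_is_local(const uint8_t mac[MAC_LEN])
{
	return (mac[0] & 0x02) != 0;
}

/* True if the address is a multicast (group) address. */
static bool mac_is_multicast(const uint8_t mac[MAC_LEN])
{
	return (mac[0] & 0x01) != 0;
}
```

The 00:11:22:33:44:55 address from the cover letter has the locally administered bit clear, which is the kind of address a management layer may want to assign and a random generator of this kind would never produce.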

Thanks

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH linux-next v3 0/6] vdpa: enable user to set mac, mtu
  2021-08-06  8:55       ` Parav Pandit via Virtualization
  2021-08-09  3:07         ` Jason Wang
@ 2021-08-09  9:40         ` Michael S. Tsirkin
  2021-08-09  9:51           ` Parav Pandit via Virtualization
  1 sibling, 1 reply; 62+ messages in thread
From: Michael S. Tsirkin @ 2021-08-09  9:40 UTC (permalink / raw)
  To: Parav Pandit; +Cc: Eli Cohen, virtualization

On Fri, Aug 06, 2021 at 08:55:56AM +0000, Parav Pandit wrote:
> 
> 
> > From: Michael S. Tsirkin <mst@redhat.com>
> > Sent: Friday, August 6, 2021 2:12 PM
> 
> 
> > >  enum vdpa_attr {
> > > @@ -33,6 +34,16 @@ enum vdpa_attr {
> > >  	VDPA_ATTR_DEV_MAX_VQS,			/* u32 */
> > >  	VDPA_ATTR_DEV_MAX_VQ_SIZE,		/* u16 */
> > > +	VDPA_ATTR_DEV_NET_CFG_MACADDR,		/* binary */
> > > +	VDPA_ATTR_DEV_NET_STATUS,		/* u8 */
> > > +	VDPA_ATTR_DEV_NET_CFG_MAX_VQP,		/* u16 */
> > > +	VDPA_ATTR_DEV_NET_CFG_MTU,		/* u16 */
> > > +	VDPA_ATTR_DEV_NET_CFG_SPEED,		/* u16 */
> > > +	VDPA_ATTR_DEV_NET_CFG_DUPLEX,		/* u16 */
> > > +	VDPA_ATTR_DEV_NET_CFG_RSS_MAX_KEY_LEN,	/* u8 */
> > > +	VDPA_ATTR_DEV_NET_CFG_RSS_MAX_IT_LEN,	/* u16 */
> > > +	VDPA_ATTR_DEV_NET_CFG_RSS_HASH_TYPES,	/* u32 */
> > > +
> > >  	/* new attributes must be added above here */
> > >  	VDPA_ATTR_MAX,
> > >  };
> > 
> > The point is to try and not reinvent a dedicated vpda interface where a
> > generic one exits.
> > E.g. for phy things such as mac speed etc, I think most people are using
> > ethtool things right?
> 
> As you know vdpa is the backend device for the front-end netdevice accessed by the ethtool.
> vdpa management tool here is composing the vdpa device.
> 
> For example creator (hypervisor) of the vdpa devices knows that a guest VM is given 4 vcpus,
> So hypervisor creates a vdpa devices with config space layout as,
> max_virtqueue_pairs = 4.
> And the MAC address chosen by hypervisor in mac[6].
> 
> Guest VM ethtool can still chose to use less number of channels.
> 
> Typically,
> ethtool is for guest VM.
> vdpa device is in hypevisor.
> 
> How can hypervisor compose a vdpa device without any tool?
> How can it tell ethtool, what is supported and what are the defaults?
> 
> I must be misunderstanding your comment about ethtool.
> Can you please explain?


I am basically saying that we probably want to be able to
change the MAC of a VDPA device on the host without destroying and recreating the device
as long as it's not in use.

For a VF I can do this on the host:

ip link set eth0 vf 1 mac 00:11:22:33:44:55

Ideally the same thing would work for vdpa.
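
A rough sketch of the "as long as it's not in use" rule, just to illustrate the semantics. The struct and function names are hypothetical and are not the kernel's struct vdpa_device or any existing vdpa op; returning -EBUSY is an assumption about how a refusal could be reported.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define MAC_LEN 6

/* Hypothetical host-side device state; not the kernel's struct vdpa_device. */
struct vdpa_dev_sketch {
	uint8_t mac[MAC_LEN];
	bool in_use;	/* true once a guest/driver is bound to the device */
};

/* Update the MAC in place, refusing while the device is in use. */
static int vdpa_sketch_set_mac(struct vdpa_dev_sketch *dev,
			       const uint8_t mac[MAC_LEN])
{
	if (dev->in_use)
		return -EBUSY;
	memcpy(dev->mac, mac, MAC_LEN);
	return 0;
}
```

With this rule the device keeps its identity across MAC changes, and the old MAC stays intact whenever the change is refused.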

-- 
MST

^ permalink raw reply	[flat|nested] 62+ messages in thread

* RE: [PATCH linux-next v3 0/6] vdpa: enable user to set mac, mtu
  2021-08-09  9:40         ` Michael S. Tsirkin
@ 2021-08-09  9:51           ` Parav Pandit via Virtualization
  2021-08-16 20:54             ` Michael S. Tsirkin
  0 siblings, 1 reply; 62+ messages in thread
From: Parav Pandit via Virtualization @ 2021-08-09  9:51 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: Eli Cohen, virtualization

> From: Michael S. Tsirkin <mst@redhat.com>
> Sent: Monday, August 9, 2021 3:10 PM
> 
> On Fri, Aug 06, 2021 at 08:55:56AM +0000, Parav Pandit wrote:
> >
> >
> > >
> > > The point is to try and not reinvent a dedicated vpda interface
> > > where a generic one exits.
> > > E.g. for phy things such as mac speed etc, I think most people are
> > > using ethtool things right?
> >
> > As you know vdpa is the backend device for the front-end netdevice
> accessed by the ethtool.
> > vdpa management tool here is composing the vdpa device.
> >
> > For example creator (hypervisor) of the vdpa devices knows that a
> > guest VM is given 4 vcpus, So hypervisor creates a vdpa devices with
> > config space layout as, max_virtqueue_pairs = 4.
> > And the MAC address chosen by hypervisor in mac[6].
> >
> > Guest VM ethtool can still chose to use less number of channels.
> >
> > Typically,
> > ethtool is for guest VM.
> > vdpa device is in hypevisor.
> >
> > How can hypervisor compose a vdpa device without any tool?
> > How can it tell ethtool, what is supported and what are the defaults?
> >
> > I must be misunderstanding your comment about ethtool.
> > Can you please explain?
> 
> 
> I am basically saying that we probably want to be able to change MAC of a
> VDPA device on the host without desroying and recreating the device as long
> as it's not in use.
OK, I understand your comment now.
Yes, that was the objective, which is why they are present as independent config knobs.
Jason was suggesting to have them as creation-only knobs, which requires recreating the device to change them.

I don't have a strong opinion on either method.

Passing them at creation time is simpler for the user.
If the user needs the ability to modify and reuse the same device with a different config, extending such support in the future, as this patch does, should be possible.

So there are two questions to close.
1. Can we start with config params at vdpa device creation time?

2. Is it OK to have these config params as individual fields at the netlink U->K UAPI level?
This is the method proposed in this patch series
(similar to the incrementally growing vxlan ip link command).

Or
should they be packed in a structure passed U->K, dealing with typecasting based on size and more?
(Jason's input).
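
To illustrate why individual fields grow more easily than a packed structure, here is a simplified type-length-value sketch. It is not real netlink framing (real attributes carry a 16-bit length and type and are padded to 4 bytes), and the attribute ids and all names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical attribute ids; the real values come from enum vdpa_attr. */
enum sketch_attr {
	SKETCH_ATTR_MAC = 1,	/* binary, 6 bytes */
	SKETCH_ATTR_MTU = 2,	/* u16 */
};

struct sketch_cfg {
	uint8_t mac[6];
	uint16_t mtu;
	bool has_mac;
	bool has_mtu;
};

/* Append one type-length-value record; returns the new write offset. */
static size_t sketch_put(uint8_t *buf, size_t off, uint8_t type,
			 const void *val, uint8_t len)
{
	buf[off] = type;
	buf[off + 1] = len;
	memcpy(buf + off + 2, val, len);
	return off + 2 + len;
}

/* Walk the records; unknown types are skipped, absent ones stay unset. */
static int sketch_parse(const uint8_t *buf, size_t total, struct sketch_cfg *cfg)
{
	size_t off = 0;

	memset(cfg, 0, sizeof(*cfg));
	while (off + 2 <= total) {
		uint8_t type = buf[off];
		uint8_t len = buf[off + 1];
		const uint8_t *val = buf + off + 2;

		if (off + 2 + len > total)
			return -1;	/* truncated record */
		if (type == SKETCH_ATTR_MAC && len == sizeof(cfg->mac)) {
			memcpy(cfg->mac, val, len);
			cfg->has_mac = true;
		} else if (type == SKETCH_ATTR_MTU && len == sizeof(cfg->mtu)) {
			memcpy(&cfg->mtu, val, len);
			cfg->has_mtu = true;
		}
		off += 2 + len;
	}
	return off == total ? 0 : -1;
}
```

A sender can pass only the MTU, or add a new attribute later, and an older parser still accepts the message. With a packed structure the receiver has to infer which fields are present from the size alone.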

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH linux-next v3 0/6] vdpa: enable user to set mac, mtu
  2021-08-09  7:05                       ` Jason Wang
@ 2021-08-16 20:51                         ` Michael S. Tsirkin
  0 siblings, 0 replies; 62+ messages in thread
From: Michael S. Tsirkin @ 2021-08-16 20:51 UTC (permalink / raw)
  To: Jason Wang; +Cc: Eli Cohen, virtualization

On Mon, Aug 09, 2021 at 03:05:31PM +0800, Jason Wang wrote:
> > So I don’t see them as mutually exclusive capability.
> 
> Yes, we probably need both.
> 
> CVQ for post creation configuration and netlink API for provisioning.

Note that when the host wants to change the configuration after provisioning,
it will also use netlink, I guess.

> >
> > > but it's being used all over and Linux has a specific API to
> > > generate random MAC addresses: eth_hw_addr_random().
> 
> Yes, but it uses local assignment bit, management may want use others.
> 
> Thanks

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH linux-next v3 0/6] vdpa: enable user to set mac, mtu
  2021-08-09  9:51           ` Parav Pandit via Virtualization
@ 2021-08-16 20:54             ` Michael S. Tsirkin
  2021-08-18  3:14               ` Parav Pandit via Virtualization
  0 siblings, 1 reply; 62+ messages in thread
From: Michael S. Tsirkin @ 2021-08-16 20:54 UTC (permalink / raw)
  To: Parav Pandit; +Cc: Eli Cohen, virtualization

On Mon, Aug 09, 2021 at 09:51:49AM +0000, Parav Pandit wrote:
> > From: Michael S. Tsirkin <mst@redhat.com>
> > Sent: Monday, August 9, 2021 3:10 PM
> > 
> > On Fri, Aug 06, 2021 at 08:55:56AM +0000, Parav Pandit wrote:
> > >
> > >
> > > >
> > > > The point is to try and not reinvent a dedicated vpda interface
> > > > where a generic one exits.
> > > > E.g. for phy things such as mac speed etc, I think most people are
> > > > using ethtool things right?
> > >
> > > As you know vdpa is the backend device for the front-end netdevice
> > accessed by the ethtool.
> > > vdpa management tool here is composing the vdpa device.
> > >
> > > For example creator (hypervisor) of the vdpa devices knows that a
> > > guest VM is given 4 vcpus, So hypervisor creates a vdpa devices with
> > > config space layout as, max_virtqueue_pairs = 4.
> > > And the MAC address chosen by hypervisor in mac[6].
> > >
> > > Guest VM ethtool can still chose to use less number of channels.
> > >
> > > Typically,
> > > ethtool is for guest VM.
> > > vdpa device is in hypevisor.
> > >
> > > How can hypervisor compose a vdpa device without any tool?
> > > How can it tell ethtool, what is supported and what are the defaults?
> > >
> > > I must be misunderstanding your comment about ethtool.
> > > Can you please explain?
> > 
> > 
> > I am basically saying that we probably want to be able to change MAC of a
> > VDPA device on the host without desroying and recreating the device as long
> > as it's not in use.
> Ok. I understood your comment now.
> Yes, this was the objective which is why they are present as independent config knob.
> Jason was suggesting to have them as creation only knobs, which requires recreate.
> 
> I don't have strong opinion for either method.
> 
> Passing them at creation time is simpler for user.
> If user needs the ability to modify and reuse same device with different config, extending such support in future like this patch should possible.
> 
> So there are two questions to close.
> 1. Can we start with config params at vdpa device creation time?

I'm not sure whether we need both, but I'd like to see a full API,
and I think we all agree the host wants the ability to tweak the MAC after
device creation even if the guest is not allowed to change it, right?

> 2. Is it ok to have these config params as individual fields at netlink U->K UAPI level?
> This is the method proposed in this patch series.
> (Similar to incrementally growing vxlan ip link command).
> 
> Or 
> They should be packed in a structure between U-> K and deal with typecasting based on size and more?
> (Jason's input).

I'm inclined to say vxlan is closer to a model to follow.

-- 
MST

^ permalink raw reply	[flat|nested] 62+ messages in thread

* RE: [PATCH linux-next v3 0/6] vdpa: enable user to set mac, mtu
  2021-08-16 20:54             ` Michael S. Tsirkin
@ 2021-08-18  3:14               ` Parav Pandit via Virtualization
  2021-08-18  4:31                 ` Jason Wang
  0 siblings, 1 reply; 62+ messages in thread
From: Parav Pandit via Virtualization @ 2021-08-18  3:14 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: Eli Cohen, virtualization



> From: Michael S. Tsirkin <mst@redhat.com>
> Sent: Tuesday, August 17, 2021 2:24 AM
> 
> On Mon, Aug 09, 2021 at 09:51:49AM +0000, Parav Pandit wrote:
> > > From: Michael S. Tsirkin <mst@redhat.com>
> > > Sent: Monday, August 9, 2021 3:10 PM
> > >
> > > On Fri, Aug 06, 2021 at 08:55:56AM +0000, Parav Pandit wrote:
> > > >
> > > >
> > > > >
> > > > > The point is to try and not reinvent a dedicated vpda interface
> > > > > where a generic one exits.
> > > > > E.g. for phy things such as mac speed etc, I think most people
> > > > > are using ethtool things right?
> > > >
> > > > As you know vdpa is the backend device for the front-end netdevice
> > > accessed by the ethtool.
> > > > vdpa management tool here is composing the vdpa device.
> > > >
> > > > For example creator (hypervisor) of the vdpa devices knows that a
> > > > guest VM is given 4 vcpus, So hypervisor creates a vdpa devices
> > > > with config space layout as, max_virtqueue_pairs = 4.
> > > > And the MAC address chosen by hypervisor in mac[6].
> > > >
> > > > Guest VM ethtool can still chose to use less number of channels.
> > > >
> > > > Typically,
> > > > ethtool is for guest VM.
> > > > vdpa device is in hypevisor.
> > > >
> > > > How can hypervisor compose a vdpa device without any tool?
> > > > How can it tell ethtool, what is supported and what are the defaults?
> > > >
> > > > I must be misunderstanding your comment about ethtool.
> > > > Can you please explain?
> > >
> > >
> > > I am basically saying that we probably want to be able to change MAC
> > > of a VDPA device on the host without desroying and recreating the
> > > device as long as it's not in use.
> > Ok. I understood your comment now.
> > Yes, this was the objective which is why they are present as independent
> config knob.
> > Jason was suggesting to have them as creation only knobs, which requires
> recreate.
> >
> > I don't have strong opinion for either method.
> >
> > Passing them at creation time is simpler for user.
> > If user needs the ability to modify and reuse same device with different
> config, extending such support in future like this patch should possible.
> >
> > So there are two questions to close.
> > 1. Can we start with config params at vdpa device creation time?
> 
> I'm not sure whether we need both but I'd like to see a full API and I think we
> all agree host wants ability to tweak mac after device creation even if guest is
> not allowed to change mac, right?
>
Yes.
$ vdpa dev add name foo mgmtdev pci/0000:03:00.0 mac 00:11:22:33:44:55 maxvqs 8 mtu 9000

The above would be the API if we do this at creation time. It is likely simpler for the user to pass the necessary params at creation time.
 
> > 2. Is it ok to have these config params as individual fields at netlink U->K
> UAPI level?
> > This is the method proposed in this patch series.
> > (Similar to incrementally growing vxlan ip link command).
> >
> > Or
> > They should be packed in a structure between U-> K and deal with
> typecasting based on size and more?
> > (Jason's input).
> 
> I'm inclined to say vxlan is closer to a model to follow.
OK, thanks for the feedback. We are using a model close to vxlan.
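
As a rough illustration of the "individual fields" model (one attribute per
config field, the way ip link grows options for vxlan), here is a minimal
userspace sketch in C. It is not the kernel netlink API and not the UAPI from
this series: the demo_* names, the attribute numbers and the mask layout are
all made up for illustration. The only point is that each field travels as its
own type-length-value attribute, so a request can carry just the mac, just the
mtu, or both, and new fields can be added later without changing a packed
structure.

```c
/* Hypothetical userspace model of per-field config attributes.
 * Names and numbering are illustrative, not the vdpa netlink UAPI. */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum demo_attr {
	DEMO_ATTR_MAC = 1,	/* payload: 6 bytes */
	DEMO_ATTR_MTU = 2,	/* payload: uint16_t, host byte order */
};

struct demo_tlv_hdr {
	uint16_t len;		/* header + payload, without padding */
	uint16_t type;
};

struct demo_net_config {
	uint8_t mac[6];
	uint16_t mtu;
	uint32_t mask;		/* bit N set => attribute type N was supplied */
};

/* Append one attribute at offset off, padded to 4 bytes.
 * Returns the new offset, or 0 if the buffer is too small. */
size_t demo_put(uint8_t *buf, size_t cap, size_t off, uint16_t type,
		const void *data, uint16_t dlen)
{
	struct demo_tlv_hdr h = { .len = (uint16_t)(sizeof(h) + dlen), .type = type };
	size_t padded = ((size_t)h.len + 3u) & ~(size_t)3u;

	if (off + padded > cap)
		return 0;
	memcpy(buf + off, &h, sizeof(h));
	memcpy(buf + off + sizeof(h), data, dlen);
	memset(buf + off + h.len, 0, padded - h.len);
	return off + padded;
}

/* Walk the attributes and fill cfg. Unknown types are skipped so that
 * newer senders keep working. Returns 0 on success, -1 if malformed. */
int demo_parse(const uint8_t *buf, size_t len, struct demo_net_config *cfg)
{
	size_t off = 0;

	assert(cfg != NULL);
	memset(cfg, 0, sizeof(*cfg));
	while (off + sizeof(struct demo_tlv_hdr) <= len) {
		struct demo_tlv_hdr h;
		const uint8_t *p;
		size_t dlen;

		memcpy(&h, buf + off, sizeof(h));
		if (h.len < sizeof(h) || off + h.len > len)
			return -1;
		p = buf + off + sizeof(h);
		dlen = h.len - sizeof(h);

		switch (h.type) {
		case DEMO_ATTR_MAC:
			if (dlen != sizeof(cfg->mac))
				return -1;
			memcpy(cfg->mac, p, sizeof(cfg->mac));
			cfg->mask |= 1u << DEMO_ATTR_MAC;
			break;
		case DEMO_ATTR_MTU:
			if (dlen != sizeof(cfg->mtu))
				return -1;
			memcpy(&cfg->mtu, p, sizeof(cfg->mtu));
			cfg->mask |= 1u << DEMO_ATTR_MTU;
			break;
		default:
			break;	/* unknown attribute: ignore */
		}
		off += ((size_t)h.len + 3u) & ~(size_t)3u;
	}
	return 0;
}
```

A request built with only DEMO_ATTR_MTU parses with only the MTU bit set in
mask, which is how the receiver can tell "not supplied" apart from "set to
zero".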
Let's resolve whether we should have it at creation time, post creation, or both.
(a) Creation time
Pros:
- simpler, single API for the user
- eliminates the need to invent a stats reset in a future series
Cons:
- inability to reuse the device with a different config
- this may not be much of an advantage anyway, so it is probably fine to have creation-time params

(b) Post creation time
Pros:
- able to reuse the device with a different config, say for a different VM
- will require a stats reset in future once stats are implemented
Cons:
- more commands for users to configure a device; better to have the ability at create time

> 
> --
> MST


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH linux-next v3 0/6] vdpa: enable user to set mac, mtu
  2021-08-18  3:14               ` Parav Pandit via Virtualization
@ 2021-08-18  4:31                 ` Jason Wang
  2021-08-18  4:36                   ` Parav Pandit via Virtualization
  2021-08-18 17:33                   ` Michael S. Tsirkin
  0 siblings, 2 replies; 62+ messages in thread
From: Jason Wang @ 2021-08-18  4:31 UTC (permalink / raw)
  To: Parav Pandit; +Cc: Eli Cohen, virtualization, Michael S. Tsirkin

On Wed, Aug 18, 2021 at 11:15 AM Parav Pandit <parav@nvidia.com> wrote:
>
>
>
> > From: Michael S. Tsirkin <mst@redhat.com>
> > Sent: Tuesday, August 17, 2021 2:24 AM
> >
> > On Mon, Aug 09, 2021 at 09:51:49AM +0000, Parav Pandit wrote:
> > > > From: Michael S. Tsirkin <mst@redhat.com>
> > > > Sent: Monday, August 9, 2021 3:10 PM
> > > >
> > > > On Fri, Aug 06, 2021 at 08:55:56AM +0000, Parav Pandit wrote:
> > > > >
> > > > >
> > > > > >
> > > > > > The point is to try and not reinvent a dedicated vpda interface
> > > > > > where a generic one exits.
> > > > > > E.g. for phy things such as mac speed etc, I think most people
> > > > > > are using ethtool things right?
> > > > >
> > > > > As you know vdpa is the backend device for the front-end netdevice
> > > > accessed by the ethtool.
> > > > > vdpa management tool here is composing the vdpa device.
> > > > >
> > > > > For example creator (hypervisor) of the vdpa devices knows that a
> > > > > guest VM is given 4 vcpus, So hypervisor creates a vdpa devices
> > > > > with config space layout as, max_virtqueue_pairs = 4.
> > > > > And the MAC address chosen by hypervisor in mac[6].
> > > > >
> > > > > Guest VM ethtool can still chose to use less number of channels.
> > > > >
> > > > > Typically,
> > > > > ethtool is for guest VM.
> > > > > vdpa device is in hypevisor.
> > > > >
> > > > > How can hypervisor compose a vdpa device without any tool?
> > > > > How can it tell ethtool, what is supported and what are the defaults?
> > > > >
> > > > > I must be misunderstanding your comment about ethtool.
> > > > > Can you please explain?
> > > >
> > > >
> > > > I am basically saying that we probably want to be able to change MAC
> > > > of a VDPA device on the host without desroying and recreating the
> > > > device as long as it's not in use.
> > > Ok. I understood your comment now.
> > > Yes, this was the objective which is why they are present as independent
> > config knob.
> > > Jason was suggesting to have them as creation only knobs, which requires
> > recreate.
> > >
> > > I don't have strong opinion for either method.
> > >
> > > Passing them at creation time is simpler for user.
> > > If user needs the ability to modify and reuse same device with different
> > config, extending such support in future like this patch should possible.
> > >
> > > So there are two questions to close.
> > > 1. Can we start with config params at vdpa device creation time?
> >
> > I'm not sure whether we need both but I'd like to see a full API and I think we
> > all agree host wants ability to tweak mac after device creation even if guest is
> > not allowed to change mac, right?
> >
> Yes.
> $ vdpa dev add name foo mgmtdev pci/0000:03:00.0 mac 00:11:22:33:44:55 maxvqs 8 mtu 9000
>
> Above API if we do at creation time. It is likely simpler for user to pass necessary params during creation time.
>
> > > 2. Is it ok to have these config params as individual fields at netlink U->K
> > UAPI level?
> > > This is the method proposed in this patch series.
> > > (Similar to incrementally growing vxlan ip link command).
> > >
> > > Or
> > > They should be packed in a structure between U-> K and deal with
> > typecasting based on size and more?
> > > (Jason's input).
> >
> > I'm inclined to say vxlan is closer to a model to follow.
> Ok. thanks for the feedback. We are using the model close to vxlan.
> Lets resolve should we have it at creation time, post creation or both?
> (a) Creation time
> Pros:
> - simpler single api for user
> - eliminates needs of inventing stats reset in future series
> Cons:
> - inability to reuse the device with different config

This can be solved by destroying the instance and re-creating it with
different params?

> - This may not be of great advantage, and it is probably fine to have creation time params
>
> (b) post creation time:
> Pros:
> - able to reuse the device with different config for say different VM.
> - will require stats reset in future once stats are implemented

Any reason for doing this other than re-creating the device?

> Cons:
> - more commands for users to config a device, better to have the ability at create time.

We probably need to support post creation, but it should be device specific.

E.g., we may support device resize for virtio-blk devices.

But it can be done on top, I think.

Thanks

>
> >
> > --
> > MST
>


^ permalink raw reply	[flat|nested] 62+ messages in thread

* RE: [PATCH linux-next v3 0/6] vdpa: enable user to set mac, mtu
  2021-08-18  4:31                 ` Jason Wang
@ 2021-08-18  4:36                   ` Parav Pandit via Virtualization
  2021-08-19  4:18                     ` Jason Wang
  2021-08-18 17:33                   ` Michael S. Tsirkin
  1 sibling, 1 reply; 62+ messages in thread
From: Parav Pandit via Virtualization @ 2021-08-18  4:36 UTC (permalink / raw)
  To: Jason Wang; +Cc: Eli Cohen, virtualization, Michael S. Tsirkin



> From: Jason Wang <jasowang@redhat.com>
> Sent: Wednesday, August 18, 2021 10:02 AM
> On Wed, Aug 18, 2021 at 11:15 AM Parav Pandit <parav@nvidia.com> wrote:
[..]
> > > I'm inclined to say vxlan is closer to a model to follow.
> > Ok. thanks for the feedback. We are using the model close to vxlan.
> > Lets resolve should we have it at creation time, post creation or both?
> > (a) Creation time
> > Pros:
> > - simpler single api for user
> > - eliminates needs of inventing stats reset in future series
> > Cons:
> > - inability to reuse the device with different config
> 
> This can be solved by destroying the instance and re-creating it with a
> different params?
> 
Yes, which is what I tried to say below.

> > - This may not be of great advantage, and it is probably fine to have creation time params
      ^^^^^ here.

> >
> > (b) post creation time:
> > Pros:
> > - able to reuse the device with different config for say different VM.
> > - will require stats reset in future once stats are implemented
> 
> Any reason for doing this other than re-creating the device?
> 
No. The only reason I can think of is that device reconfig may be faster than recreate.
But I weigh user simplicity more at the beginning; optimizations can be brought in later if required.

> > Cons:
> > - more commands for users to config a device, better to have the ability at
> create time.
> 
> We probably need to support post creation but it should be device specific.

True. Your device resize case below is a good example of it.

> 
> E.g we may support device resize for virtio-blk devices.
> 
> But it can be done on top I think.
I think so too.

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH linux-next v3 0/6] vdpa: enable user to set mac, mtu
  2021-08-18  4:31                 ` Jason Wang
  2021-08-18  4:36                   ` Parav Pandit via Virtualization
@ 2021-08-18 17:33                   ` Michael S. Tsirkin
  2021-08-19  4:22                     ` Jason Wang
  1 sibling, 1 reply; 62+ messages in thread
From: Michael S. Tsirkin @ 2021-08-18 17:33 UTC (permalink / raw)
  To: Jason Wang; +Cc: Eli Cohen, virtualization

On Wed, Aug 18, 2021 at 12:31:39PM +0800, Jason Wang wrote:
> On Wed, Aug 18, 2021 at 11:15 AM Parav Pandit <parav@nvidia.com> wrote:
> >
> >
> >
> > > From: Michael S. Tsirkin <mst@redhat.com>
> > > Sent: Tuesday, August 17, 2021 2:24 AM
> > >
> > > On Mon, Aug 09, 2021 at 09:51:49AM +0000, Parav Pandit wrote:
> > > > > From: Michael S. Tsirkin <mst@redhat.com>
> > > > > Sent: Monday, August 9, 2021 3:10 PM
> > > > >
> > > > > On Fri, Aug 06, 2021 at 08:55:56AM +0000, Parav Pandit wrote:
> > > > > >
> > > > > >
> > > > > > >
> > > > > > > The point is to try and not reinvent a dedicated vpda interface
> > > > > > > where a generic one exits.
> > > > > > > E.g. for phy things such as mac speed etc, I think most people
> > > > > > > are using ethtool things right?
> > > > > >
> > > > > > As you know vdpa is the backend device for the front-end netdevice
> > > > > accessed by the ethtool.
> > > > > > vdpa management tool here is composing the vdpa device.
> > > > > >
> > > > > > For example creator (hypervisor) of the vdpa devices knows that a
> > > > > > guest VM is given 4 vcpus, So hypervisor creates a vdpa devices
> > > > > > with config space layout as, max_virtqueue_pairs = 4.
> > > > > > And the MAC address chosen by hypervisor in mac[6].
> > > > > >
> > > > > > Guest VM ethtool can still chose to use less number of channels.
> > > > > >
> > > > > > Typically,
> > > > > > ethtool is for guest VM.
> > > > > > vdpa device is in hypevisor.
> > > > > >
> > > > > > How can hypervisor compose a vdpa device without any tool?
> > > > > > How can it tell ethtool, what is supported and what are the defaults?
> > > > > >
> > > > > > I must be misunderstanding your comment about ethtool.
> > > > > > Can you please explain?
> > > > >
> > > > >
> > > > > I am basically saying that we probably want to be able to change MAC
> > > > > of a VDPA device on the host without desroying and recreating the
> > > > > device as long as it's not in use.
> > > > Ok. I understood your comment now.
> > > > Yes, this was the objective which is why they are present as independent
> > > config knob.
> > > > Jason was suggesting to have them as creation only knobs, which requires
> > > recreate.
> > > >
> > > > I don't have strong opinion for either method.
> > > >
> > > > Passing them at creation time is simpler for user.
> > > > If user needs the ability to modify and reuse same device with different
> > > config, extending such support in future like this patch should possible.
> > > >
> > > > So there are two questions to close.
> > > > 1. Can we start with config params at vdpa device creation time?
> > >
> > > I'm not sure whether we need both but I'd like to see a full API and I think we
> > > all agree host wants ability to tweak mac after device creation even if guest is
> > > not allowed to change mac, right?
> > >
> > Yes.
> > $ vdpa dev add name foo mgmtdev pci/0000:03:00.0 mac 00:11:22:33:44:55 maxvqs 8 mtu 9000
> >
> > Above API if we do at creation time. It is likely simpler for user to pass necessary params during creation time.
> >
> > > > 2. Is it ok to have these config params as individual fields at netlink U->K
> > > UAPI level?
> > > > This is the method proposed in this patch series.
> > > > (Similar to incrementally growing vxlan ip link command).
> > > >
> > > > Or
> > > > They should be packed in a structure between U-> K and deal with
> > > typecasting based on size and more?
> > > > (Jason's input).
> > >
> > > I'm inclined to say vxlan is closer to a model to follow.
> > Ok. thanks for the feedback. We are using the model close to vxlan.
> > Lets resolve should we have it at creation time, post creation or both?
> > (a) Creation time
> > Pros:
> > - simpler single api for user
> > - eliminates needs of inventing stats reset in future series
> > Cons:
> > - inability to reuse the device with different config
> 
> This can be solved by destroying the instance and re-creating it with
> a different params?
> 
> > - This may not be of great advantage, and it is probably fine to have creation time params
> >
> > (b) post creation time:
> > Pros:
> > - able to reuse the device with different config for say different VM.
> > - will require stats reset in future once stats are implemented
> 
> Any reason for doing this other than re-creating the device?

Permissions.


> > Cons:
> > - more commands for users to config a device, better to have the ability at create time.
> 
> We probably need to support post creation but it should be device specific.
> 
> E.g we may support device resize for virtio-blk devices.
> 
> But it can be done on top I think.
> 
> Thanks
> 
> >
> > >
> > > --
> > > MST
> >


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH linux-next v3 0/6] vdpa: enable user to set mac, mtu
  2021-08-18  4:36                   ` Parav Pandit via Virtualization
@ 2021-08-19  4:18                     ` Jason Wang
  0 siblings, 0 replies; 62+ messages in thread
From: Jason Wang @ 2021-08-19  4:18 UTC (permalink / raw)
  To: Parav Pandit; +Cc: Eli Cohen, virtualization, Michael S. Tsirkin


On 2021/8/18 12:36 PM, Parav Pandit wrote:
>
>> From: Jason Wang <jasowang@redhat.com>
>> Sent: Wednesday, August 18, 2021 10:02 AM
>> On Wed, Aug 18, 2021 at 11:15 AM Parav Pandit <parav@nvidia.com> wrote:
> [..]
>>>> I'm inclined to say vxlan is closer to a model to follow.
>>> Ok. thanks for the feedback. We are using the model close to vxlan.
>>> Lets resolve should we have it at creation time, post creation or both?
>>> (a) Creation time
>>> Pros:
>>> - simpler single api for user
>>> - eliminates needs of inventing stats reset in future series
>>> Cons:
>>> - inability to reuse the device with different config
>> This can be solved by destroying the instance and re-creating it with a
>> different params?
>>
> Yes, which is what I tried be say below.
>
>>> - This may not be of great advantage, and it is probably fine to have creation time params
>        ^^^^^ here.


Oh right, I missed this since it is listed under the "cons" bullets.


>
>>> (b) post creation time:
>>> Pros:
>>> - able to reuse the device with different config for say different VM.
>>> - will require stats reset in future once stats are implemented
>> Any reason for doing this other than re-creating the device?
>>
> No. Only reason I can think of is, device reconfig may be faster than recreate.
> But I weigh user simplicity more at the beginning and optimizations to bring later if required.
>

Right, but it looks to me like we should allow post-creation changes
only if they are allowed by the spec.

E.g., if we allow the mac to be modified by the guest, it is not expected to
be changed from the host.
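
One possible reading of this rule, as a small hedged sketch in C: the host may
change the mac after creation only when the device does not offer the guest a
way to change it. Mapping "guest may modify the mac" to the
VIRTIO_NET_F_CTRL_MAC_ADDR feature (bit 23 in the virtio-net feature space) is
an assumption made here for illustration; the demo_* names are hypothetical
and this is not code from the series.

```c
/* Hypothetical policy helper; not part of the vdpa core or this series. */
#include <stdbool.h>
#include <stdint.h>

/* Same bit number as VIRTIO_NET_F_CTRL_MAC_ADDR in the virtio-net spec. */
#define DEMO_F_CTRL_MAC_ADDR 23

/* Host-side mac changes are allowed only when the guest cannot set the
 * mac itself through the control virtqueue. */
bool demo_host_may_change_mac(uint64_t device_features)
{
	return !(device_features & (1ull << DEMO_F_CTRL_MAC_ADDR));
}
```

A real implementation would likely need a per-field, per-device-type table of
such rules rather than a single helper.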


>>> Cons:
>>> - more commands for users to config a device, better to have the ability at
>> create time.
>>
>> We probably need to support post creation but it should be device specific.
> True. Your below device resize is good example of it.
>
>> E.g we may support device resize for virtio-blk devices.
>>
>> But it can be done on top I think.
> I think so too.


Thanks



^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH linux-next v3 0/6] vdpa: enable user to set mac, mtu
  2021-08-18 17:33                   ` Michael S. Tsirkin
@ 2021-08-19  4:22                     ` Jason Wang
  2021-08-19  5:23                       ` Parav Pandit via Virtualization
  0 siblings, 1 reply; 62+ messages in thread
From: Jason Wang @ 2021-08-19  4:22 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: Eli Cohen, virtualization


On 2021/8/19 1:33 AM, Michael S. Tsirkin wrote:
> On Wed, Aug 18, 2021 at 12:31:39PM +0800, Jason Wang wrote:
>> On Wed, Aug 18, 2021 at 11:15 AM Parav Pandit <parav@nvidia.com> wrote:
>>>
>>>
>>>> From: Michael S. Tsirkin <mst@redhat.com>
>>>> Sent: Tuesday, August 17, 2021 2:24 AM
>>>>
>>>> On Mon, Aug 09, 2021 at 09:51:49AM +0000, Parav Pandit wrote:
>>>>>> From: Michael S. Tsirkin <mst@redhat.com>
>>>>>> Sent: Monday, August 9, 2021 3:10 PM
>>>>>>
>>>>>> On Fri, Aug 06, 2021 at 08:55:56AM +0000, Parav Pandit wrote:
>>>>>>>
>>>>>>>> The point is to try and not reinvent a dedicated vpda interface
>>>>>>>> where a generic one exits.
>>>>>>>> E.g. for phy things such as mac speed etc, I think most people
>>>>>>>> are using ethtool things right?
>>>>>>> As you know vdpa is the backend device for the front-end netdevice
>>>>>> accessed by the ethtool.
>>>>>>> vdpa management tool here is composing the vdpa device.
>>>>>>>
>>>>>>> For example creator (hypervisor) of the vdpa devices knows that a
>>>>>>> guest VM is given 4 vcpus, So hypervisor creates a vdpa devices
>>>>>>> with config space layout as, max_virtqueue_pairs = 4.
>>>>>>> And the MAC address chosen by hypervisor in mac[6].
>>>>>>>
>>>>>>> Guest VM ethtool can still chose to use less number of channels.
>>>>>>>
>>>>>>> Typically,
>>>>>>> ethtool is for guest VM.
>>>>>>> vdpa device is in hypevisor.
>>>>>>>
>>>>>>> How can hypervisor compose a vdpa device without any tool?
>>>>>>> How can it tell ethtool, what is supported and what are the defaults?
>>>>>>>
>>>>>>> I must be misunderstanding your comment about ethtool.
>>>>>>> Can you please explain?
>>>>>>
>>>>>> I am basically saying that we probably want to be able to change MAC
>>>>>> of a VDPA device on the host without desroying and recreating the
>>>>>> device as long as it's not in use.
>>>>> Ok. I understood your comment now.
>>>>> Yes, this was the objective which is why they are present as independent
>>>> config knob.
>>>>> Jason was suggesting to have them as creation only knobs, which requires
>>>> recreate.
>>>>> I don't have strong opinion for either method.
>>>>>
>>>>> Passing them at creation time is simpler for user.
>>>>> If user needs the ability to modify and reuse same device with different
>>>> config, extending such support in future like this patch should possible.
>>>>> So there are two questions to close.
>>>>> 1. Can we start with config params at vdpa device creation time?
>>>> I'm not sure whether we need both but I'd like to see a full API and I think we
>>>> all agree host wants ability to tweak mac after device creation even if guest is
>>>> not allowed to change mac, right?
>>>>
>>> Yes.
>>> $ vdpa dev add name foo mgmtdev pci/0000:03:00.0 mac 00:11:22:33:44:55 maxvqs 8 mtu 9000
>>>
>>> Above API if we do at creation time. It is likely simpler for user to pass necessary params during creation time.
>>>
>>>>> 2. Is it ok to have these config params as individual fields at netlink U->K
>>>> UAPI level?
>>>>> This is the method proposed in this patch series.
>>>>> (Similar to incrementally growing vxlan ip link command).
>>>>>
>>>>> Or
>>>>> They should be packed in a structure between U-> K and deal with
>>>> typecasting based on size and more?
>>>>> (Jason's input).
>>>> I'm inclined to say vxlan is closer to a model to follow.
>>> Ok. thanks for the feedback. We are using the model close to vxlan.
>>> Lets resolve should we have it at creation time, post creation or both?
>>> (a) Creation time
>>> Pros:
>>> - simpler single api for user
>>> - eliminates needs of inventing stats reset in future series
>>> Cons:
>>> - inability to reuse the device with different config
>> This can be solved by destroying the instance and re-creating it with
>> a different params?
>>
>>> - This may not be of great advantage, and it is probably fine to have creation time params
>>>
>>> (b) post creation time:
>>> Pros:
>>> - able to reuse the device with different config for say different VM.
>>> - will require stats reset in future once stats are implemented
>> Any reason for doing this other than re-creating the device?
> Permissions.


I would expect that CAP_NET_ADMIN is required for both cases.

Or is there anything I am missing here?

Thanks


>
>
>>> Cons:
>>> - more commands for users to config a device, better to have the ability at create time.
>> We probably need to support post creation but it should be device specific.
>>
>> E.g we may support device resize for virtio-blk devices.
>>
>> But it can be done on top I think.
>>
>> Thanks
>>
>>>> --
>>>> MST


^ permalink raw reply	[flat|nested] 62+ messages in thread

* RE: [PATCH linux-next v3 0/6] vdpa: enable user to set mac, mtu
  2021-08-19  4:22                     ` Jason Wang
@ 2021-08-19  5:23                       ` Parav Pandit via Virtualization
  2021-08-19  7:15                         ` Jason Wang
  0 siblings, 1 reply; 62+ messages in thread
From: Parav Pandit via Virtualization @ 2021-08-19  5:23 UTC (permalink / raw)
  To: Jason Wang, Michael S. Tsirkin; +Cc: Eli Cohen, virtualization



> From: Jason Wang <jasowang@redhat.com>
> Sent: Thursday, August 19, 2021 9:52 AM

> 
> On 2021/8/19 1:33 AM, Michael S. Tsirkin wrote:
> > On Wed, Aug 18, 2021 at 12:31:39PM +0800, Jason Wang wrote:
> >> On Wed, Aug 18, 2021 at 11:15 AM Parav Pandit <parav@nvidia.com>
> wrote:
> >>>
> >>>
> >>>> From: Michael S. Tsirkin <mst@redhat.com>
> >>>> Sent: Tuesday, August 17, 2021 2:24 AM
> >>>>
> >>>> On Mon, Aug 09, 2021 at 09:51:49AM +0000, Parav Pandit wrote:
> >>>>>> From: Michael S. Tsirkin <mst@redhat.com>
> >>>>>> Sent: Monday, August 9, 2021 3:10 PM
> >>>>>>
> >>>>>> On Fri, Aug 06, 2021 at 08:55:56AM +0000, Parav Pandit wrote:
> >>>>>>>
> >>>>>>>> The point is to try and not reinvent a dedicated vpda interface
> >>>>>>>> where a generic one exits.
> >>>>>>>> E.g. for phy things such as mac speed etc, I think most people
> >>>>>>>> are using ethtool things right?
> >>>>>>> As you know vdpa is the backend device for the front-end
> >>>>>>> netdevice
> >>>>>> accessed by the ethtool.
> >>>>>>> vdpa management tool here is composing the vdpa device.
> >>>>>>>
> >>>>>>> For example creator (hypervisor) of the vdpa devices knows that
> >>>>>>> a guest VM is given 4 vcpus, So hypervisor creates a vdpa
> >>>>>>> devices with config space layout as, max_virtqueue_pairs = 4.
> >>>>>>> And the MAC address chosen by hypervisor in mac[6].
> >>>>>>>
> >>>>>>> Guest VM ethtool can still chose to use less number of channels.
> >>>>>>>
> >>>>>>> Typically,
> >>>>>>> ethtool is for guest VM.
> >>>>>>> vdpa device is in hypevisor.
> >>>>>>>
> >>>>>>> How can hypervisor compose a vdpa device without any tool?
> >>>>>>> How can it tell ethtool, what is supported and what are the
> defaults?
> >>>>>>>
> >>>>>>> I must be misunderstanding your comment about ethtool.
> >>>>>>> Can you please explain?
> >>>>>>
> >>>>>> I am basically saying that we probably want to be able to change
> >>>>>> MAC of a VDPA device on the host without desroying and recreating
> >>>>>> the device as long as it's not in use.
> >>>>> Ok. I understood your comment now.
> >>>>> Yes, this was the objective which is why they are present as
> >>>>> independent
> >>>> config knob.
> >>>>> Jason was suggesting to have them as creation only knobs, which
> >>>>> requires
> >>>> recreate.
> >>>>> I don't have strong opinion for either method.
> >>>>>
> >>>>> Passing them at creation time is simpler for user.
> >>>>> If user needs the ability to modify and reuse same device with
> >>>>> different
> >>>> config, extending such support in future like this patch should possible.
> >>>>> So there are two questions to close.
> >>>>> 1. Can we start with config params at vdpa device creation time?
> >>>> I'm not sure whether we need both but I'd like to see a full API
> >>>> and I think we all agree host wants ability to tweak mac after
> >>>> device creation even if guest is not allowed to change mac, right?
> >>>>
> >>> Yes.
> >>> $ vdpa dev add name foo mgmtdev pci/0000:03:00.0 mac
> >>> 00:11:22:33:44:55 maxvqs 8 mtu 9000
> >>>
> >>> Above API if we do at creation time. It is likely simpler for user to pass
> necessary params during creation time.
> >>>
> >>>>> 2. Is it ok to have these config params as individual fields at
> >>>>> netlink U->K
> >>>> UAPI level?
> >>>>> This is the method proposed in this patch series.
> >>>>> (Similar to incrementally growing vxlan ip link command).
> >>>>>
> >>>>> Or
> >>>>> They should be packed in a structure between U-> K and deal with
> >>>> typecasting based on size and more?
> >>>>> (Jason's input).
> >>>> I'm inclined to say vxlan is closer to a model to follow.
> >>> Ok. thanks for the feedback. We are using the model close to vxlan.
> >>> Lets resolve should we have it at creation time, post creation or both?
> >>> (a) Creation time
> >>> Pros:
> >>> - simpler single api for user
> >>> - eliminates needs of inventing stats reset in future series
> >>> Cons:
> >>> - inability to reuse the device with different config
> >> This can be solved by destroying the instance and re-creating it with
> >> a different params?
> >>
> >>> - This may not be of great advantage, and it is probably fine to
> >>> have creation time params
> >>>
> >>> (b) post creation time:
> >>> Pros:
> >>> - able to reuse the device with different config for say different VM.
> >>> - will require stats reset in future once stats are implemented
> >> Any reason for doing this other than re-creating the device?
> > Permissions.
> 
> 
> I would expect that CAP_NET_ADMIN is required for both cases.

Correct. Patch-3 in this series has the CAP_NET_ADMIN check for setting the mac; snippet below.
For vdpa net device addition we do not have the check yet.

You/Michael mentioned in some other thread that QEMU runs without any permissions.
Do you mean QEMU can run without these capabilities?
If yes, is it fair to ask for non-QEMU software, which has higher capabilities than QEMU, to set up the vdpa device, after which QEMU runs with lower capabilities?

+static int vdpa_dev_net_config_set(struct vdpa_device *vdev,
+				   struct sk_buff *skb, struct genl_info *info) {
+	struct nlattr **nl_attrs = info->attrs;
+	struct vdpa_dev_set_config config = {};
+	const u8 *macaddr;
+	int err;
+
+	if (!netlink_capable(skb, CAP_NET_ADMIN))
+		return -EPERM;
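
The snippet above is cut short. As a hedged, self-contained sketch of the
permission split being asked about (a privileged provisioning tool configures
the device, then a less privileged QEMU uses it), here is a small userspace
model in C. It is not kernel code: the demo_* names are hypothetical,
DEMO_CAP_NET_ADMIN only mirrors the bit number of CAP_NET_ADMIN from
<linux/capability.h>, and the in-use check is an assumption based on the "as
long as it's not in use" remark earlier in the thread.

```c
/* Hypothetical userspace model of a capability-gated config set. */
#include <errno.h>
#include <stdint.h>
#include <string.h>

#define DEMO_CAP_NET_ADMIN 12	/* same bit number as CAP_NET_ADMIN */

struct demo_caller {
	uint64_t effective_caps;	/* one bit per capability */
};

struct demo_vdpa_dev {
	uint8_t mac[6];
	int in_use;			/* nonzero once a driver/VM is attached */
};

static int demo_capable(const struct demo_caller *c, int cap)
{
	return (int)((c->effective_caps >> cap) & 1u);
}

/* Returns 0 on success, -EPERM without the capability,
 * -EBUSY if the device is already in use. */
int demo_dev_mac_set(const struct demo_caller *c, struct demo_vdpa_dev *dev,
		     const uint8_t mac[6])
{
	if (!demo_capable(c, DEMO_CAP_NET_ADMIN))
		return -EPERM;
	if (dev->in_use)
		return -EBUSY;
	memcpy(dev->mac, mac, 6);
	return 0;
}
```

In this model the provisioning tool holds the capability and sets the mac
before the device is handed over; a caller without it gets -EPERM, matching the
check in the quoted patch.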

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH linux-next v3 0/6] vdpa: enable user to set mac, mtu
  2021-08-19  5:23                       ` Parav Pandit via Virtualization
@ 2021-08-19  7:15                         ` Jason Wang
  0 siblings, 0 replies; 62+ messages in thread
From: Jason Wang @ 2021-08-19  7:15 UTC (permalink / raw)
  To: Parav Pandit; +Cc: Eli Cohen, virtualization, Michael S. Tsirkin

On Thu, Aug 19, 2021 at 1:23 PM Parav Pandit <parav@nvidia.com> wrote:
>
>
>
> > From: Jason Wang <jasowang@redhat.com>
> > Sent: Thursday, August 19, 2021 9:52 AM
>
> >
> > On 2021/8/19 1:33 AM, Michael S. Tsirkin wrote:
> > > On Wed, Aug 18, 2021 at 12:31:39PM +0800, Jason Wang wrote:
> > >> On Wed, Aug 18, 2021 at 11:15 AM Parav Pandit <parav@nvidia.com>
> > wrote:
> > >>>
> > >>>
> > >>>> From: Michael S. Tsirkin <mst@redhat.com>
> > >>>> Sent: Tuesday, August 17, 2021 2:24 AM
> > >>>>
> > >>>> On Mon, Aug 09, 2021 at 09:51:49AM +0000, Parav Pandit wrote:
> > >>>>>> From: Michael S. Tsirkin <mst@redhat.com>
> > >>>>>> Sent: Monday, August 9, 2021 3:10 PM
> > >>>>>>
> > >>>>>> On Fri, Aug 06, 2021 at 08:55:56AM +0000, Parav Pandit wrote:
> > >>>>>>>
> > >>>>>>>> The point is to try and not reinvent a dedicated vpda interface
> > >>>>>>>> where a generic one exits.
> > >>>>>>>> E.g. for phy things such as mac speed etc, I think most people
> > >>>>>>>> are using ethtool things right?
> > >>>>>>> As you know vdpa is the backend device for the front-end
> > >>>>>>> netdevice
> > >>>>>> accessed by the ethtool.
> > >>>>>>> vdpa management tool here is composing the vdpa device.
> > >>>>>>>
> > >>>>>>> For example creator (hypervisor) of the vdpa devices knows that
> > >>>>>>> a guest VM is given 4 vcpus, So hypervisor creates a vdpa
> > >>>>>>> devices with config space layout as, max_virtqueue_pairs = 4.
> > >>>>>>> And the MAC address chosen by hypervisor in mac[6].
> > >>>>>>>
> > >>>>>>> Guest VM ethtool can still chose to use less number of channels.
> > >>>>>>>
> > >>>>>>> Typically,
> > >>>>>>> ethtool is for guest VM.
> > >>>>>>> vdpa device is in hypevisor.
> > >>>>>>>
> > >>>>>>> How can hypervisor compose a vdpa device without any tool?
> > >>>>>>> How can it tell ethtool, what is supported and what are the
> > defaults?
> > >>>>>>>
> > >>>>>>> I must be misunderstanding your comment about ethtool.
> > >>>>>>> Can you please explain?
> > >>>>>>
> > >>>>>> I am basically saying that we probably want to be able to change
> > >>>>>> MAC of a VDPA device on the host without desroying and recreating
> > >>>>>> the device as long as it's not in use.
> > >>>>> Ok. I understood your comment now.
> > >>>>> Yes, this was the objective which is why they are present as
> > >>>>> independent config knob.
> > >>>>> Jason was suggesting to have them as creation only knobs, which
> > >>>>> requires recreate.
> > >>>>> I don't have strong opinion for either method.
> > >>>>>
> > >>>>> Passing them at creation time is simpler for user.
> > >>>>> If user needs the ability to modify and reuse same device with
> > >>>>> different config, extending such support in future like this
> > >>>>> patch should be possible.
> > >>>>> So there are two questions to close.
> > >>>>> 1. Can we start with config params at vdpa device creation time?
> > >>>> I'm not sure whether we need both but I'd like to see a full API
> > >>>> and I think we all agree host wants ability to tweak mac after
> > >>>> device creation even if guest is not allowed to change mac, right?
> > >>>>
> > >>> Yes.
> > >>> $ vdpa dev add name foo mgmtdev pci/0000:03:00.0 mac
> > >>> 00:11:22:33:44:55 maxvqs 8 mtu 9000
> > >>>
> > >>> The above is the API if we do it at creation time. It is likely simpler
> > >>> for the user to pass the necessary params during creation time.
> > >>>
> > >>>>> 2. Is it ok to have these config params as individual fields at
> > >>>>> netlink U->K UAPI level?
> > >>>>> This is the method proposed in this patch series.
> > >>>>> (Similar to incrementally growing vxlan ip link command).
> > >>>>>
> > >>>>> Or
> > >>>>> They should be packed in a structure between U->K and deal with
> > >>>>> typecasting based on size and more?
> > >>>>> (Jason's input).
> > >>>> I'm inclined to say vxlan is closer to a model to follow.
> > >>> Ok. thanks for the feedback. We are using the model close to vxlan.
> > >>> Let's resolve whether we should have it at creation time, post creation, or both.
> > >>> (a) Creation time
> > >>> Pros:
> > >>> - simpler single api for user
> > >>> - eliminates the need to invent stats reset in a future series
> > >>> Cons:
> > >>> - inability to reuse the device with different config
> > >> This can be solved by destroying the instance and re-creating it with
> > >> different params?
> > >>
> > >>> - This may not be of great advantage, and it is probably fine to
> > >>> have creation time params
> > >>>
> > >>> (b) post creation time:
> > >>> Pros:
> > >>> - able to reuse the device with different config for say different VM.
> > >>> - will require stats reset in future once stats are implemented
> > >> Any reason for doing this other than re-creating the device?
> > > Permissions.
> >
> >
> > I would expect that CAP_NET_ADMIN is required for both cases.
>
> Correct. Patch-3 in this series has the code for CAP_NET_ADMIN for setting the mac, snippet below.
> For vdpa net device addition we do not have the check yet.
>
> You/Michael mentioned that QEMU runs without any permissions in some other thread.
> Do you mean QEMU can run without these capabilities?

Yes.

> If yes, is it fair to ask non-QEMU software, which has higher capabilities than QEMU, to set up the vdpa device, after which QEMU runs with lower capabilities?

Right, e.g. it's the job of libvirt or another privileged process to do
that kind of configuration.

So I don't see how it differs from device creation from the point of view
of permissions.

Thanks

>
> +static int vdpa_dev_net_config_set(struct vdpa_device *vdev,
> +                                  struct sk_buff *skb, struct genl_info *info)
> +{
> +       struct nlattr **nl_attrs = info->attrs;
> +       struct vdpa_dev_set_config config = {};
> +       const u8 *macaddr;
> +       int err;
> +
> +       if (!netlink_capable(skb, CAP_NET_ADMIN))
> +               return -EPERM;
>
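
To make the permission argument above concrete, here is a minimal user-space sketch of the same gate. It is not the patch code: every sketch_* name is hypothetical, and the capability bit is a stand-in for the kernel's netlink_capable(skb, CAP_NET_ADMIN) check quoted above. It only models the idea that a privileged manager (libvirt, say) may set mac and/or mtu, while an unprivileged consumer such as QEMU gets -EPERM.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for CAP_NET_ADMIN; not the kernel definition. */
#define SKETCH_CAP_NET_ADMIN (1u << 0)

/* Hypothetical model of the process sending the config request. */
struct sketch_caller {
	uint32_t caps;
};

/* Hypothetical model of the two config fields discussed in this thread. */
struct sketch_vdpa_dev {
	uint8_t mac[6];
	uint16_t mtu;
};

static bool sketch_capable(const struct sketch_caller *caller, uint32_t cap)
{
	return (caller->caps & cap) != 0;
}

/*
 * Set mac and/or mtu. A NULL pointer means "leave this field unchanged",
 * mirroring the cover letter's note that only mac or only mtu may be set.
 * Returns 0 on success or -EPERM if the caller lacks the capability.
 */
int sketch_dev_config_set(const struct sketch_caller *caller,
			  struct sketch_vdpa_dev *dev,
			  const uint8_t *mac, const uint16_t *mtu)
{
	/* Reject before touching any device state. */
	if (!sketch_capable(caller, SKETCH_CAP_NET_ADMIN))
		return -EPERM;

	if (mac)
		memcpy(dev->mac, mac, sizeof(dev->mac));
	if (mtu)
		dev->mtu = *mtu;
	return 0;
}
```

In this model the same capability would guard a creation-time path as well, which is the point made above: from a permission standpoint, post-creation configuration does not differ from device creation.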


end of thread, other threads:[~2021-08-19  7:15 UTC | newest]

Thread overview: 62+ messages
2021-06-16 19:11 [PATCH linux-next v3 0/6] vdpa: enable user to set mac, mtu Parav Pandit
2021-06-16 19:11 ` [PATCH linux-next v3 1/6] vdpa: Introduce and use vdpa device get, set config helpers Parav Pandit
2021-06-22  7:08   ` Jason Wang
2021-06-16 19:11 ` [PATCH linux-next v3 2/6] vdpa: Introduce query of device config layout Parav Pandit
2021-06-22  7:20   ` Jason Wang
2021-06-22 14:03     ` Parav Pandit
2021-06-23  4:08       ` Jason Wang
2021-06-23  4:22         ` Parav Pandit
2021-06-24  5:43           ` Jason Wang
2021-06-24  6:29             ` Parav Pandit
2021-06-24  7:05               ` Jason Wang
2021-06-24  7:59                 ` Parav Pandit
2021-06-25  3:28                   ` Jason Wang
2021-06-25  6:45                     ` Parav Pandit
2021-06-28  5:03                       ` Jason Wang
2021-06-28 10:56                         ` Parav Pandit
2021-06-29  3:52                           ` Jason Wang
2021-06-29  9:49                             ` Parav Pandit
2021-06-30  4:31                               ` Jason Wang
2021-06-30  6:03                                 ` Parav Pandit
2021-07-01  3:34                                   ` Jason Wang
2021-07-01  7:00                                     ` Parav Pandit
2021-07-01  7:43                                       ` Jason Wang
2021-07-02  6:04                                         ` Parav Pandit
2021-07-05  4:35                                           ` Jason Wang
2021-07-06 17:07                                             ` Parav Pandit
2021-07-07  4:03                                               ` Jason Wang
2021-06-28 22:39                         ` Michael S. Tsirkin
2021-06-29  3:41                           ` Jason Wang
2021-06-29 20:01                             ` Michael S. Tsirkin
2021-06-30  3:46                               ` Jason Wang
2021-06-16 19:11 ` [PATCH linux-next v3 3/6] vdpa: Enable user to set mac and mtu of vdpa device Parav Pandit
2021-06-22  7:43   ` Jason Wang
2021-06-22 14:09     ` Parav Pandit
2021-06-16 19:11 ` [PATCH linux-next v3 4/6] vdpa_sim_net: Enable user to set mac address and mtu Parav Pandit
2021-06-16 19:11 ` [PATCH linux-next v3 5/6] vdpa/mlx5: Support configuration of MAC Parav Pandit
2021-06-16 19:11 ` [PATCH linux-next v3 6/6] vdpa/mlx5: Forward only packets with allowed MAC address Parav Pandit
2021-08-05  9:57 ` [PATCH linux-next v3 0/6] vdpa: enable user to set mac, mtu Michael S. Tsirkin
2021-08-05 10:13   ` Parav Pandit via Virtualization
2021-08-05 12:05     ` Michael S. Tsirkin
2021-08-06  2:50   ` Jason Wang
2021-08-06  8:42     ` Michael S. Tsirkin
2021-08-06  8:55       ` Parav Pandit via Virtualization
2021-08-09  3:07         ` Jason Wang
2021-08-09  3:13           ` Parav Pandit via Virtualization
2021-08-09  3:29             ` Jason Wang
     [not found]           ` <20210809052121.GA209158@mtl-vdi-166.wap.labs.mlnx>
2021-08-09  5:42             ` Parav Pandit via Virtualization
     [not found]               ` <20210809055748.GA210406@mtl-vdi-166.wap.labs.mlnx>
2021-08-09  6:01                 ` Parav Pandit via Virtualization
     [not found]                   ` <20210809060746.GA210718@mtl-vdi-166.wap.labs.mlnx>
2021-08-09  6:10                     ` Parav Pandit via Virtualization
2021-08-09  7:05                       ` Jason Wang
2021-08-16 20:51                         ` Michael S. Tsirkin
2021-08-09  9:40         ` Michael S. Tsirkin
2021-08-09  9:51           ` Parav Pandit via Virtualization
2021-08-16 20:54             ` Michael S. Tsirkin
2021-08-18  3:14               ` Parav Pandit via Virtualization
2021-08-18  4:31                 ` Jason Wang
2021-08-18  4:36                   ` Parav Pandit via Virtualization
2021-08-19  4:18                     ` Jason Wang
2021-08-18 17:33                   ` Michael S. Tsirkin
2021-08-19  4:22                     ` Jason Wang
2021-08-19  5:23                       ` Parav Pandit via Virtualization
2021-08-19  7:15                         ` Jason Wang
