Netdev Archive on lore.kernel.org
* [PATCH 0/7] Introduce vdpa management tool
@ 2020-11-12  6:39 Parav Pandit
  2020-11-12  6:39 ` [PATCH 1/7] vdpa: Add missing comment for virtqueue count Parav Pandit
                   ` (12 more replies)
  0 siblings, 13 replies; 79+ messages in thread
From: Parav Pandit @ 2020-11-12  6:39 UTC (permalink / raw)
  To: virtualization; +Cc: mst, jasowang, parav, elic, netdev

This patchset covers user requirements for managing existing vdpa devices,
using a tool and its internal design notes for kernel drivers.

Background and user requirements:
----------------------------------
(1) Currently a VDPA device is created by the driver when the driver is loaded.
However, the user should be able to choose whether and when to create a vdpa
device for the underlying parent device.

For example, mlx5 PCI VF and subfunction devices support multiple classes of
device such as netdev, vdpa, rdma. However, it is not required to always create
a vdpa device for such a device.

(2) In another use case, a device may support creating one or multiple vdpa
devices of the same or different classes, such as net and block.
Creating vdpa devices at driver load time further limits this use case.

(3) A user should be able to monitor and query vdpa queue level or device level
statistics for a given vdpa device.

(4) A user should be able to query what class of vdpa devices are supported
by its parent device.

(5) A user should be able to view supported features and negotiated features
of the vdpa device.

(6) A user should be able to create a vdpa device in vendor agnostic manner
using single tool.

Hence, it is required to have a tool through which user can create one or more
vdpa devices from a parent device which addresses above user requirements.

Example devices:
----------------
 +-----------+ +-----------+ +---------+ +--------+ +-----------+ 
 |vdpa dev 0 | |vdpa dev 1 | |rdma dev | |netdev  | |vdpa dev 3 |
 |type=net   | |type=block | |mlx5_0   | |ens3f0  | |type=net   |
 +----+------+ +-----+-----+ +----+----+ +-----+--+ +----+------+
      |              |            |            |         |
      |              |            |            |         |
 +----+-----+        |       +----+----+       |    +----+----+
 |  mlx5    +--------+       |mlx5     +-------+    |mlx5     |
 |pci vf 2  |                |pci vf 4 |            |pci sf 8 |
 |03:00.2   |                |03:00.4  |            |mlx5_sf.8|
 +----+-----+                +----+----+            +----+----+
      |                           |                      |
      |                      +----+-----+                |
      +----------------------+mlx5      +----------------+
                             |pci pf 0  |
                             |03:00.0   |
                             +----------+

vdpa tool:
----------
vdpa tool is a tool to create and delete vdpa devices from a parent device. It
also enables the user to query statistics and features, and possibly more
attributes in the future.

vdpa tool command draft:
------------------------
(a) List parent devices which support creating vdpa devices.
It also shows which class types are supported by each parent device.
In the command example below, three parent devices support vdpa device creation.
First is a PCI VF whose BDF is 03:00.2.
Second is a PCI VF whose name is 03:00.4.
Third is a PCI SF whose name is mlx5_core.sf.8.

$ vdpa parentdev list
vdpasim
  supported_classes
    net
pci/0000:03.00:3
  supported_classes
    net block
pci/0000:03.00:4
  supported_classes
    net block
auxiliary/mlx5_core.sf.8
  supported_classes
    net

(b) Now add a vdpa device of networking class and show the device.
$ vdpa dev add parentdev pci/0000:03.00:2 type net name foo0
$ vdpa dev show foo0
foo0: parentdev pci/0000:03.00:2 type network vendor_id 0 max_vqs 2 max_vq_size 256

(c) Show features of a vdpa device
$ vdpa dev features show foo0
supported
  iommu platform
  version 1

(d) Dump vdpa device statistics
$ vdpa dev stats show foo0
kickdoorbells 10
wqes 100

(e) Now delete a vdpa device previously created.
$ vdpa dev del foo0

vdpa tool support in this patchset:
-----------------------------------
vdpa tool is created to create, delete and query vdpa devices.
Examples:
Show the vdpa parent device that supports creating and deleting vdpa devices.

$ vdpa parentdev show
vdpasim:
  supported_classes
    net

$ vdpa parentdev show -jp
{
    "show": {
        "vdpasim": {
            "supported_classes": [ "net" ]
        }
    }
}

Create a vdpa device of type network named "foo2" from the parent device vdpasim:

$ vdpa dev add parentdev vdpasim type net name foo2

Show the newly created vdpa device by its name:
$ vdpa dev show foo2
foo2: type network parentdev vdpasim vendor_id 0 max_vqs 2 max_vq_size 256

$ vdpa dev show foo2 -jp
{
    "dev": {
        "foo2": {
            "type": "network",
            "parentdev": "vdpasim",
            "vendor_id": 0,
            "max_vqs": 2,
            "max_vq_size": 256
        }
    }
}

Delete the vdpa device after its use:
$ vdpa dev del foo2

vdpa tool support by kernel:
----------------------------
vdpa tool user interface will be supported by the existing vdpa kernel framework,
i.e. drivers/vdpa/vdpa.c. It services user commands through a netlink interface.

Each parent device registers supported callback operations with vdpa subsystem
through which vdpa device(s) can be managed.
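
As a rough userspace illustration of the registration idea above (this is not
the kernel code; struct parent_dev, parent_register() and parent_find() are
hypothetical names invented for this sketch), a parent can be modelled as a
handle made of an optional bus name and a device name, plus two mandatory
callbacks:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace stand-in for a vdpa parent device. */
struct parent_dev {
	const char *bus;	/* NULL for a simulated parent such as vdpasim */
	const char *name;	/* device name, e.g. a PCI address */
	int (*dev_add)(struct parent_dev *p, const char *name);
	void (*dev_del)(struct parent_dev *p, const char *name);
};

#define MAX_PARENTS 8
static struct parent_dev *parents[MAX_PARENTS];
static size_t nparents;

/* Register a parent; both callbacks are mandatory. Returns 0 or -1. */
static int parent_register(struct parent_dev *p)
{
	if (!p || !p->name || !p->dev_add || !p->dev_del)
		return -1;
	if (nparents == MAX_PARENTS)
		return -1;
	parents[nparents++] = p;
	return 0;
}

/* Look up a parent by handle: optional bus name plus device name. */
static struct parent_dev *parent_find(const char *bus, const char *name)
{
	assert(name != NULL);
	for (size_t i = 0; i < nparents; i++) {
		struct parent_dev *p = parents[i];

		/* A bus name must be given exactly when the parent has a bus. */
		if ((bus == NULL) != (p->bus == NULL))
			continue;
		if (bus && strcmp(bus, p->bus) != 0)
			continue;
		if (strcmp(name, p->name) == 0)
			return p;
	}
	return NULL;
}

/* Dummy callbacks and two sample parents, for illustration only. */
static int demo_add(struct parent_dev *p, const char *name)
{
	(void)p;
	(void)name;
	return 0;
}

static void demo_del(struct parent_dev *p, const char *name)
{
	(void)p;
	(void)name;
}

static struct parent_dev sim_parent = { NULL, "vdpasim", demo_add, demo_del };
static struct parent_dev pci_parent = { "pci", "0000:03:00.2", demo_add, demo_del };
```

With both sample parents registered, parent_find(NULL, "vdpasim") would return
the simulator and parent_find("pci", "0000:03:00.2") the PCI parent, while
mixing a bus name with a bus-less parent finds nothing.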

FAQs:
-----
1. Where does userspace vdpa tool reside which users can use?
Ans: vdpa tool can possibly reside in iproute2 [1] as it enables user to
create vdpa net devices.

2. Why not create and delete vdpa device using sysfs/configfs?
Ans:
(a) A device creation may involve passing one or more attributes.
Passing multiple attributes and returning an error code and more verbose
information for invalid attributes cannot be handled by sysfs/configfs.

(b) The netlink framework is rich: it enables user space and the kernel driver
to exchange nested attributes.

(c) Exposing device specific files under sysfs without net namespace
awareness exposes details to multiple containers. Instead, exposing
attributes via a netlink socket secures the communication channel with the kernel.

(d) The netlink socket interface makes it possible to run syzkaller kernel tests.
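
To make the attribute argument in (a) and (b) more concrete, here is a small
self-contained sketch of the type-length-value layout that netlink attributes
use: a 16-bit length, a 16-bit type, then the payload padded to 4 bytes. It
deliberately avoids the kernel headers and any netlink library; attr_put(),
attr_put_str(), attr_find() and the DEMO_ATTR_* ids are hypothetical names
made up for this example, not part of the vdpa UAPI.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ATTR_HDRLEN 4u
#define ATTR_ALIGN(len) (((len) + 3u) & ~(size_t)3u)

/* Hypothetical attribute ids, for illustration only. */
enum { DEMO_ATTR_BUS_NAME = 1, DEMO_ATTR_DEV_NAME = 2, DEMO_ATTR_HANDLE = 3 };

/* Append one attribute at offset off; return the new offset, or 0 on error. */
static size_t attr_put(uint8_t *buf, size_t cap, size_t off,
		       uint16_t type, const void *data, uint16_t len)
{
	uint16_t nla_len;
	size_t total;

	assert(buf != NULL && data != NULL);
	if (len > UINT16_MAX - ATTR_HDRLEN)
		return 0;
	nla_len = (uint16_t)(ATTR_HDRLEN + len);	/* header + payload, no padding */
	total = ATTR_ALIGN((size_t)nla_len);
	if (off > cap || total > cap - off)
		return 0;
	memset(buf + off, 0, total);			/* also zeroes the padding */
	memcpy(buf + off, &nla_len, sizeof(nla_len));
	memcpy(buf + off + 2, &type, sizeof(type));
	memcpy(buf + off + ATTR_HDRLEN, data, len);
	return off + total;
}

/* Append a NUL-terminated string attribute. */
static size_t attr_put_str(uint8_t *buf, size_t cap, size_t off,
			   uint16_t type, const char *s)
{
	return attr_put(buf, cap, off, type, s, (uint16_t)(strlen(s) + 1));
}

/* Return a pointer to the payload of the first attribute of this type. */
static const void *attr_find(const uint8_t *buf, size_t len, uint16_t type)
{
	size_t off = 0;

	while (off + ATTR_HDRLEN <= len) {
		uint16_t nla_len, nla_type;

		memcpy(&nla_len, buf + off, sizeof(nla_len));
		memcpy(&nla_type, buf + off + 2, sizeof(nla_type));
		if (nla_len < ATTR_HDRLEN || nla_len > len - off)
			return NULL;			/* malformed stream */
		if (nla_type == type)
			return buf + off + ATTR_HDRLEN;
		off += ATTR_ALIGN((size_t)nla_len);
	}
	return NULL;
}
```

Nesting falls out of the same layout: the payload of a container attribute is
itself a sequence of attributes, so attr_find() can simply be applied again to
the payload it returned. A sysfs file, by contrast, carries one value per
write, which is the limitation answer (a) refers to.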

3. Why not use ioctl() interface?
Ans: An ioctl() interface would replicate the necessary plumbing which already
exists through the netlink socket.

4. What happens when one or more user created vdpa devices exist for a
parent PCI VF or SF and such parent device is removed?
Ans: All user created vdpa devices that belong to the parent are removed.
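
The removal behaviour in answer 4 can be pictured with a tiny userspace model.
This is only a sketch under simplifying assumptions (a fixed-size table,
integer parent ids); demo_dev_add(), demo_parent_remove() and
demo_dev_count() are hypothetical helpers, not functions from this patchset.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_DEVS 8

/* Hypothetical record of one user created vdpa device. */
struct demo_vdpa_dev {
	int in_use;
	int parent_id;		/* which parent this device was created from */
};

static struct demo_vdpa_dev demo_devs[DEMO_MAX_DEVS];

/* Create a device under the given parent; return its slot or -1 if full. */
static int demo_dev_add(int parent_id)
{
	assert(parent_id >= 0);
	for (int i = 0; i < DEMO_MAX_DEVS; i++) {
		if (!demo_devs[i].in_use) {
			demo_devs[i].in_use = 1;
			demo_devs[i].parent_id = parent_id;
			return i;
		}
	}
	return -1;
}

/* Remove a parent: walk all devices and delete those that belong to it. */
static size_t demo_parent_remove(int parent_id)
{
	size_t removed = 0;

	for (int i = 0; i < DEMO_MAX_DEVS; i++) {
		if (demo_devs[i].in_use && demo_devs[i].parent_id == parent_id) {
			demo_devs[i].in_use = 0;
			removed++;
		}
	}
	return removed;
}

/* Count the devices that still exist. */
static size_t demo_dev_count(void)
{
	size_t n = 0;

	for (int i = 0; i < DEMO_MAX_DEVS; i++)
		n += demo_devs[i].in_use ? 1 : 0;
	return n;
}
```

Devices created from other parents are left untouched; only the entries whose
parent matches the one being removed are deleted.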

[1] git://git.kernel.org/pub/scm/network/iproute2/iproute2-next.git

Next steps:
-----------
(a) After this patchset and the iproute2/vdpa inclusion, the remaining two
drivers will be converted to support the vdpa tool instead of creating an
unmanaged default device on driver load.
(b) More net specific parameters such as mac, mtu will be added.
(c) Features bits get and set interface will be added.

Parav Pandit (7):
  vdpa: Add missing comment for virtqueue count
  vdpa: Use simpler version of ida allocation
  vdpa: Extend routine to accept vdpa device name
  vdpa: Define vdpa parent device, ops and a netlink interface
  vdpa: Enable a user to add and delete a vdpa device
  vdpa: Enable user to query vdpa device info
  vdpa/vdpa_sim: Enable user to create vdpasim net devices

 drivers/vdpa/Kconfig              |   1 +
 drivers/vdpa/ifcvf/ifcvf_main.c   |   2 +-
 drivers/vdpa/mlx5/net/mlx5_vnet.c |   2 +-
 drivers/vdpa/vdpa.c               | 511 +++++++++++++++++++++++++++++-
 drivers/vdpa/vdpa_sim/vdpa_sim.c  |  81 ++++-
 include/linux/vdpa.h              |  46 ++-
 include/uapi/linux/vdpa.h         |  41 +++
 7 files changed, 660 insertions(+), 24 deletions(-)
 create mode 100644 include/uapi/linux/vdpa.h

-- 
2.26.2



* [PATCH 1/7] vdpa: Add missing comment for virtqueue count
  2020-11-12  6:39 [PATCH 0/7] Introduce vdpa management tool Parav Pandit
@ 2020-11-12  6:39 ` Parav Pandit
  2020-11-12  6:40 ` [PATCH 2/7] vdpa: Use simpler version of ida allocation Parav Pandit
                   ` (11 subsequent siblings)
  12 siblings, 0 replies; 79+ messages in thread
From: Parav Pandit @ 2020-11-12  6:39 UTC (permalink / raw)
  To: virtualization; +Cc: mst, jasowang, parav, elic, netdev

Add missing comment for the number of virtqueues.

Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Eli Cohen <elic@nvidia.com>
Acked-by: Jason Wang <jasowang@redhat.com>
---
 include/linux/vdpa.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h
index 30bc7a7223bb..0fefeb976877 100644
--- a/include/linux/vdpa.h
+++ b/include/linux/vdpa.h
@@ -42,6 +42,7 @@ struct vdpa_vq_state {
  * @config: the configuration ops for this device.
  * @index: device index
  * @features_valid: were features initialized? for legacy guests
+ * @nvqs: maximum number of supported virtqueues
  */
 struct vdpa_device {
 	struct device dev;
-- 
2.26.2



* [PATCH 2/7] vdpa: Use simpler version of ida allocation
  2020-11-12  6:39 [PATCH 0/7] Introduce vdpa management tool Parav Pandit
  2020-11-12  6:39 ` [PATCH 1/7] vdpa: Add missing comment for virtqueue count Parav Pandit
@ 2020-11-12  6:40 ` Parav Pandit
  2020-11-12  6:40 ` [PATCH 3/7] vdpa: Extend routine to accept vdpa device name Parav Pandit
                   ` (10 subsequent siblings)
  12 siblings, 0 replies; 79+ messages in thread
From: Parav Pandit @ 2020-11-12  6:40 UTC (permalink / raw)
  To: virtualization; +Cc: mst, jasowang, parav, elic, netdev

vdpa doesn't have any specific need to define start and end range of the
device index.
Hence use the simpler version of the ida allocator.

Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Eli Cohen <elic@nvidia.com>
Acked-by: Jason Wang <jasowang@redhat.com>
---
 drivers/vdpa/vdpa.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c
index a69ffc991e13..c0825650c055 100644
--- a/drivers/vdpa/vdpa.c
+++ b/drivers/vdpa/vdpa.c
@@ -89,7 +89,7 @@ struct vdpa_device *__vdpa_alloc_device(struct device *parent,
 	if (!vdev)
 		goto err;
 
-	err = ida_simple_get(&vdpa_index_ida, 0, 0, GFP_KERNEL);
+	err = ida_alloc(&vdpa_index_ida, GFP_KERNEL);
 	if (err < 0)
 		goto err_ida;
 
-- 
2.26.2



* [PATCH 3/7] vdpa: Extend routine to accept vdpa device name
  2020-11-12  6:39 [PATCH 0/7] Introduce vdpa management tool Parav Pandit
  2020-11-12  6:39 ` [PATCH 1/7] vdpa: Add missing comment for virtqueue count Parav Pandit
  2020-11-12  6:40 ` [PATCH 2/7] vdpa: Use simpler version of ida allocation Parav Pandit
@ 2020-11-12  6:40 ` Parav Pandit
  2020-11-12  6:40 ` [PATCH 4/7] vdpa: Define vdpa parent device, ops and a netlink interface Parav Pandit
                   ` (9 subsequent siblings)
  12 siblings, 0 replies; 79+ messages in thread
From: Parav Pandit @ 2020-11-12  6:40 UTC (permalink / raw)
  To: virtualization; +Cc: mst, jasowang, parav, elic, netdev

In a subsequent patch, when a user-initiated command creates a vdpa device,
the user chooses the name of the vdpa device.
To support this, extend the device allocation API to accept a name
specified by the calling driver.

Split the device unregistration into device delete and device put so that
the device can be removed from the list after it is deleted.
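
The naming rule this patch introduces can be sketched in plain userspace C:
use the caller's name when one is given, otherwise fall back to "vdpa<index>",
and refuse to register a name that is already taken. The helpers below
(dev_alloc_name(), dev_register()) and the fixed-size table are hypothetical
and only mirror the behaviour; they are not the kernel implementation.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define MAX_DEVS 16
#define NAME_LEN 32

static char dev_names[MAX_DEVS][NAME_LEN];
static unsigned int ndevs;
static unsigned int next_index;

/* Every device consumes an index; the index is visible only in default names. */
static int dev_alloc_name(char *out, size_t outlen, const char *name)
{
	unsigned int index = next_index++;
	int n;

	assert(out != NULL && outlen > 0);
	if (name)
		n = snprintf(out, outlen, "%s", name);
	else
		n = snprintf(out, outlen, "vdpa%u", index);
	if (n < 0 || (size_t)n >= outlen)
		return -ENAMETOOLONG;
	return 0;
}

/* Register a device name; duplicates are rejected with -EEXIST. */
static int dev_register(const char *name)
{
	assert(name != NULL);
	if (strlen(name) >= NAME_LEN)
		return -ENAMETOOLONG;
	for (unsigned int i = 0; i < ndevs; i++) {
		if (strcmp(dev_names[i], name) == 0)
			return -EEXIST;
	}
	if (ndevs == MAX_DEVS)
		return -ENOSPC;
	snprintf(dev_names[ndevs++], NAME_LEN, "%s", name);
	return 0;
}
```

In this model, a first device named "foo2" followed by an unnamed one yields
"foo2" and "vdpa1", and a second attempt to register "foo2" fails with
-EEXIST.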

Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Eli Cohen <elic@nvidia.com>
Acked-by: Jason Wang <jasowang@redhat.com>
---
 drivers/vdpa/ifcvf/ifcvf_main.c   |  2 +-
 drivers/vdpa/mlx5/net/mlx5_vnet.c |  2 +-
 drivers/vdpa/vdpa.c               | 36 +++++++++++++++++++++++++++----
 drivers/vdpa/vdpa_sim/vdpa_sim.c  |  2 +-
 include/linux/vdpa.h              |  7 +++---
 5 files changed, 38 insertions(+), 11 deletions(-)

diff --git a/drivers/vdpa/ifcvf/ifcvf_main.c b/drivers/vdpa/ifcvf/ifcvf_main.c
index 8b4028556cb6..23474af7da40 100644
--- a/drivers/vdpa/ifcvf/ifcvf_main.c
+++ b/drivers/vdpa/ifcvf/ifcvf_main.c
@@ -439,7 +439,7 @@ static int ifcvf_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 
 	adapter = vdpa_alloc_device(struct ifcvf_adapter, vdpa,
 				    dev, &ifc_vdpa_ops,
-				    IFCVF_MAX_QUEUE_PAIRS * 2);
+				    IFCVF_MAX_QUEUE_PAIRS * 2, NULL);
 	if (adapter == NULL) {
 		IFCVF_ERR(pdev, "Failed to allocate vDPA structure");
 		return -ENOMEM;
diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c
index 5316e51e72d4..cf9fc51071c8 100644
--- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
+++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
@@ -1958,7 +1958,7 @@ static int mlx5v_probe(struct auxiliary_device *adev,
 	max_vqs = min_t(u32, max_vqs, MLX5_MAX_SUPPORTED_VQS);
 
 	ndev = vdpa_alloc_device(struct mlx5_vdpa_net, mvdev.vdev, mdev->device, &mlx5_vdpa_ops,
-				 2 * mlx5_vdpa_max_qps(max_vqs));
+				 2 * mlx5_vdpa_max_qps(max_vqs), NULL);
 	if (IS_ERR(ndev))
 		return PTR_ERR(ndev);
 
diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c
index c0825650c055..3c9cade05233 100644
--- a/drivers/vdpa/vdpa.c
+++ b/drivers/vdpa/vdpa.c
@@ -12,6 +12,8 @@
 #include <linux/slab.h>
 #include <linux/vdpa.h>
 
+/* A global mutex that protects vdpa parent and device level operations. */
+static DEFINE_MUTEX(vdpa_dev_mutex);
 static DEFINE_IDA(vdpa_index_ida);
 
 static int vdpa_dev_probe(struct device *d)
@@ -63,6 +65,7 @@ static void vdpa_release_dev(struct device *d)
  * @config: the bus operations that is supported by this device
  * @nvqs: number of virtqueues supported by this device
  * @size: size of the parent structure that contains private data
+ * @name: name of the vdpa device; optional.
  *
  * Driver should use vdpa_alloc_device() wrapper macro instead of
  * using this directly.
@@ -72,8 +75,7 @@ static void vdpa_release_dev(struct device *d)
  */
 struct vdpa_device *__vdpa_alloc_device(struct device *parent,
 					const struct vdpa_config_ops *config,
-					int nvqs,
-					size_t size)
+					int nvqs, size_t size, const char *name)
 {
 	struct vdpa_device *vdev;
 	int err = -EINVAL;
@@ -101,7 +103,10 @@ struct vdpa_device *__vdpa_alloc_device(struct device *parent,
 	vdev->features_valid = false;
 	vdev->nvqs = nvqs;
 
-	err = dev_set_name(&vdev->dev, "vdpa%u", vdev->index);
+	if (name)
+		err = dev_set_name(&vdev->dev, "%s", name);
+	else
+		err = dev_set_name(&vdev->dev, "vdpa%u", vdev->index);
 	if (err)
 		goto err_name;
 
@@ -118,6 +123,13 @@ struct vdpa_device *__vdpa_alloc_device(struct device *parent,
 }
 EXPORT_SYMBOL_GPL(__vdpa_alloc_device);
 
+static int vdpa_name_match(struct device *dev, const void *data)
+{
+	struct vdpa_device *vdev = container_of(dev, struct vdpa_device, dev);
+
+	return (strcmp(dev_name(&vdev->dev), data) == 0);
+}
+
 /**
  * vdpa_register_device - register a vDPA device
  * Callers must have a succeed call of vdpa_alloc_device() before.
@@ -127,7 +139,21 @@ EXPORT_SYMBOL_GPL(__vdpa_alloc_device);
  */
 int vdpa_register_device(struct vdpa_device *vdev)
 {
-	return device_add(&vdev->dev);
+	struct device *dev;
+	int err;
+
+	mutex_lock(&vdpa_dev_mutex);
+	dev = bus_find_device(&vdpa_bus, NULL, dev_name(&vdev->dev), vdpa_name_match);
+	if (dev) {
+		put_device(dev);
+		err = -EEXIST;
+		goto name_err;
+	}
+
+	err = device_add(&vdev->dev);
+name_err:
+	mutex_unlock(&vdpa_dev_mutex);
+	return err;
 }
 EXPORT_SYMBOL_GPL(vdpa_register_device);
 
@@ -137,7 +163,9 @@ EXPORT_SYMBOL_GPL(vdpa_register_device);
  */
 void vdpa_unregister_device(struct vdpa_device *vdev)
 {
+	mutex_lock(&vdpa_dev_mutex);
 	device_unregister(&vdev->dev);
+	mutex_unlock(&vdpa_dev_mutex);
 }
 EXPORT_SYMBOL_GPL(vdpa_unregister_device);
 
diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.c b/drivers/vdpa/vdpa_sim/vdpa_sim.c
index 6a90fdb9cbfc..aed1bb7770ab 100644
--- a/drivers/vdpa/vdpa_sim/vdpa_sim.c
+++ b/drivers/vdpa/vdpa_sim/vdpa_sim.c
@@ -357,7 +357,7 @@ static struct vdpasim *vdpasim_create(void)
 	else
 		ops = &vdpasim_net_config_ops;
 
-	vdpasim = vdpa_alloc_device(struct vdpasim, vdpa, NULL, ops, VDPASIM_VQ_NUM);
+	vdpasim = vdpa_alloc_device(struct vdpasim, vdpa, NULL, ops, VDPASIM_VQ_NUM, NULL);
 	if (!vdpasim)
 		goto err_alloc;
 
diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h
index 0fefeb976877..5700baa22356 100644
--- a/include/linux/vdpa.h
+++ b/include/linux/vdpa.h
@@ -245,15 +245,14 @@ struct vdpa_config_ops {
 
 struct vdpa_device *__vdpa_alloc_device(struct device *parent,
 					const struct vdpa_config_ops *config,
-					int nvqs,
-					size_t size);
+					int nvqs, size_t size, const char *name);
 
-#define vdpa_alloc_device(dev_struct, member, parent, config, nvqs)   \
+#define vdpa_alloc_device(dev_struct, member, parent, config, nvqs, name)   \
 			  container_of(__vdpa_alloc_device( \
 				       parent, config, nvqs, \
 				       sizeof(dev_struct) + \
 				       BUILD_BUG_ON_ZERO(offsetof( \
-				       dev_struct, member))), \
+				       dev_struct, member)), name), \
 				       dev_struct, member)
 
 int vdpa_register_device(struct vdpa_device *vdev);
-- 
2.26.2



* [PATCH 4/7] vdpa: Define vdpa parent device, ops and a netlink interface
  2020-11-12  6:39 [PATCH 0/7] Introduce vdpa management tool Parav Pandit
                   ` (2 preceding siblings ...)
  2020-11-12  6:40 ` [PATCH 3/7] vdpa: Extend routine to accept vdpa device name Parav Pandit
@ 2020-11-12  6:40 ` Parav Pandit
  2020-11-12  6:40 ` [PATCH 5/7] vdpa: Enable a user to add and delete a vdpa device Parav Pandit
                   ` (8 subsequent siblings)
  12 siblings, 0 replies; 79+ messages in thread
From: Parav Pandit @ 2020-11-12  6:40 UTC (permalink / raw)
  To: virtualization; +Cc: mst, jasowang, parav, elic, netdev

To add one or more VDPA devices, define a parent device which allows
adding or removing a vdpa device. A parent device defines a set of callbacks
to manage vdpa devices.

To begin with, it defines add and remove callbacks through which a user
defined vdpa device can be added or removed.

A parent device is identified by a unique handle consisting of the
parent device name and, optionally, the bus name.

Hence, introduce routines through which a driver can register a
parent device and its callback operations for adding and removing
a vdpa device.

Introduce a vdpa netlink socket family so that the user can query a parent
device and its attributes.

Example of showing a vdpa parent device which allows creating a vdpa device of
networking class (device id = 0x1) of virtio specification 1.1
section 5.1.1.

$ vdpa parentdev show
vdpasim:
  supported_classes:
    net

Example of showing vdpa parent device in JSON format.

$ vdpa parentdev show -jp
{
    "show": {
        "vdpasim": {
            "supported_classes": [ "net" ]
        }
    }
}
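
In this patch the supported classes travel as a single u64 in which bit N is
set when virtio device id N is supported. A userspace consumer could turn
that mask back into the names shown above roughly as follows. This is a
sketch: classes_to_mask(), class_name() and mask_to_string() are hypothetical
helpers, and only the net (1) and block (2) device ids from the virtio
specification are mapped.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Virtio device ids as assigned by the virtio specification. */
#define DEMO_VIRTIO_ID_NET	1u
#define DEMO_VIRTIO_ID_BLOCK	2u

/* Fold a zero-terminated id list into a bitmask: bit N set means id N. */
static uint64_t classes_to_mask(const uint32_t *ids)
{
	uint64_t mask = 0;

	assert(ids != NULL);
	for (size_t i = 0; ids[i] != 0; i++) {
		if (ids[i] < 64)	/* ids that do not fit in 64 bits are skipped */
			mask |= UINT64_C(1) << ids[i];
	}
	return mask;
}

/* Map a device id to a short class name, or NULL if it is not known here. */
static const char *class_name(unsigned int id)
{
	switch (id) {
	case DEMO_VIRTIO_ID_NET:
		return "net";
	case DEMO_VIRTIO_ID_BLOCK:
		return "block";
	default:
		return NULL;
	}
}

/* Render the mask as space separated class names into out; returns out. */
static char *mask_to_string(uint64_t mask, char *out, size_t outlen)
{
	size_t used = 0;

	assert(out != NULL && outlen > 0);
	out[0] = '\0';
	for (unsigned int id = 1; id < 64; id++) {
		const char *name;
		int n;

		if (!(mask & (UINT64_C(1) << id)))
			continue;
		name = class_name(id);
		if (!name)
			continue;	/* unknown classes are not printed */
		n = snprintf(out + used, outlen - used, "%s%s",
			     used ? " " : "", name);
		if (n < 0 || (size_t)n >= outlen - used) {
			out[used] = '\0';	/* drop the partial name */
			break;
		}
		used += (size_t)n;
	}
	return out;
}
```

For a parent whose id table lists net and block, the mask would be 0x6 and
the rendered string "net block"; a net-only parent such as vdpasim gives 0x2
and "net".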

Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Eli Cohen <elic@nvidia.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
---
 drivers/vdpa/Kconfig      |   1 +
 drivers/vdpa/vdpa.c       | 213 +++++++++++++++++++++++++++++++++++++-
 include/linux/vdpa.h      |  32 ++++++
 include/uapi/linux/vdpa.h |  32 ++++++
 4 files changed, 277 insertions(+), 1 deletion(-)
 create mode 100644 include/uapi/linux/vdpa.h

diff --git a/drivers/vdpa/Kconfig b/drivers/vdpa/Kconfig
index d7d32b656102..8ae491a74ebb 100644
--- a/drivers/vdpa/Kconfig
+++ b/drivers/vdpa/Kconfig
@@ -1,6 +1,7 @@
 # SPDX-License-Identifier: GPL-2.0-only
 menuconfig VDPA
 	tristate "vDPA drivers"
+	depends on NET
 	help
 	  Enable this module to support vDPA device that uses a
 	  datapath which complies with virtio specifications with
diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c
index 3c9cade05233..273639038851 100644
--- a/drivers/vdpa/vdpa.c
+++ b/drivers/vdpa/vdpa.c
@@ -11,11 +11,17 @@
 #include <linux/idr.h>
 #include <linux/slab.h>
 #include <linux/vdpa.h>
+#include <uapi/linux/vdpa.h>
+#include <net/genetlink.h>
+#include <linux/mod_devicetable.h>
 
+static LIST_HEAD(pdev_head);
 /* A global mutex that protects vdpa parent and device level operations. */
 static DEFINE_MUTEX(vdpa_dev_mutex);
 static DEFINE_IDA(vdpa_index_ida);
 
+static struct genl_family vdpa_nl_family;
+
 static int vdpa_dev_probe(struct device *d)
 {
 	struct vdpa_device *vdev = dev_to_vdpa(d);
@@ -195,13 +201,218 @@ void vdpa_unregister_driver(struct vdpa_driver *drv)
 }
 EXPORT_SYMBOL_GPL(vdpa_unregister_driver);
 
+/**
+ * vdpa_parentdev_register - register a vdpa parent device
+ *
+ * @pdev: Pointer to vdpa parent device
+ * vdpa_parentdev_register() register a vdpa parent device which supports
+ * vdpa device management.
+ */
+int vdpa_parentdev_register(struct vdpa_parent_dev *pdev)
+{
+	if (!pdev->device || !pdev->ops || !pdev->ops->dev_add || !pdev->ops->dev_del)
+		return -EINVAL;
+
+	INIT_LIST_HEAD(&pdev->list);
+	mutex_lock(&vdpa_dev_mutex);
+	list_add_tail(&pdev->list, &pdev_head);
+	mutex_unlock(&vdpa_dev_mutex);
+	return 0;
+}
+EXPORT_SYMBOL_GPL(vdpa_parentdev_register);
+
+void vdpa_parentdev_unregister(struct vdpa_parent_dev *pdev)
+{
+	mutex_lock(&vdpa_dev_mutex);
+	list_del(&pdev->list);
+	mutex_unlock(&vdpa_dev_mutex);
+}
+EXPORT_SYMBOL_GPL(vdpa_parentdev_unregister);
+
+static bool parentdev_handle_match(const struct vdpa_parent_dev *pdev,
+				   const char *busname, const char *devname)
+{
+	/* Bus name is optional for simulated parent device, so ignore the parent
+	 * when bus is provided.
+	 */
+	if ((busname && !pdev->device->bus) || (!busname && pdev->device->bus))
+		return false;
+
+	if (!busname && strcmp(dev_name(pdev->device), devname) == 0)
+		return true;
+
+	if (busname && (strcmp(pdev->device->bus->name, busname) == 0) &&
+	    (strcmp(dev_name(pdev->device), devname) == 0))
+		return true;
+
+	return false;
+}
+
+static struct vdpa_parent_dev *vdpa_parentdev_get_from_attr(struct nlattr **attrs)
+{
+	struct vdpa_parent_dev *pdev;
+	const char *busname = NULL;
+	const char *devname;
+
+	if (!attrs[VDPA_ATTR_PARENTDEV_DEV_NAME])
+		return ERR_PTR(-EINVAL);
+	devname = nla_data(attrs[VDPA_ATTR_PARENTDEV_DEV_NAME]);
+	if (attrs[VDPA_ATTR_PARENTDEV_BUS_NAME])
+		busname = nla_data(attrs[VDPA_ATTR_PARENTDEV_BUS_NAME]);
+
+	list_for_each_entry(pdev, &pdev_head, list) {
+		if (parentdev_handle_match(pdev, busname, devname))
+			return pdev;
+	}
+	return ERR_PTR(-ENODEV);
+}
+
+static int vdpa_nl_parentdev_handle_fill(struct sk_buff *msg, const struct vdpa_parent_dev *pdev)
+{
+	if (pdev->device->bus &&
+	    nla_put_string(msg, VDPA_ATTR_PARENTDEV_BUS_NAME, pdev->device->bus->name))
+		return -EMSGSIZE;
+	if (nla_put_string(msg, VDPA_ATTR_PARENTDEV_DEV_NAME, dev_name(pdev->device)))
+		return -EMSGSIZE;
+	return 0;
+}
+
+static int vdpa_parentdev_fill(const struct vdpa_parent_dev *pdev, struct sk_buff *msg,
+			       u32 portid, u32 seq, int flags)
+{
+	u64 supported_classes = 0;
+	void *hdr;
+	int i = 0;
+	int err;
+
+	hdr = genlmsg_put(msg, portid, seq, &vdpa_nl_family, flags, VDPA_CMD_PARENTDEV_NEW);
+	if (!hdr)
+		return -EMSGSIZE;
+	err = vdpa_nl_parentdev_handle_fill(msg, pdev);
+	if (err)
+		goto msg_err;
+
+	while (pdev->id_table[i].device) {
+		supported_classes |= BIT(pdev->id_table[i].device);
+		i++;
+	}
+
+	if (nla_put_u64_64bit(msg, VDPA_ATTR_PARENTDEV_SUPPORTED_CLASSES,
+			      supported_classes, VDPA_ATTR_UNSPEC)) {
+		err = -EMSGSIZE;
+		goto msg_err;
+	}
+
+	genlmsg_end(msg, hdr);
+	return 0;
+
+msg_err:
+	genlmsg_cancel(msg, hdr);
+	return err;
+}
+
+static int vdpa_nl_cmd_parentdev_get_doit(struct sk_buff *skb, struct genl_info *info)
+{
+	struct vdpa_parent_dev *pdev;
+	struct sk_buff *msg;
+	int err;
+
+	msg = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL);
+	if (!msg)
+		return -ENOMEM;
+
+	mutex_lock(&vdpa_dev_mutex);
+	pdev = vdpa_parentdev_get_from_attr(info->attrs);
+	if (IS_ERR(pdev)) {
+		mutex_unlock(&vdpa_dev_mutex);
+		NL_SET_ERR_MSG_MOD(info->extack, "Fail to find the specified parent device");
+		err = PTR_ERR(pdev);
+		goto out;
+	}
+
+	err = vdpa_parentdev_fill(pdev, msg, info->snd_portid, info->snd_seq, 0);
+	mutex_unlock(&vdpa_dev_mutex);
+	if (err)
+		goto out;
+	err = genlmsg_reply(msg, info);
+	return err;
+
+out:
+	nlmsg_free(msg);
+	return err;
+}
+
+static int
+vdpa_nl_cmd_parentdev_get_dumpit(struct sk_buff *msg, struct netlink_callback *cb)
+{
+	struct vdpa_parent_dev *pdev;
+	int start = cb->args[0];
+	int idx = 0;
+	int err;
+
+	mutex_lock(&vdpa_dev_mutex);
+	list_for_each_entry(pdev, &pdev_head, list) {
+		if (idx < start) {
+			idx++;
+			continue;
+		}
+		err = vdpa_parentdev_fill(pdev, msg, NETLINK_CB(cb->skb).portid,
+					  cb->nlh->nlmsg_seq, NLM_F_MULTI);
+		if (err)
+			goto out;
+		idx++;
+	}
+out:
+	mutex_unlock(&vdpa_dev_mutex);
+	cb->args[0] = idx;
+	return msg->len;
+}
+
+static const struct nla_policy vdpa_nl_policy[VDPA_ATTR_MAX] = {
+	[VDPA_ATTR_PARENTDEV_BUS_NAME] = { .type = NLA_NUL_STRING },
+	[VDPA_ATTR_PARENTDEV_DEV_NAME] = { .type = NLA_STRING },
+};
+
+static const struct genl_ops vdpa_nl_ops[] = {
+	{
+		.cmd = VDPA_CMD_PARENTDEV_GET,
+		.validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP,
+		.doit = vdpa_nl_cmd_parentdev_get_doit,
+		.dumpit = vdpa_nl_cmd_parentdev_get_dumpit,
+	},
+};
+
+static struct genl_family vdpa_nl_family __ro_after_init = {
+	.name = VDPA_GENL_NAME,
+	.version = VDPA_GENL_VERSION,
+	.maxattr = VDPA_ATTR_MAX,
+	.policy = vdpa_nl_policy,
+	.netnsok = false,
+	.module = THIS_MODULE,
+	.ops = vdpa_nl_ops,
+	.n_ops = ARRAY_SIZE(vdpa_nl_ops),
+};
+
 static int vdpa_init(void)
 {
-	return bus_register(&vdpa_bus);
+	int err;
+
+	err = bus_register(&vdpa_bus);
+	if (err)
+		return err;
+	err = genl_register_family(&vdpa_nl_family);
+	if (err)
+		goto err;
+	return 0;
+
+err:
+	bus_unregister(&vdpa_bus);
+	return err;
 }
 
 static void __exit vdpa_exit(void)
 {
+	genl_unregister_family(&vdpa_nl_family);
 	bus_unregister(&vdpa_bus);
 	ida_destroy(&vdpa_index_ida);
 }
diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h
index 5700baa22356..3d6bc1fb909d 100644
--- a/include/linux/vdpa.h
+++ b/include/linux/vdpa.h
@@ -35,6 +35,8 @@ struct vdpa_vq_state {
 	u16	avail_index;
 };
 
+struct vdpa_parent_dev;
+
 /**
  * vDPA device - representation of a vDPA device
  * @dev: underlying device
@@ -335,4 +337,34 @@ static inline void vdpa_get_config(struct vdpa_device *vdev, unsigned offset,
 	ops->get_config(vdev, offset, buf, len);
 }
 
+/**
+ * vdpa_dev_ops - vdpa device ops
+ * @dev_add:	Add a vdpa device using alloc and register
+ *		@pdev: parent device to use for device addition
+ *		@name: name of the new vdpa device
+ *		@device_id: device id of the new vdpa device
+ *		Driver need to add a new device using vdpa_register_device() after
+ *		fully initializing the vdpa device. On successful addition driver
+ *		must return a valid pointer of vdpa device or ERR_PTR for the error.
+ * @dev_del:	Remove a vdpa device using unregister
+ *		@pdev: parent device to use for device removal
+ *		@dev: vdpa device to remove
+ *		Driver need to remove the specified device by calling vdpa_unregister_device().
+ */
+struct vdpa_dev_ops {
+	struct vdpa_device* (*dev_add)(struct vdpa_parent_dev *pdev, const char *name,
+				       u32 device_id);
+	void (*dev_del)(struct vdpa_parent_dev *pdev, struct vdpa_device *dev);
+};
+
+struct vdpa_parent_dev {
+	struct device *device;
+	const struct vdpa_dev_ops *ops;
+	const struct virtio_device_id *id_table; /* supported ids */
+	struct list_head list;
+};
+
+int vdpa_parentdev_register(struct vdpa_parent_dev *pdev);
+void vdpa_parentdev_unregister(struct vdpa_parent_dev *pdev);
+
 #endif /* _LINUX_VDPA_H */
diff --git a/include/uapi/linux/vdpa.h b/include/uapi/linux/vdpa.h
new file mode 100644
index 000000000000..6d88022f6a95
--- /dev/null
+++ b/include/uapi/linux/vdpa.h
@@ -0,0 +1,32 @@
+/* SPDX-License-Identifier: GPL-2.0+ WITH Linux-syscall-note */
+/*
+ * vdpa device management interface
+ * Copyright (c) 2020 Mellanox Technologies Ltd. All rights reserved.
+ */
+
+#ifndef _UAPI_LINUX_VDPA_H_
+#define _UAPI_LINUX_VDPA_H_
+
+#define VDPA_GENL_NAME "vdpa"
+#define VDPA_GENL_VERSION 0x1
+#define VDPA_GENL_MCGRP_CONFIG_NAME "config"
+
+enum vdpa_command {
+	VDPA_CMD_UNSPEC,
+	VDPA_CMD_PARENTDEV_NEW,
+	VDPA_CMD_PARENTDEV_GET,		/* can dump */
+};
+
+enum vdpa_attr {
+	VDPA_ATTR_UNSPEC,
+
+	/* bus name (optional) + dev name together make the parent device handle */
+	VDPA_ATTR_PARENTDEV_BUS_NAME,		/* string */
+	VDPA_ATTR_PARENTDEV_DEV_NAME,		/* string */
+	VDPA_ATTR_PARENTDEV_SUPPORTED_CLASSES,	/* u64 */
+
+	/* new attributes must be added above here */
+	VDPA_ATTR_MAX,
+};
+
+#endif
-- 
2.26.2



* [PATCH 5/7] vdpa: Enable a user to add and delete a vdpa device
  2020-11-12  6:39 [PATCH 0/7] Introduce vdpa management tool Parav Pandit
                   ` (3 preceding siblings ...)
  2020-11-12  6:40 ` [PATCH 4/7] vdpa: Define vdpa parent device, ops and a netlink interface Parav Pandit
@ 2020-11-12  6:40 ` Parav Pandit
  2020-11-12  6:40 ` [PATCH 6/7] vdpa: Enable user to query vdpa device info Parav Pandit
                   ` (7 subsequent siblings)
  12 siblings, 0 replies; 79+ messages in thread
From: Parav Pandit @ 2020-11-12  6:40 UTC (permalink / raw)
  To: virtualization; +Cc: mst, jasowang, parav, elic, netdev

Add the ability to add and delete a vdpa device.

Examples:
Create a vdpa device of type network named "foo2" from
the parent device vdpasim:

$ vdpa dev add parentdev vdpasim type net name foo2

Delete the vdpa device after its use:
$ vdpa dev del foo2

Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Eli Cohen <elic@nvidia.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
---
 drivers/vdpa/vdpa.c       | 149 +++++++++++++++++++++++++++++++++++---
 include/linux/vdpa.h      |   6 ++
 include/uapi/linux/vdpa.h |   5 ++
 3 files changed, 150 insertions(+), 10 deletions(-)

diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c
index 273639038851..fcbdc8f10206 100644
--- a/drivers/vdpa/vdpa.c
+++ b/drivers/vdpa/vdpa.c
@@ -136,6 +136,37 @@ static int vdpa_name_match(struct device *dev, const void *data)
 	return (strcmp(dev_name(&vdev->dev), data) == 0);
 }
 
+static int __vdpa_register_device(struct vdpa_device *vdev)
+{
+	struct device *dev;
+
+	lockdep_assert_held(&vdpa_dev_mutex);
+	dev = bus_find_device(&vdpa_bus, NULL, dev_name(&vdev->dev), vdpa_name_match);
+	if (dev) {
+		put_device(dev);
+		return -EEXIST;
+	}
+	return device_add(&vdev->dev);
+}
+
+/**
+ * _vdpa_register_device - register a vDPA device with vdpa lock held
+ * Caller must have a succeed call of vdpa_alloc_device() before.
+ * Caller must invoke this routine as part of parent device dev_add() callback
+ * after setting up valid parentdev for this vdpa device.
+ * @vdev: the vdpa device to be registered to vDPA bus
+ *
+ * Returns an error when fail to add device to vDPA bus
+ */
+int _vdpa_register_device(struct vdpa_device *vdev)
+{
+	if (!vdev->pdev)
+		return -EINVAL;
+
+	return __vdpa_register_device(vdev);
+}
+EXPORT_SYMBOL_GPL(_vdpa_register_device);
+
 /**
  * vdpa_register_device - register a vDPA device
  * Callers must have a succeed call of vdpa_alloc_device() before.
@@ -145,24 +176,28 @@ static int vdpa_name_match(struct device *dev, const void *data)
  */
 int vdpa_register_device(struct vdpa_device *vdev)
 {
-	struct device *dev;
 	int err;
 
 	mutex_lock(&vdpa_dev_mutex);
-	dev = bus_find_device(&vdpa_bus, NULL, dev_name(&vdev->dev), vdpa_name_match);
-	if (dev) {
-		put_device(dev);
-		err = -EEXIST;
-		goto name_err;
-	}
-
-	err = device_add(&vdev->dev);
-name_err:
+	err = __vdpa_register_device(vdev);
 	mutex_unlock(&vdpa_dev_mutex);
 	return err;
 }
 EXPORT_SYMBOL_GPL(vdpa_register_device);
 
+/**
+ * _vdpa_unregister_device - unregister a vDPA device
+ * Caller must invoke this routine as part of parent device dev_del() callback.
+ * @vdev: the vdpa device to be unregistered from the vDPA bus
+ */
+void _vdpa_unregister_device(struct vdpa_device *vdev)
+{
+	lockdep_assert_held(&vdpa_dev_mutex);
+	WARN_ON(!vdev->pdev);
+	device_unregister(&vdev->dev);
+}
+EXPORT_SYMBOL_GPL(_vdpa_unregister_device);
+
 /**
  * vdpa_unregister_device - unregister a vDPA device
  * @vdev: the vdpa device to be unregisted from vDPA bus
@@ -221,10 +256,25 @@ int vdpa_parentdev_register(struct vdpa_parent_dev *pdev)
 }
 EXPORT_SYMBOL_GPL(vdpa_parentdev_register);
 
+static int vdpa_match_remove(struct device *dev, void *data)
+{
+	struct vdpa_device *vdev = container_of(dev, struct vdpa_device, dev);
+	struct vdpa_parent_dev *pdev = vdev->pdev;
+
+	if (pdev == data)
+		pdev->ops->dev_del(pdev, vdev);
+	return 0;
+}
+
 void vdpa_parentdev_unregister(struct vdpa_parent_dev *pdev)
 {
 	mutex_lock(&vdpa_dev_mutex);
+
 	list_del(&pdev->list);
+
+	/* Find all the devices that belong to this parent device and delete them. */
+	bus_for_each_dev(&vdpa_bus, NULL, pdev, vdpa_match_remove);
+
 	mutex_unlock(&vdpa_dev_mutex);
 }
 EXPORT_SYMBOL_GPL(vdpa_parentdev_unregister);
@@ -368,9 +418,76 @@ vdpa_nl_cmd_parentdev_get_dumpit(struct sk_buff *msg, struct netlink_callback *c
 	return msg->len;
 }
 
+static int vdpa_nl_cmd_dev_add_set_doit(struct sk_buff *skb, struct genl_info *info)
+{
+	struct vdpa_parent_dev *pdev;
+	struct vdpa_device *vdev;
+	const char *name;
+	u32 device_id;
+	int err = 0;
+
+	if (!info->attrs[VDPA_ATTR_DEV_NAME] || !info->attrs[VDPA_ATTR_DEV_ID])
+		return -EINVAL;
+
+	name = nla_data(info->attrs[VDPA_ATTR_DEV_NAME]);
+	device_id = nla_get_u32(info->attrs[VDPA_ATTR_DEV_ID]);
+
+	mutex_lock(&vdpa_dev_mutex);
+	pdev = vdpa_parentdev_get_from_attr(info->attrs);
+	if (IS_ERR(pdev)) {
+		NL_SET_ERR_MSG_MOD(info->extack, "Fail to find the specified parent device");
+		err = PTR_ERR(pdev);
+		goto err;
+	}
+
+	vdev = pdev->ops->dev_add(pdev, name, device_id);
+	if (IS_ERR(vdev))
+		err = PTR_ERR(vdev);
+
+err:
+	mutex_unlock(&vdpa_dev_mutex);
+	return err;
+}
+
+static int vdpa_nl_cmd_dev_del_set_doit(struct sk_buff *skb, struct genl_info *info)
+{
+	struct vdpa_parent_dev *pdev;
+	struct vdpa_device *vdev;
+	struct device *dev;
+	const char *name;
+	int err = 0;
+
+	if (!info->attrs[VDPA_ATTR_DEV_NAME])
+		return -EINVAL;
+	name = nla_data(info->attrs[VDPA_ATTR_DEV_NAME]);
+
+	mutex_lock(&vdpa_dev_mutex);
+	dev = bus_find_device(&vdpa_bus, NULL, name, vdpa_name_match);
+	if (!dev) {
+		NL_SET_ERR_MSG_MOD(info->extack, "device not found");
+		err = -ENODEV;
+		goto dev_err;
+	}
+	vdev = container_of(dev, struct vdpa_device, dev);
+	if (!vdev->pdev) {
+		NL_SET_ERR_MSG_MOD(info->extack, "Only user created device can be deleted by user");
+		err = -EINVAL;
+		goto pdev_err;
+	}
+	pdev = vdev->pdev;
+	pdev->ops->dev_del(pdev, vdev);
+pdev_err:
+	put_device(dev);
+dev_err:
+	mutex_unlock(&vdpa_dev_mutex);
+	return err;
+}
+
 static const struct nla_policy vdpa_nl_policy[VDPA_ATTR_MAX] = {
 	[VDPA_ATTR_PARENTDEV_BUS_NAME] = { .type = NLA_NUL_STRING },
 	[VDPA_ATTR_PARENTDEV_DEV_NAME] = { .type = NLA_STRING },
+	[VDPA_ATTR_DEV_NAME] = { .type = NLA_STRING },
+	[VDPA_ATTR_DEV_ID] = { .type = NLA_U32 },
 };
 
 static const struct genl_ops vdpa_nl_ops[] = {
@@ -380,6 +497,18 @@ static const struct genl_ops vdpa_nl_ops[] = {
 		.doit = vdpa_nl_cmd_parentdev_get_doit,
 		.dumpit = vdpa_nl_cmd_parentdev_get_dumpit,
 	},
+	{
+		.cmd = VDPA_CMD_DEV_NEW,
+		.validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP,
+		.doit = vdpa_nl_cmd_dev_add_set_doit,
+		.flags = GENL_ADMIN_PERM,
+	},
+	{
+		.cmd = VDPA_CMD_DEV_DEL,
+		.validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP,
+		.doit = vdpa_nl_cmd_dev_del_set_doit,
+		.flags = GENL_ADMIN_PERM,
+	},
 };
 
 static struct genl_family vdpa_nl_family __ro_after_init = {
diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h
index 3d6bc1fb909d..cb5a3d847af3 100644
--- a/include/linux/vdpa.h
+++ b/include/linux/vdpa.h
@@ -45,6 +45,8 @@ struct vdpa_parent_dev;
  * @index: device index
  * @features_valid: were features initialized? for legacy guests
  * @nvqs: maximum number of supported virtqueues
+ * @pdev: parent device pointer; caller must setup when registering device as part
+ *	  of dev_add() parentdev ops callback before invoking _vdpa_register_device().
  */
 struct vdpa_device {
 	struct device dev;
@@ -53,6 +55,7 @@ struct vdpa_device {
 	unsigned int index;
 	bool features_valid;
 	int nvqs;
+	struct vdpa_parent_dev *pdev;
 };
 
 /**
@@ -260,6 +263,9 @@ struct vdpa_device *__vdpa_alloc_device(struct device *parent,
 int vdpa_register_device(struct vdpa_device *vdev);
 void vdpa_unregister_device(struct vdpa_device *vdev);
 
+int _vdpa_register_device(struct vdpa_device *vdev);
+void _vdpa_unregister_device(struct vdpa_device *vdev);
+
 /**
  * vdpa_driver - operations for a vDPA driver
  * @driver: underlying device driver
diff --git a/include/uapi/linux/vdpa.h b/include/uapi/linux/vdpa.h
index 6d88022f6a95..c528a9cfd6c9 100644
--- a/include/uapi/linux/vdpa.h
+++ b/include/uapi/linux/vdpa.h
@@ -15,6 +15,8 @@ enum vdpa_command {
 	VDPA_CMD_UNSPEC,
 	VDPA_CMD_PARENTDEV_NEW,
 	VDPA_CMD_PARENTDEV_GET,		/* can dump */
+	VDPA_CMD_DEV_NEW,
+	VDPA_CMD_DEV_DEL,
 };
 
 enum vdpa_attr {
@@ -25,6 +27,9 @@ enum vdpa_attr {
 	VDPA_ATTR_PARENTDEV_DEV_NAME,		/* string */
 	VDPA_ATTR_PARENTDEV_SUPPORTED_CLASSES,	/* u64 */
 
+	VDPA_ATTR_DEV_NAME,			/* string */
+	VDPA_ATTR_DEV_ID,			/* u32 */
+
 	/* new attributes must be added above here */
 	VDPA_ATTR_MAX,
 };
-- 
2.26.2
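
Editorial sketch (not part of the patch): the registration path above
rejects a second device with the same name by looking it up on the bus
before calling device_add(). The userspace model below illustrates only
that check. The toy_bus type, MAX_DEVS and the toy_* function names are
hypothetical stand-ins, the -ENOSPC case exists only because the toy table
is fixed-size, and the locking done with vdpa_dev_mutex in the kernel is
deliberately left out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define MAX_DEVS 8

/* Hypothetical stand-in for the vDPA bus: a fixed table of device names. */
struct toy_bus {
	const char *names[MAX_DEVS];
	size_t count;
};

/* Models the lookup done with bus_find_device() and vdpa_name_match(). */
static int toy_bus_has(const struct toy_bus *bus, const char *name)
{
	for (size_t i = 0; i < bus->count; i++)
		if (strcmp(bus->names[i], name) == 0)
			return 1;
	return 0;
}

/* Returns 0 on success, -EEXIST for a duplicate name, -ENOSPC when full. */
static int toy_register(struct toy_bus *bus, const char *name)
{
	assert(bus != NULL && name != NULL);

	if (toy_bus_has(bus, name))
		return -EEXIST;
	if (bus->count == MAX_DEVS)
		return -ENOSPC;	/* only a limit of this toy table */
	bus->names[bus->count++] = name;
	return 0;
}
```

Registering "foo2" twice on an empty toy_bus returns 0 the first time and
-EEXIST the second time, which is the behaviour a user of the netlink
command would observe for a duplicate device name.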



* [PATCH 6/7] vdpa: Enable user to query vdpa device info
  2020-11-12  6:39 [PATCH 0/7] Introduce vdpa management tool Parav Pandit
                   ` (4 preceding siblings ...)
  2020-11-12  6:40 ` [PATCH 5/7] vdpa: Enable a user to add and delete a vdpa device Parav Pandit
@ 2020-11-12  6:40 ` Parav Pandit
  2020-11-12  6:40 ` [PATCH 7/7] vdpa/vdpa_sim: Enable user to create vdpasim net devices Parav Pandit
                   ` (6 subsequent siblings)
  12 siblings, 0 replies; 79+ messages in thread
From: Parav Pandit @ 2020-11-12  6:40 UTC (permalink / raw)
  To: virtualization; +Cc: mst, jasowang, parav, elic, netdev

Enable user to query vdpa device information.

$ vdpa dev add parentdev vdpasim type net name foo2

Show the newly created vdpa device by its name:
$ vdpa dev show foo2
foo2: type network parentdev vdpasim vendor_id 0 max_vqs 2 max_vq_size 256

$ vdpa dev show foo2 -jp
{
    "dev": {
        "foo2": {
            "type": "network",
            "parentdev": "vdpasim",
            "vendor_id": 0,
            "max_vqs": 2,
            "max_vq_size": 256
        }
    }
}

Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Eli Cohen <elic@nvidia.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
---
 drivers/vdpa/vdpa.c       | 131 ++++++++++++++++++++++++++++++++++++++
 include/uapi/linux/vdpa.h |   4 ++
 2 files changed, 135 insertions(+)

diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c
index fcbdc8f10206..32bd48baffab 100644
--- a/drivers/vdpa/vdpa.c
+++ b/drivers/vdpa/vdpa.c
@@ -483,6 +483,131 @@ static int vdpa_nl_cmd_dev_del_set_doit(struct sk_buff *skb, struct genl_info *i
 	return err;
 }
 
+static int
+vdpa_dev_fill(struct vdpa_device *vdev, struct sk_buff *msg, u32 portid, u32 seq,
+	      int flags, struct netlink_ext_ack *extack)
+{
+	u16 max_vq_size;
+	u32 device_id;
+	u32 vendor_id;
+	void *hdr;
+	int err;
+
+	hdr = genlmsg_put(msg, portid, seq, &vdpa_nl_family, flags, VDPA_CMD_DEV_NEW);
+	if (!hdr)
+		return -EMSGSIZE;
+
+	err = vdpa_nl_parentdev_handle_fill(msg, vdev->pdev);
+	if (err)
+		goto msg_err;
+
+	device_id = vdev->config->get_device_id(vdev);
+	vendor_id = vdev->config->get_vendor_id(vdev);
+	max_vq_size = vdev->config->get_vq_num_max(vdev);
+
+	err = -EMSGSIZE;
+	if (nla_put_string(msg, VDPA_ATTR_DEV_NAME, dev_name(&vdev->dev)))
+		goto msg_err;
+	if (nla_put_u32(msg, VDPA_ATTR_DEV_ID, device_id))
+		goto msg_err;
+	if (nla_put_u32(msg, VDPA_ATTR_DEV_VENDOR_ID, vendor_id))
+		goto msg_err;
+	if (nla_put_u32(msg, VDPA_ATTR_DEV_MAX_VQS, vdev->nvqs))
+		goto msg_err;
+	if (nla_put_u16(msg, VDPA_ATTR_DEV_MAX_VQ_SIZE, max_vq_size))
+		goto msg_err;
+
+	genlmsg_end(msg, hdr);
+	return 0;
+
+msg_err:
+	genlmsg_cancel(msg, hdr);
+	return err;
+}
+
+static int vdpa_nl_cmd_dev_get_doit(struct sk_buff *skb, struct genl_info *info)
+{
+	struct vdpa_device *vdev;
+	struct sk_buff *msg;
+	const char *devname;
+	struct device *dev;
+	int err;
+
+	if (!info->attrs[VDPA_ATTR_DEV_NAME])
+		return -EINVAL;
+	devname = nla_data(info->attrs[VDPA_ATTR_DEV_NAME]);
+	msg = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL);
+	if (!msg)
+		return -ENOMEM;
+
+	mutex_lock(&vdpa_dev_mutex);
+	dev = bus_find_device(&vdpa_bus, NULL, devname, vdpa_name_match);
+	if (!dev) {
+		mutex_unlock(&vdpa_dev_mutex);
+		NL_SET_ERR_MSG_MOD(info->extack, "device not found");
+		return -ENODEV;
+	}
+	vdev = container_of(dev, struct vdpa_device, dev);
+	if (!vdev->pdev) {
+		mutex_unlock(&vdpa_dev_mutex);
+		put_device(dev);
+		return -EINVAL;
+	}
+	err = vdpa_dev_fill(vdev, msg, info->snd_portid, info->snd_seq, 0, info->extack);
+	if (!err)
+		err = genlmsg_reply(msg, info);
+	put_device(dev);
+	mutex_unlock(&vdpa_dev_mutex);
+
+	if (err)
+		nlmsg_free(msg);
+	return err;
+}
+
+struct vdpa_dev_dump_info {
+	struct sk_buff *msg;
+	struct netlink_callback *cb;
+	int start_idx;
+	int idx;
+};
+
+static int vdpa_dev_dump(struct device *dev, void *data)
+{
+	struct vdpa_device *vdev = container_of(dev, struct vdpa_device, dev);
+	struct vdpa_dev_dump_info *info = data;
+	int err;
+
+	if (!vdev->pdev)
+		return 0;
+	if (info->idx < info->start_idx) {
+		info->idx++;
+		return 0;
+	}
+	err = vdpa_dev_fill(vdev, info->msg, NETLINK_CB(info->cb->skb).portid,
+			    info->cb->nlh->nlmsg_seq, NLM_F_MULTI, info->cb->extack);
+	if (err)
+		return err;
+
+	info->idx++;
+	return 0;
+}
+
+static int vdpa_nl_cmd_dev_get_dumpit(struct sk_buff *msg, struct netlink_callback *cb)
+{
+	struct vdpa_dev_dump_info info;
+
+	info.msg = msg;
+	info.cb = cb;
+	info.start_idx = cb->args[0];
+	info.idx = 0;
+
+	mutex_lock(&vdpa_dev_mutex);
+	bus_for_each_dev(&vdpa_bus, NULL, &info, vdpa_dev_dump);
+	mutex_unlock(&vdpa_dev_mutex);
+	cb->args[0] = info.idx;
+	return msg->len;
+}
+
 static const struct nla_policy vdpa_nl_policy[VDPA_ATTR_MAX] = {
 	[VDPA_ATTR_PARENTDEV_BUS_NAME] = { .type = NLA_NUL_STRING },
 	[VDPA_ATTR_PARENTDEV_DEV_NAME] = { .type = NLA_STRING },
@@ -509,6 +634,12 @@ static const struct genl_ops vdpa_nl_ops[] = {
 		.doit = vdpa_nl_cmd_dev_del_set_doit,
 		.flags = GENL_ADMIN_PERM,
 	},
+	{
+		.cmd = VDPA_CMD_DEV_GET,
+		.validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP,
+		.doit = vdpa_nl_cmd_dev_get_doit,
+		.dumpit = vdpa_nl_cmd_dev_get_dumpit,
+	},
 };
 
 static struct genl_family vdpa_nl_family __ro_after_init = {
diff --git a/include/uapi/linux/vdpa.h b/include/uapi/linux/vdpa.h
index c528a9cfd6c9..bba8b83a94b5 100644
--- a/include/uapi/linux/vdpa.h
+++ b/include/uapi/linux/vdpa.h
@@ -17,6 +17,7 @@ enum vdpa_command {
 	VDPA_CMD_PARENTDEV_GET,		/* can dump */
 	VDPA_CMD_DEV_NEW,
 	VDPA_CMD_DEV_DEL,
+	VDPA_CMD_DEV_GET,		/* can dump */
 };
 
 enum vdpa_attr {
@@ -29,6 +30,9 @@ enum vdpa_attr {
 
 	VDPA_ATTR_DEV_NAME,			/* string */
 	VDPA_ATTR_DEV_ID,			/* u32 */
+	VDPA_ATTR_DEV_VENDOR_ID,		/* u32 */
+	VDPA_ATTR_DEV_MAX_VQS,			/* u32 */
+	VDPA_ATTR_DEV_MAX_VQ_SIZE,		/* u16 */
 
 	/* new attributes must be added above here */
 	VDPA_ATTR_MAX,
-- 
2.26.2
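
Editorial sketch (not part of the patch): the dump handler above keeps a
running index in cb->args[0] so that a dump spanning several netlink
messages can resume where the previous pass stopped. The userspace model
below shows only that cursor logic. The toy_dump type, its "room" field
(standing in for the space left in the skb) and the toy_* function names
are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cursor, modelled on struct vdpa_dev_dump_info. */
struct toy_dump {
	int start_idx;	/* first entry that still has to be emitted */
	int idx;	/* entries skipped or emitted so far in this pass */
	int room;	/* entries that still fit in the current message */
	int emitted;	/* entries written during this pass */
};

/* Per-device callback: skip entries already dumped, stop when out of room. */
static int toy_dump_one(struct toy_dump *d)
{
	if (d->idx < d->start_idx) {
		d->idx++;
		return 0;
	}
	if (d->room == 0)
		return -1;	/* stands in for -EMSGSIZE and aborts the walk */
	d->room--;
	d->emitted++;
	d->idx++;
	return 0;
}

/* One dump pass over ndevs devices; returns the cursor for the next pass. */
static int toy_dump_pass(int ndevs, int start_idx, int room, int *emitted)
{
	struct toy_dump d = { .start_idx = start_idx, .room = room };

	assert(emitted != NULL);
	for (int i = 0; i < ndevs; i++)
		if (toy_dump_one(&d))
			break;
	*emitted = d.emitted;
	return d.idx;
}
```

With five devices and room for two entries per message, successive passes
emit two, two and one entries, and a final pass emits nothing, which is how
the caller learns that the dump is complete.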



* [PATCH 7/7] vdpa/vdpa_sim: Enable user to create vdpasim net devices
  2020-11-12  6:39 [PATCH 0/7] Introduce vdpa management tool Parav Pandit
                   ` (5 preceding siblings ...)
  2020-11-12  6:40 ` [PATCH 6/7] vdpa: Enable user to query vdpa device info Parav Pandit
@ 2020-11-12  6:40 ` Parav Pandit
  2020-11-16  9:41 ` [PATCH 0/7] Introduce vdpa management tool Stefan Hajnoczi
                   ` (5 subsequent siblings)
  12 siblings, 0 replies; 79+ messages in thread
From: Parav Pandit @ 2020-11-12  6:40 UTC (permalink / raw)
  To: virtualization; +Cc: mst, jasowang, parav, elic, netdev

Enable user to create simulated vdpasim net devices.

Show the vdpa parent device that supports creating and deleting vdpa devices.

$ vdpa parentdev show
vdpasim:
  supported_classes
    net

$ vdpa parentdev show -jp
{
    "show": {
        "vdpasim": {
            "supported_classes": [ "net" ]
        }
    }
}

Create a vdpa device of type net named "foo2" from
the parent device vdpasim:

$ vdpa dev add parentdev vdpasim type net name foo2

Show the newly created vdpa device by its name:
$ vdpa dev show foo2
foo2: type network parentdev vdpasim vendor_id 0 max_vqs 2 max_vq_size 256

$ vdpa dev show foo2 -jp
{
    "dev": {
        "foo2": {
            "type": "network",
            "parentdev": "vdpasim",
            "vendor_id": 0,
            "max_vqs": 2,
            "max_vq_size": 256
        }
    }
}

Delete the vdpa device after its use:
$ vdpa dev del foo2

Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Eli Cohen <elic@nvidia.com>
Acked-by: Jason Wang <jasowang@redhat.com>
---
 drivers/vdpa/vdpa_sim/vdpa_sim.c | 81 +++++++++++++++++++++++++++-----
 1 file changed, 69 insertions(+), 12 deletions(-)

diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.c b/drivers/vdpa/vdpa_sim/vdpa_sim.c
index aed1bb7770ab..85776e4e6749 100644
--- a/drivers/vdpa/vdpa_sim/vdpa_sim.c
+++ b/drivers/vdpa/vdpa_sim/vdpa_sim.c
@@ -28,6 +28,7 @@
 #include <linux/vhost_iotlb.h>
 #include <uapi/linux/virtio_config.h>
 #include <uapi/linux/virtio_net.h>
+#include <uapi/linux/vdpa.h>
 
 #define DRV_VERSION  "0.1"
 #define DRV_AUTHOR   "Jason Wang <jasowang@redhat.com>"
@@ -42,6 +43,17 @@ static char *macaddr;
 module_param(macaddr, charp, 0);
 MODULE_PARM_DESC(macaddr, "Ethernet MAC address");
 
+static struct vdpa_parent_dev parent_dev;
+
+static void vdpasim_parent_release(struct device *dev)
+{
+}
+
+static struct device vdpasim_parent = {
+	.init_name = "vdpasim",
+	.release = vdpasim_parent_release,
+};
+
 struct vdpasim_virtqueue {
 	struct vringh vring;
 	struct vringh_kiov iov;
@@ -101,8 +113,6 @@ static inline __virtio16 cpu_to_vdpasim16(struct vdpasim *vdpasim, u16 val)
 	return __cpu_to_virtio16(vdpasim_is_little_endian(vdpasim), val);
 }
 
-static struct vdpasim *vdpasim_dev;
-
 static struct vdpasim *vdpa_to_sim(struct vdpa_device *vdpa)
 {
 	return container_of(vdpa, struct vdpasim, vdpa);
@@ -345,7 +355,7 @@ static const struct dma_map_ops vdpasim_dma_ops = {
 static const struct vdpa_config_ops vdpasim_net_config_ops;
 static const struct vdpa_config_ops vdpasim_net_batch_config_ops;
 
-static struct vdpasim *vdpasim_create(void)
+static struct vdpasim *vdpasim_create(const char *name)
 {
 	const struct vdpa_config_ops *ops;
 	struct vdpasim *vdpasim;
@@ -357,7 +367,7 @@ static struct vdpasim *vdpasim_create(void)
 	else
 		ops = &vdpasim_net_config_ops;
 
-	vdpasim = vdpa_alloc_device(struct vdpasim, vdpa, NULL, ops, VDPASIM_VQ_NUM, NULL);
+	vdpasim = vdpa_alloc_device(struct vdpasim, vdpa, NULL, ops, VDPASIM_VQ_NUM, name);
 	if (!vdpasim)
 		goto err_alloc;
 
@@ -393,7 +403,8 @@ static struct vdpasim *vdpasim_create(void)
 	vringh_set_iotlb(&vdpasim->vqs[1].vring, vdpasim->iommu);
 
 	vdpasim->vdpa.dma_dev = dev;
-	ret = vdpa_register_device(&vdpasim->vdpa);
+	vdpasim->vdpa.pdev = &parent_dev;
+	ret = _vdpa_register_device(&vdpasim->vdpa);
 	if (ret)
 		goto err_iommu;
 
@@ -714,21 +725,67 @@ static const struct vdpa_config_ops vdpasim_net_batch_config_ops = {
 	.free                   = vdpasim_free,
 };
 
+static struct vdpa_device *
+vdpa_dev_add(struct vdpa_parent_dev *pdev, const char *name, u32 device_id)
+{
+	struct vdpasim *simdev;
+
+	if (device_id != VIRTIO_ID_NET)
+		return ERR_PTR(-EOPNOTSUPP);
+
+	simdev = vdpasim_create(name);
+	if (IS_ERR(simdev))
+		return (struct vdpa_device *)simdev;
+
+	return &simdev->vdpa;
+}
+
+static void vdpa_dev_del(struct vdpa_parent_dev *pdev, struct vdpa_device *dev)
+{
+	struct vdpasim *simdev = container_of(dev, struct vdpasim, vdpa);
+
+	_vdpa_unregister_device(&simdev->vdpa);
+}
+
+static const struct vdpa_dev_ops vdpa_dev_parent_ops = {
+	.dev_add = vdpa_dev_add,
+	.dev_del = vdpa_dev_del
+};
+
+static struct virtio_device_id id_table[] = {
+	{ VIRTIO_ID_NET, VIRTIO_DEV_ANY_ID },
+	{ 0 },
+};
+
+static struct vdpa_parent_dev parent_dev = {
+	.device = &vdpasim_parent,
+	.id_table = id_table,
+	.ops = &vdpa_dev_parent_ops,
+};
+
 static int __init vdpasim_dev_init(void)
 {
-	vdpasim_dev = vdpasim_create();
+	int ret;
 
-	if (!IS_ERR(vdpasim_dev))
-		return 0;
+	ret = device_register(&vdpasim_parent);
+	if (ret)
+		return ret;
+
+	ret = vdpa_parentdev_register(&parent_dev);
+	if (ret)
+		goto parent_err;
 
-	return PTR_ERR(vdpasim_dev);
+	return 0;
+
+parent_err:
+	device_unregister(&vdpasim_parent);
+	return ret;
 }
 
 static void __exit vdpasim_dev_exit(void)
 {
-	struct vdpa_device *vdpa = &vdpasim_dev->vdpa;
-
-	vdpa_unregister_device(vdpa);
+	vdpa_parentdev_unregister(&parent_dev);
+	device_unregister(&vdpasim_parent);
 }
 
 module_init(vdpasim_dev_init)
-- 
2.26.2
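
Editorial sketch (not part of the patch): the dev_add() callback above
returns either a valid device pointer or an errno encoded in the pointer,
and the caller has to convert that pointer back into an error code. The
userspace model below illustrates the convention. The toy_* helpers are
simplified re-implementations written for this sketch, not the kernel's
ERR_PTR(), IS_ERR() and PTR_ERR() themselves, and toy_dev is a hypothetical
type.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define TOY_MAX_ERRNO		4095
#define TOY_VIRTIO_ID_NET	1	/* same value as VIRTIO_ID_NET */

/* Simplified userspace copies of the kernel's error-pointer helpers. */
static inline void *toy_err_ptr(long err) { return (void *)err; }
static inline long toy_ptr_err(const void *p) { return (long)p; }
static inline int toy_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-TOY_MAX_ERRNO;
}

struct toy_dev {
	const char *name;
};

/* Hypothetical parent callback, shaped like the dev_add() op. */
static struct toy_dev *toy_dev_add(const char *name, uint32_t device_id)
{
	struct toy_dev *dev;

	if (device_id != TOY_VIRTIO_ID_NET)
		return toy_err_ptr(-EOPNOTSUPP);
	dev = malloc(sizeof(*dev));
	if (!dev)
		return toy_err_ptr(-ENOMEM);
	dev->name = name;
	return dev;
}

/* Caller side: turn the encoded pointer back into an errno for the user. */
static int toy_cmd_dev_add(const char *name, uint32_t device_id)
{
	struct toy_dev *dev;

	assert(name != NULL);
	dev = toy_dev_add(name, device_id);
	if (toy_is_err(dev))
		return (int)toy_ptr_err(dev);
	free(dev);	/* the sketch does not keep the device around */
	return 0;
}
```

Asking for a network device succeeds, while any other device id yields
-EOPNOTSUPP, matching the class check in the simulator callback.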



* Re: [PATCH 0/7] Introduce vdpa management tool
  2020-11-12  6:39 [PATCH 0/7] Introduce vdpa management tool Parav Pandit
                   ` (6 preceding siblings ...)
  2020-11-12  6:40 ` [PATCH 7/7] vdpa/vdpa_sim: Enable user to create vdpasim net devices Parav Pandit
@ 2020-11-16  9:41 ` Stefan Hajnoczi
  2020-11-17 19:41   ` Parav Pandit
  2020-11-16 22:23 ` Jakub Kicinski
                   ` (4 subsequent siblings)
  12 siblings, 1 reply; 79+ messages in thread
From: Stefan Hajnoczi @ 2020-11-16  9:41 UTC (permalink / raw)
  To: Parav Pandit; +Cc: Linux Virtualization, netdev, elic, Michael S. Tsirkin

Great! A few questions and comments:

How are configuration parameters passed in during device creation
(e.g. MAC address, number of queues)?

Can configuration parameters be changed at runtime (e.g. link up/down)?

Does the configuration parameter interface distinguish between
standard and vendor-specific parameters? Are they namespaced to
prevent naming collisions?

How are software-only parent drivers supported? It's kind of a shame
to modprobe unconditionally if they won't be used. Does vdpatool have
some way of requesting loading a parent driver? That way software
drivers can be loaded on demand.

What is the benefit of making it part of iproute2? If there is not a
significant advantage like sharing code, then I suggest using a
separate repository and package so vdpatool can be installed
separately (e.g. even on AF_VSOCK-only guests without Ethernet).

Stefan


* Re: [PATCH 0/7] Introduce vdpa management tool
  2020-11-12  6:39 [PATCH 0/7] Introduce vdpa management tool Parav Pandit
                   ` (7 preceding siblings ...)
  2020-11-16  9:41 ` [PATCH 0/7] Introduce vdpa management tool Stefan Hajnoczi
@ 2020-11-16 22:23 ` Jakub Kicinski
  2020-11-17 19:51   ` Parav Pandit
  2020-11-27  3:53 ` Jason Wang
                   ` (3 subsequent siblings)
  12 siblings, 1 reply; 79+ messages in thread
From: Jakub Kicinski @ 2020-11-16 22:23 UTC (permalink / raw)
  To: Parav Pandit; +Cc: virtualization, mst, jasowang, elic, netdev

On Thu, 12 Nov 2020 08:39:58 +0200 Parav Pandit wrote:
> FAQs:
> -----
> 1. Where does userspace vdpa tool reside which users can use?
> Ans: vdpa tool can possibly reside in iproute2 [1] as it enables user to
> create vdpa net devices.
> 
> 2. Why not create and delete vdpa device using sysfs/configfs?
> Ans:

> 3. Why not use ioctl() interface?

Obviously I'm gonna ask you - why can't you use devlink?

> Next steps:
> -----------
> (a) Post this patchset and iproute2/vdpa inclusion, remaining two drivers
> will be coverted to support vdpa tool instead of creating unmanaged default
> device on driver load.
> (b) More net specific parameters such as mac, mtu will be added.

How does MAC and MTU belong in this new VDPA thing?


* RE: [PATCH 0/7] Introduce vdpa management tool
  2020-11-16  9:41 ` [PATCH 0/7] Introduce vdpa management tool Stefan Hajnoczi
@ 2020-11-17 19:41   ` Parav Pandit
  0 siblings, 0 replies; 79+ messages in thread
From: Parav Pandit @ 2020-11-17 19:41 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Linux Virtualization, netdev, Eli Cohen, Michael S. Tsirkin



> From: Stefan Hajnoczi <stefanha@gmail.com>
> Sent: Monday, November 16, 2020 3:11 PM
> Great! A few questions and comments:
> 
> How are configuration parameters passed in during device creation (e.g.
> MAC address, number of queues)?
More parameters will be added at device creation time.
> 
> Can configuration parameters be changed at runtime (e.g. link up/down)?
> 
For eswitch based devices with a representor, it is usually controlled through the representor.
For others, I haven't thought about it. If the device supports it, I believe so.
If multiple vdpa devices are created over a single VF/PF/SF, virtualizing the link up/down state (not just changing the vdpa config bits) can be a challenge.

> Does the configuration parameter interface distinguish between standard
> and vendor-specific parameters? Are they namespaced to prevent naming
> collisions?
Do you have an example of vendor specific parameters?
Since this tool exposes virtio compliant vdpa devices, I didn't consider any vendor specific params.

> 
> How are software-only parent drivers supported? It's kind of a shame to
> modprobe unconditionally if they won't be used. Does vdpatool have some
> way of requesting loading a parent driver? That way software drivers can be
> loaded on demand.
Well, since each parent or management device registers for it, and their type is the same, there isn't a way right now to auto load the module.
This would require the user to learn which vendor device driver has to be loaded, which kind of defeats the purpose.

> 
> What is the benefit of making it part of iproute2? If there is not a significant
> advantage like sharing code, then I suggest using a separate repository and
> package so vdpatool can be installed separately (e.g. even on AF_VSOCK-
> only guests without Ethernet).
Given that the vdpa tool is intended to create network specific devices, iproute2 seems a better fit than its own repository.
It mainly uses libmnl.

> 
> Stefan


* RE: [PATCH 0/7] Introduce vdpa management tool
  2020-11-16 22:23 ` Jakub Kicinski
@ 2020-11-17 19:51   ` Parav Pandit
  2020-12-16  9:13     ` Michael S. Tsirkin
  0 siblings, 1 reply; 79+ messages in thread
From: Parav Pandit @ 2020-11-17 19:51 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: virtualization, mst, jasowang, Eli Cohen, netdev



> From: Jakub Kicinski <kuba@kernel.org>
> Sent: Tuesday, November 17, 2020 3:53 AM
> 
> On Thu, 12 Nov 2020 08:39:58 +0200 Parav Pandit wrote:
> > FAQs:
> > -----
> > 1. Where does userspace vdpa tool reside which users can use?
> > Ans: vdpa tool can possibly reside in iproute2 [1] as it enables user
> > to create vdpa net devices.
> >
> > 2. Why not create and delete vdpa device using sysfs/configfs?
> > Ans:
> 
> > 3. Why not use ioctl() interface?
> 
> Obviously I'm gonna ask you - why can't you use devlink?
> 
This was considered.
However, extending devlink for vdpa specific stats, devices and config seems to overload devlink beyond its defined scope.

> > Next steps:
> > -----------
> > (a) Post this patchset and iproute2/vdpa inclusion, remaining two
> > drivers will be coverted to support vdpa tool instead of creating
> > unmanaged default device on driver load.
> > (b) More net specific parameters such as mac, mtu will be added.
> 
> How does MAC and MTU belong in this new VDPA thing?
A per vdpa MAC only makes sense when the user wants to run a VF/SF netdev and vdpa together with different MAC addresses.
Otherwise the existing, well defined devlink API that assigns one MAC per function is fine.
The same applies to MTU: if the vdpa queues and the VF/SF netdev queues need different MTUs, it makes sense to configure the MTU per vdpa device.


* Re: [PATCH 0/7] Introduce vdpa management tool
  2020-11-12  6:39 [PATCH 0/7] Introduce vdpa management tool Parav Pandit
                   ` (8 preceding siblings ...)
  2020-11-16 22:23 ` Jakub Kicinski
@ 2020-11-27  3:53 ` Jason Wang
       [not found]   ` <CACycT3sYScObb9nN3g7L3cesjE7sCZWxZ5_5R1usGU9ePZEeqA@mail.gmail.com>
  2020-12-08 22:47   ` David Ahern
  2020-12-16  9:16 ` Michael S. Tsirkin
                   ` (2 subsequent siblings)
  12 siblings, 2 replies; 79+ messages in thread
From: Jason Wang @ 2020-11-27  3:53 UTC (permalink / raw)
  To: Parav Pandit, virtualization; +Cc: mst, elic, netdev, 谢永吉


On 2020/11/12 下午2:39, Parav Pandit wrote:
> This patchset covers user requirements for managing existing vdpa devices,
> using a tool and its internal design notes for kernel drivers.
>
> Background and user requirements:
> ----------------------------------
> (1) Currently VDPA device is created by driver when driver is loaded.
> However, user should have a choice when to create or not create a vdpa device
> for the underlying parent device.
>
> For example, mlx5 PCI VF and subfunction device supports multiple classes of
> device such netdev, vdpa, rdma. Howevever it is not required to always created
> vdpa device for such device.
>
> (2) In another use case, a device may support creating one or multiple vdpa
> device of same or different class such as net and block.
> Creating vdpa devices at driver load time further limits this use case.
>
> (3) A user should be able to monitor and query vdpa queue level or device level
> statistics for a given vdpa device.
>
> (4) A user should be able to query what class of vdpa devices are supported
> by its parent device.
>
> (5) A user should be able to view supported features and negotiated features
> of the vdpa device.
>
> (6) A user should be able to create a vdpa device in vendor agnostic manner
> using single tool.
>
> Hence, it is required to have a tool through which user can create one or more
> vdpa devices from a parent device which addresses above user requirements.
>
> Example devices:
> ----------------
>   +-----------+ +-----------+ +---------+ +--------+ +-----------+
>   |vdpa dev 0 | |vdpa dev 1 | |rdma dev | |netdev  | |vdpa dev 3 |
>   |type=net   | |type=block | |mlx5_0   | |ens3f0  | |type=net   |
>   +----+------+ +-----+-----+ +----+----+ +-----+--+ +----+------+
>        |              |            |            |         |
>        |              |            |            |         |
>   +----+-----+        |       +----+----+       |    +----+----+
>   |  mlx5    +--------+       |mlx5     +-------+    |mlx5     |
>   |pci vf 2  |                |pci vf 4 |            |pci sf 8 |
>   |03:00:2   |                |03:00.4  |            |mlx5_sf.8|
>   +----+-----+                +----+----+            +----+----+
>        |                           |                      |
>        |                      +----+-----+                |
>        +----------------------+mlx5      +----------------+
>                               |pci pf 0  |
>                               |03:00.0   |
>                               +----------+
>
> vdpa tool:
> ----------
> vdpa tool is a tool to create, delete vdpa devices from a parent device. It is a
> tool that enables user to query statistics, features and may be more attributes
> in future.
>
> vdpa tool command draft:
> ------------------------
> (a) List parent devices which supports creating vdpa devices.
> It also shows which class types supported by this parent device.
> In below command example two parent devices support vdpa device creation.
> First is PCI VF whose bdf is 03.00:2.
> Second is PCI VF whose name is 03:00.4.
> Third is PCI SF whose name is mlx5_core.sf.8
>
> $ vdpa parentdev list
> vdpasim
>    supported_classes
>      net
> pci/0000:03.00:3
>    supported_classes
>      net block
> pci/0000:03.00:4
>    supported_classes
>      net block
> auxiliary/mlx5_core.sf.8
>    supported_classes
>      net
>
> (b) Now add a vdpa device of networking class and show the device.
> $ vdpa dev add parentdev pci/0000:03.00:2 type net name foo0
> $ vdpa dev show foo0
> foo0: parentdev pci/0000:03.00:2 type network parentdev vdpasim vendor_id 0 max_vqs 2 max_vq_size 256
>
> (c) Show features of a vdpa device
> $ vdpa dev features show foo0
> supported
>    iommu platform
>    version 1
>
> (d) Dump vdpa device statistics
> $ vdpa dev stats show foo0
> kickdoorbells 10
> wqes 100
>
> (e) Now delete a vdpa device previously created.
> $ vdpa dev del foo0
>
> vdpa tool support in this patchset:
> -----------------------------------
> vdpa tool is created to create, delete and query vdpa devices.
> examples:
> Show vdpa parent device that supports creating, deleting vdpa devices.
>
> $ vdpa parentdev show
> vdpasim:
>    supported_classes
>      net
>
> $ vdpa parentdev show -jp
> {
>      "show": {
>         "vdpasim": {
>            "supported_classes": {
>               "net"
>          }
>      }
> }
>
> Create a vdpa device of type networking named as "foo2" from the parent device vdpasim:
>
> $ vdpa dev add parentdev vdpasim type net name foo2
>
> Show the newly created vdpa device by its name:
> $ vdpa dev show foo2
> foo2: type network parentdev vdpasim vendor_id 0 max_vqs 2 max_vq_size 256
>
> $ vdpa dev show foo2 -jp
> {
>      "dev": {
>          "foo2": {
>              "type": "network",
>              "parentdev": "vdpasim",
>              "vendor_id": 0,
>              "max_vqs": 2,
>              "max_vq_size": 256
>          }
>      }
> }
>
> Delete the vdpa device after its use:
> $ vdpa dev del foo2
>
> vdpa tool support by kernel:
> ----------------------------
> vdpa tool user interface will be supported by existing vdpa kernel framework,
> i.e. drivers/vdpa/vdpa.c It services user command through a netlink interface.
>
> Each parent device registers supported callback operations with vdpa subsystem
> through which vdpa device(s) can be managed.
>
> FAQs:
> -----
> 1. Where does userspace vdpa tool reside which users can use?
> Ans: vdpa tool can possibly reside in iproute2 [1] as it enables user to
> create vdpa net devices.
>
> 2. Why not create and delete vdpa device using sysfs/configfs?
> Ans:
> (a) A device creation may involve passing one or more attributes.
> Passing multiple attributes and returning error code and more verbose
> information for invalid attributes cannot be handled by sysfs/configfs.
>
> (b) netlink framework is rich that enables user space and kernel driver to
> provide nested attributes.
>
> (c) Exposing device specific file under sysfs without net namespace
> awareness exposes details to multiple containers. Instead exposing
> attributes via a netlink socket secures the communication channel with kernel.
>
> (d) netlink socket interface enables to run syscaller kernel tests.
>
> 3. Why not use ioctl() interface?
> Ans: ioctl() interface replicates the necessary plumbing which already
> exists through netlink socket.
>
> 4. What happens when one or more user created vdpa devices exist for a
> parent PCI VF or SF and such parent device is removed?
> Ans: All user created vdpa devices are removed that belong to a parent.
>
> [1] git://git.kernel.org/pub/scm/network/iproute2/iproute2-next.git
>
> Next steps:
> -----------
> (a) Post this patchset and iproute2/vdpa inclusion, remaining two drivers
> will be coverted to support vdpa tool instead of creating unmanaged default
> device on driver load.
> (b) More net specific parameters such as mac, mtu will be added.
> (c) Features bits get and set interface will be added.


Adding Yongji to share some thoughts from the perspective of a userspace
vDPA device.

Thanks




* Re: [External] Re: [PATCH 0/7] Introduce vdpa management tool
       [not found]   ` <CACycT3sYScObb9nN3g7L3cesjE7sCZWxZ5_5R1usGU9ePZEeqA@mail.gmail.com>
@ 2020-11-30  3:36     ` Jason Wang
  2020-11-30  7:07       ` Yongji Xie
  0 siblings, 1 reply; 79+ messages in thread
From: Jason Wang @ 2020-11-30  3:36 UTC (permalink / raw)
  To: Yongji Xie, Parav Pandit; +Cc: virtualization, Michael S. Tsirkin, elic, netdev


On 2020/11/27 1:52 PM, Yongji Xie wrote:
> On Fri, Nov 27, 2020 at 11:53 AM Jason Wang <jasowang@redhat.com> wrote:
>
>     Adding Yong Ji for sharing some thoughts from the view of
>     userspace vDPA
>     device.
>
>
> Thanks for adding me, Jason!
>
> Now I'm working on a v2 patchset for VDUSE (vDPA Device in Userspace) 
> [1]. This tool is very useful for the vduse device. So I'm considering 
> integrating this into my v2 patchset. But there is one problem:
>
> In this tool, vdpa device config action and enable action are combined 
> into one netlink msg: VDPA_CMD_DEV_NEW. But in vduse case, it needs to 
> be splitted because a chardev should be created and opened by a 
> userspace process before we enable the vdpa device (call 
> vdpa_register_device()).
>
> So I'd like to know whether it's possible (or have some plans) to add 
> two new netlink msgs something like: VDPA_CMD_DEV_ENABLE and 
> VDPA_CMD_DEV_DISABLE to make the config path more flexible.
>

Actually, we discussed such an intermediate step in an earlier
discussion. It looks to me like VDUSE could be one of its users.

Or I wonder whether we could switch to an anonymous inode (fd) for VDUSE
and then fetch it via a VDUSE_GET_DEVICE_FD ioctl?

Thanks


> Thanks,
> Yongji
>
> [1] https://www.spinics.net/lists/linux-mm/msg231576.html



* Re: [External] Re: [PATCH 0/7] Introduce vdpa management tool
  2020-11-30  3:36     ` [External] " Jason Wang
@ 2020-11-30  7:07       ` Yongji Xie
  2020-12-01  6:25         ` Jason Wang
  0 siblings, 1 reply; 79+ messages in thread
From: Yongji Xie @ 2020-11-30  7:07 UTC (permalink / raw)
  To: Jason Wang; +Cc: Parav Pandit, virtualization, Michael S. Tsirkin, elic, netdev

On Mon, Nov 30, 2020 at 11:36 AM Jason Wang <jasowang@redhat.com> wrote:
>
>
> On 2020/11/27 1:52 PM, Yongji Xie wrote:
> > On Fri, Nov 27, 2020 at 11:53 AM Jason Wang <jasowang@redhat.com> wrote:
> >
> >     Adding Yong Ji for sharing some thoughts from the view of
> >     userspace vDPA
> >     device.
> >
> >
> > Thanks for adding me, Jason!
> >
> > Now I'm working on a v2 patchset for VDUSE (vDPA Device in Userspace)
> > [1]. This tool is very useful for the vduse device. So I'm considering
> > integrating this into my v2 patchset. But there is one problem:
> >
> > In this tool, vdpa device config action and enable action are combined
> > into one netlink msg: VDPA_CMD_DEV_NEW. But in vduse case, it needs to
> > be splitted because a chardev should be created and opened by a
> > userspace process before we enable the vdpa device (call
> > vdpa_register_device()).
> >
> > So I'd like to know whether it's possible (or have some plans) to add
> > two new netlink msgs something like: VDPA_CMD_DEV_ENABLE and
> > VDPA_CMD_DEV_DISABLE to make the config path more flexible.
> >
>
> Actually, we've discussed such intermediate step in some early
> discussion. It looks to me VDUSE could be one of the users of this.
>
> Or I wonder whether we can switch to use anonymous inode(fd) for VDUSE
> then fetching it via an VDUSE_GET_DEVICE_FD ioctl?
>

Yes, we can. Actually, the current implementation in VDUSE works like
this. But it seems this is still an intermediate step. The fd should
be bound to a name, or to something else that needs to be configured
beforehand.

Thanks,
Yongji


* Re: [External] Re: [PATCH 0/7] Introduce vdpa management tool
  2020-11-30  7:07       ` Yongji Xie
@ 2020-12-01  6:25         ` Jason Wang
  2020-12-01  9:55           ` Yongji Xie
  0 siblings, 1 reply; 79+ messages in thread
From: Jason Wang @ 2020-12-01  6:25 UTC (permalink / raw)
  To: Yongji Xie; +Cc: Parav Pandit, virtualization, Michael S. Tsirkin, elic, netdev


On 2020/11/30 3:07 PM, Yongji Xie wrote:
>>> Thanks for adding me, Jason!
>>>
>>> Now I'm working on a v2 patchset for VDUSE (vDPA Device in Userspace)
>>> [1]. This tool is very useful for the vduse device. So I'm considering
>>> integrating this into my v2 patchset. But there is one problem:
>>>
>>> In this tool, vdpa device config action and enable action are combined
>>> into one netlink msg: VDPA_CMD_DEV_NEW. But in vduse case, it needs to
>>> be splitted because a chardev should be created and opened by a
>>> userspace process before we enable the vdpa device (call
>>> vdpa_register_device()).
>>>
>>> So I'd like to know whether it's possible (or have some plans) to add
>>> two new netlink msgs something like: VDPA_CMD_DEV_ENABLE and
>>> VDPA_CMD_DEV_DISABLE to make the config path more flexible.
>>>
>> Actually, we've discussed such intermediate step in some early
>> discussion. It looks to me VDUSE could be one of the users of this.
>>
>> Or I wonder whether we can switch to use anonymous inode(fd) for VDUSE
>> then fetching it via an VDUSE_GET_DEVICE_FD ioctl?
>>
> Yes, we can. Actually the current implementation in VDUSE is like
> this.  But seems like this is still a intermediate step. The fd should
> be binded to a name or something else which need to be configured
> before.


The name could be specified via netlink. It looks to me like the real
issue is that the device can't be used until it is connected to a
userspace process. So we also need to fail the enable step if the device
has not been opened.
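
As a rough illustration of that check, here is a minimal sketch written as a toy Python model rather than kernel code. The class, its methods and the choice of ENODEV are all assumptions made for the example; nothing here is taken from the patchset.

```python
import errno


class UserspaceBackedDevice:
    """Toy model of a vdpa device that is served by a userspace process.

    All names here are hypothetical; this is not the kernel implementation.
    """

    def __init__(self, name):
        self.name = name      # in the real design this would be a netlink attribute
        self.opened = False   # set once a userspace process has opened the device
        self.enabled = False  # set once the device is usable on the vdpa bus

    def open_from_userspace(self):
        self.opened = True

    def enable(self):
        # Fail the enable step if no userspace process is attached yet.
        if not self.opened:
            raise OSError(errno.ENODEV, f"{self.name}: not opened by userspace")
        self.enabled = True


dev = UserspaceBackedDevice("foo2")

try:
    dev.enable()              # too early: nothing has opened the device
except OSError as err:
    early_errno = err.errno   # the error code chosen for this sketch

dev.open_from_userspace()
dev.enable()                  # now succeeds
```

The point of the sketch is only the ordering: enabling is rejected until a userspace process is attached, so a device that nothing can serve is never exposed.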

Thanks


>
> Thanks,
> Yongji
>



* Re: [External] Re: [PATCH 0/7] Introduce vdpa management tool
  2020-12-01  6:25         ` Jason Wang
@ 2020-12-01  9:55           ` Yongji Xie
  2020-12-01 11:32             ` Parav Pandit
  2020-12-02  5:48             ` Jason Wang
  0 siblings, 2 replies; 79+ messages in thread
From: Yongji Xie @ 2020-12-01  9:55 UTC (permalink / raw)
  To: Jason Wang; +Cc: Parav Pandit, virtualization, Michael S. Tsirkin, elic, netdev

On Tue, Dec 1, 2020 at 2:25 PM Jason Wang <jasowang@redhat.com> wrote:
>
>
> On 2020/11/30 3:07 PM, Yongji Xie wrote:
> >>> Thanks for adding me, Jason!
> >>>
> >>> Now I'm working on a v2 patchset for VDUSE (vDPA Device in Userspace)
> >>> [1]. This tool is very useful for the vduse device. So I'm considering
> >>> integrating this into my v2 patchset. But there is one problem:
> >>>
> >>> In this tool, vdpa device config action and enable action are combined
> >>> into one netlink msg: VDPA_CMD_DEV_NEW. But in vduse case, it needs to
> >>> be splitted because a chardev should be created and opened by a
> >>> userspace process before we enable the vdpa device (call
> >>> vdpa_register_device()).
> >>>
> >>> So I'd like to know whether it's possible (or have some plans) to add
> >>> two new netlink msgs something like: VDPA_CMD_DEV_ENABLE and
> >>> VDPA_CMD_DEV_DISABLE to make the config path more flexible.
> >>>
> >> Actually, we've discussed such intermediate step in some early
> >> discussion. It looks to me VDUSE could be one of the users of this.
> >>
> >> Or I wonder whether we can switch to use anonymous inode(fd) for VDUSE
> >> then fetching it via an VDUSE_GET_DEVICE_FD ioctl?
> >>
> > Yes, we can. Actually the current implementation in VDUSE is like
> > this.  But seems like this is still a intermediate step. The fd should
> > be binded to a name or something else which need to be configured
> > before.
>
>
> The name could be specified via the netlink. It looks to me the real
> issue is that until the device is connected with a userspace, it can't
> be used. So we also need to fail the enabling if it doesn't opened.
>

Yes, that's true. So you mean we can first fetch the fd bound to a
name/vduse_id via a VDUSE_GET_DEVICE_FD ioctl, and then use the
name/vduse_id as an attribute to create the vdpa device? That looks
fine to me.
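
To spell out the two steps, a small sketch follows. It is a toy Python model with hypothetical names: `get_device_fd` stands in for a VDUSE_GET_DEVICE_FD ioctl and `vdpa_dev_add` for the management tool's add command. How the name would actually be passed as a netlink attribute is an assumption, not something settled in this thread.

```python
class VduseModel:
    """Toy model of the two-step flow; every name below is hypothetical."""

    def __init__(self):
        self._fd_by_name = {}   # VDUSE name -> fake fd held by userspace
        self._next_fd = 3       # arbitrary starting value for fake fds
        self.vdpa_devices = {}  # vdpa device name -> attributes

    def get_device_fd(self, vduse_name):
        """Stand-in for a VDUSE_GET_DEVICE_FD ioctl: bind an fd to a name."""
        if vduse_name in self._fd_by_name:
            raise FileExistsError(vduse_name)
        fd = self._next_fd
        self._next_fd += 1
        self._fd_by_name[vduse_name] = fd
        return fd

    def vdpa_dev_add(self, dev_name, vduse_name):
        """Stand-in for `vdpa dev add`, with the VDUSE name as an attribute."""
        if vduse_name not in self._fd_by_name:
            raise LookupError(f"no userspace backend named {vduse_name!r}")
        self.vdpa_devices[dev_name] = {"type": "net", "vduse_name": vduse_name}


model = VduseModel()

# Step 1: the userspace process obtains an fd bound to a name.
fd = model.get_device_fd("vduse0")

# Step 2: the management tool creates the vdpa device, naming that backend.
model.vdpa_dev_add("foo2", vduse_name="vduse0")

# Creating a device for a name nobody has opened is rejected.
try:
    model.vdpa_dev_add("foo3", vduse_name="missing")
    rejected = False
except LookupError:
    rejected = True
```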

Thanks,
Yongji


* RE: [External] Re: [PATCH 0/7] Introduce vdpa management tool
  2020-12-01  9:55           ` Yongji Xie
@ 2020-12-01 11:32             ` Parav Pandit
  2020-12-01 14:18               ` Yongji Xie
  2020-12-02  5:48             ` Jason Wang
  1 sibling, 1 reply; 79+ messages in thread
From: Parav Pandit @ 2020-12-01 11:32 UTC (permalink / raw)
  To: Yongji Xie, Jason Wang
  Cc: virtualization, Michael S. Tsirkin, Eli Cohen, netdev



> From: Yongji Xie <xieyongji@bytedance.com>
> Sent: Tuesday, December 1, 2020 3:26 PM
> 
> On Tue, Dec 1, 2020 at 2:25 PM Jason Wang <jasowang@redhat.com> wrote:
> >
> >
> > On 2020/11/30 3:07 PM, Yongji Xie wrote:
> > >>> Thanks for adding me, Jason!
> > >>>
> > >>> Now I'm working on a v2 patchset for VDUSE (vDPA Device in
> > >>> Userspace) [1]. This tool is very useful for the vduse device. So
> > >>> I'm considering integrating this into my v2 patchset. But there is
> > >>> one problem:
> > >>>
> > >>> In this tool, vdpa device config action and enable action are
> > >>> combined into one netlink msg: VDPA_CMD_DEV_NEW. But in vduse
> > >>> case, it needs to be splitted because a chardev should be created
> > >>> and opened by a userspace process before we enable the vdpa device
> > >>> (call vdpa_register_device()).
> > >>>
> > >>> So I'd like to know whether it's possible (or have some plans) to
> > >>> add two new netlink msgs something like: VDPA_CMD_DEV_ENABLE and
> > >>> VDPA_CMD_DEV_DISABLE to make the config path more flexible.
> > >>>
> > >> Actually, we've discussed such intermediate step in some early
> > >> discussion. It looks to me VDUSE could be one of the users of this.
> > >>
> > >> Or I wonder whether we can switch to use anonymous inode(fd) for
> > >> VDUSE then fetching it via an VDUSE_GET_DEVICE_FD ioctl?
> > >>
> > > Yes, we can. Actually the current implementation in VDUSE is like
> > > this.  But seems like this is still a intermediate step. The fd
> > > should be binded to a name or something else which need to be
> > > configured before.
> >
> >
> > The name could be specified via the netlink. It looks to me the real
> > issue is that until the device is connected with a userspace, it can't
> > be used. So we also need to fail the enabling if it doesn't opened.
> >
> 
> Yes, that's true. So you mean we can firstly try to fetch the fd binded to a
> name/vduse_id via an VDUSE_GET_DEVICE_FD, then use the
> name/vduse_id as a attribute to create vdpa device? It looks fine to me.

I probably do not understand this well. I tried reading patch [1], and a few things do not look correct, as described below.
Creating the vdpa device on the bus device and destroying the device from the workqueue seem unnecessary and racy.

It seems vduse driver needs 
This is something that should be done as part of the vdpa dev add command, instead of connecting the two sides separately and then having to ensure race-free access to it.

So VDUSE_DEV_START and VDUSE_DEV_STOP should possibly be avoided.

$ vdpa dev add parentdev vduse_mgmtdev type net name foo2

When the above command is executed, it creates the necessary vdpa device foo2 on the bus.
When the user binds the foo2 device to the vduse driver, its probe() creates the respective char device so that foo2 can be accessed from user space.
Depending on which driver the foo2 device is bound to, it can be used via (a) the existing vhost stack, (b) some vdpa netdev driver (not sure of its current state), or (c) vduse user space.

This looks like a sane model to me, without races, unless I am missing something fundamental here.
This way there are not two ways to create vdpa devices from user space.
Consumers of the bus device can be of different types (vhost, vduse, etc.), as mentioned above.
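
The model described above can be sketched roughly as follows. This is a toy Python model, not kernel code: the bus class, the probe callbacks and the device node paths are all hypothetical and only illustrate that creation happens once, via dev add, while the consumer is chosen at bind time.

```python
class VdpaBusModel:
    """Toy model of the vdpa bus; names and paths are hypothetical."""

    def __init__(self):
        self.drivers = {}  # driver name -> probe callback
        self.devices = {}  # device name -> name of bound driver, or None

    def register_driver(self, name, probe):
        self.drivers[name] = probe

    def dev_add(self, dev_name):
        """Stand-in for `vdpa dev add`: the only way a device is created."""
        self.devices[dev_name] = None

    def bind(self, dev_name, driver_name):
        """Bind an existing device to a driver and run that driver's probe()."""
        if dev_name not in self.devices:
            raise LookupError(dev_name)
        node = self.drivers[driver_name](dev_name)
        self.devices[dev_name] = driver_name
        return node


def vduse_probe(dev_name):
    # Hypothetical: probe() creates the char device used by userspace.
    return f"/dev/vduse-{dev_name}"


def vhost_probe(dev_name):
    # Hypothetical node name for the vhost consumer.
    return f"/dev/vhost-{dev_name}"


bus = VdpaBusModel()
bus.register_driver("vduse", vduse_probe)
bus.register_driver("vhost", vhost_probe)

bus.dev_add("foo2")                # device exists, no consumer yet
node = bus.bind("foo2", "vduse")   # consumer chosen at bind time
```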

[1] https://www.spinics.net/lists/linux-mm/msg231581.html



* Re: [External] Re: [PATCH 0/7] Introduce vdpa management tool
  2020-12-01 11:32             ` Parav Pandit
@ 2020-12-01 14:18               ` Yongji Xie
  2020-12-01 15:58                 ` Parav Pandit
  0 siblings, 1 reply; 79+ messages in thread
From: Yongji Xie @ 2020-12-01 14:18 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Jason Wang, virtualization, Michael S. Tsirkin, Eli Cohen, netdev

On Tue, Dec 1, 2020 at 7:32 PM Parav Pandit <parav@nvidia.com> wrote:
>
>
>
> > From: Yongji Xie <xieyongji@bytedance.com>
> > Sent: Tuesday, December 1, 2020 3:26 PM
> >
> > On Tue, Dec 1, 2020 at 2:25 PM Jason Wang <jasowang@redhat.com> wrote:
> > >
> > >
> > > On 2020/11/30 3:07 PM, Yongji Xie wrote:
> > > >>> Thanks for adding me, Jason!
> > > >>>
> > > >>> Now I'm working on a v2 patchset for VDUSE (vDPA Device in
> > > >>> Userspace) [1]. This tool is very useful for the vduse device. So
> > > >>> I'm considering integrating this into my v2 patchset. But there is
> > > >>> one problem:
> > > >>>
> > > >>> In this tool, vdpa device config action and enable action are
> > > >>> combined into one netlink msg: VDPA_CMD_DEV_NEW. But in vduse
> > > >>> case, it needs to be splitted because a chardev should be created
> > > >>> and opened by a userspace process before we enable the vdpa device
> > > >>> (call vdpa_register_device()).
> > > >>>
> > > >>> So I'd like to know whether it's possible (or have some plans) to
> > > >>> add two new netlink msgs something like: VDPA_CMD_DEV_ENABLE and
> > > >>> VDPA_CMD_DEV_DISABLE to make the config path more flexible.
> > > >>>
> > > >> Actually, we've discussed such intermediate step in some early
> > > >> discussion. It looks to me VDUSE could be one of the users of this.
> > > >>
> > > >> Or I wonder whether we can switch to use anonymous inode(fd) for
> > > >> VDUSE then fetching it via an VDUSE_GET_DEVICE_FD ioctl?
> > > >>
> > > > Yes, we can. Actually the current implementation in VDUSE is like
> > > > this.  But seems like this is still a intermediate step. The fd
> > > > should be binded to a name or something else which need to be
> > > > configured before.
> > >
> > >
> > > The name could be specified via the netlink. It looks to me the real
> > > issue is that until the device is connected with a userspace, it can't
> > > be used. So we also need to fail the enabling if it doesn't opened.
> > >
> >
> > Yes, that's true. So you mean we can firstly try to fetch the fd binded to a
> > name/vduse_id via an VDUSE_GET_DEVICE_FD, then use the
> > name/vduse_id as a attribute to create vdpa device? It looks fine to me.
>
> I probably do not well understand. I tried reading patch [1] and few things do not look correct as below.
> Creating the vdpa device on the bus device and destroying the device from the workqueue seems unnecessary and racy.
>
> It seems vduse driver needs
> This is something should be done as part of the vdpa dev add command, instead of connecting two sides separately and ensuring race free access to it.
>
> So VDUSE_DEV_START and VDUSE_DEV_STOP should possibly be avoided.
>

Yes, we can avoid these two ioctls with the help of the management tool.

> $ vdpa dev add parentdev vduse_mgmtdev type net name foo2
>
> When above command is executed it creates necessary vdpa device foo2 on the bus.
> When user binds foo2 device with the vduse driver, in the probe(), it creates respective char device to access it from user space.

But vduse driver is not a vdpa bus driver. It works like vdpasim
driver, but offloads the data plane and control plane to a user space
process.

Thanks,
Yongji

* RE: [External] Re: [PATCH 0/7] Introduce vdpa management tool
  2020-12-01 14:18               ` Yongji Xie
@ 2020-12-01 15:58                 ` Parav Pandit
  2020-12-02  3:29                   ` Yongji Xie
  0 siblings, 1 reply; 79+ messages in thread
From: Parav Pandit @ 2020-12-01 15:58 UTC (permalink / raw)
  To: Yongji Xie
  Cc: Jason Wang, virtualization, Michael S. Tsirkin, Eli Cohen, netdev



> From: Yongji Xie <xieyongji@bytedance.com>
> Sent: Tuesday, December 1, 2020 7:49 PM
> 
> On Tue, Dec 1, 2020 at 7:32 PM Parav Pandit <parav@nvidia.com> wrote:
> >
> >
> >
> > > From: Yongji Xie <xieyongji@bytedance.com>
> > > Sent: Tuesday, December 1, 2020 3:26 PM
> > >
> > > On Tue, Dec 1, 2020 at 2:25 PM Jason Wang <jasowang@redhat.com>
> wrote:
> > > >
> > > >
> > > > On 2020/11/30 下午3:07, Yongji Xie wrote:
> > > > >>> Thanks for adding me, Jason!
> > > > >>>
> > > > >>> Now I'm working on a v2 patchset for VDUSE (vDPA Device in
> > > > >>> Userspace) [1]. This tool is very useful for the vduse device.
> > > > >>> So I'm considering integrating this into my v2 patchset. But
> > > > >>> there is one problem:
> > > > >>>
> > > > >>> In this tool, vdpa device config action and enable action are
> > > > >>> combined into one netlink msg: VDPA_CMD_DEV_NEW. But in
> vduse
> > > > >>> case, it needs to be splitted because a chardev should be
> > > > >>> created and opened by a userspace process before we enable the
> > > > >>> vdpa device (call vdpa_register_device()).
> > > > >>>
> > > > >>> So I'd like to know whether it's possible (or have some plans)
> > > > >>> to add two new netlink msgs something like:
> > > > >>> VDPA_CMD_DEV_ENABLE
> > > and
> > > > >>> VDPA_CMD_DEV_DISABLE to make the config path more flexible.
> > > > >>>
> > > > >> Actually, we've discussed such intermediate step in some early
> > > > >> discussion. It looks to me VDUSE could be one of the users of this.
> > > > >>
> > > > >> Or I wonder whether we can switch to use anonymous inode(fd)
> > > > >> for VDUSE then fetching it via an VDUSE_GET_DEVICE_FD ioctl?
> > > > >>
> > > > > Yes, we can. Actually the current implementation in VDUSE is
> > > > > like this.  But seems like this is still a intermediate step.
> > > > > The fd should be binded to a name or something else which need
> > > > > to be configured before.
> > > >
> > > >
> > > > The name could be specified via the netlink. It looks to me the
> > > > real issue is that until the device is connected with a userspace,
> > > > it can't be used. So we also need to fail the enabling if it doesn't
> opened.
> > > >
> > >
> > > Yes, that's true. So you mean we can firstly try to fetch the fd
> > > binded to a name/vduse_id via an VDUSE_GET_DEVICE_FD, then use the
> > > name/vduse_id as a attribute to create vdpa device? It looks fine to me.
> >
> > I probably do not well understand. I tried reading patch [1] and few things
> do not look correct as below.
> > Creating the vdpa device on the bus device and destroying the device from
> the workqueue seems unnecessary and racy.
> >
> > It seems vduse driver needs
> > This is something should be done as part of the vdpa dev add command,
> instead of connecting two sides separately and ensuring race free access to
> it.
> >
> > So VDUSE_DEV_START and VDUSE_DEV_STOP should possibly be avoided.
> >
> 
> Yes, we can avoid these two ioctls with the help of the management tool.
> 
> > $ vdpa dev add parentdev vduse_mgmtdev type net name foo2
> >
> > When above command is executed it creates necessary vdpa device foo2
> on the bus.
> > When user binds foo2 device with the vduse driver, in the probe(), it
> creates respective char device to access it from user space.
>
I see. So vduse cannot work with any existing vdpa devices like ifc, mlx5 or netdevsim.
It has its own implementation, similar to fuse, with its own backend of choice.
More below.

> But vduse driver is not a vdpa bus driver. It works like vdpasim driver, but
> offloads the data plane and control plane to a user space process.

In that case, to draw parallels:

1. netdevsim:
(a) creates resources in kernel sw
(b) data path is simulated in the kernel

2. ifc + mlx5 vdpa dev:
(a) creates resources in hw
(b) data path is in hw

3. vduse:
(a) creates resources in userspace sw
(b) data path is in user space,
hence it creates data path resources for user space.
So the char device is created and removed as a result of vdpa device creation and removal.

For example,
$ vdpa dev add parentdev vduse_mgmtdev type net name foo2

The above command will create the char device for user space.

A similar command for ifc/mlx5 would have created a similar channel in hw for the rest of the config commands.
vduse channel = char device, eventfd etc.
ifc/mlx5 hw channel = bar, irq, command interface etc.
netdevsim channel = sw direct calls
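
To make the comparison above concrete, here is a minimal user-space sketch in Python. It is only a model, not kernel code: the class name ParentDev, its methods, and the PARENTS table are hypothetical, and the channel strings are simply the ones listed above. The point it tries to show is that "dev add" creates the vdpa device and its channel together, whatever the parent is.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ParentDev:
    """Hypothetical model of a vdpa parent (management) device."""
    name: str
    channel: List[str]  # what "dev add" sets up to reach the backend
    devices: Dict[str, str] = field(default_factory=dict)

    def dev_add(self, dev_name: str, dev_type: str) -> List[str]:
        """Model of 'vdpa dev add': device and channel appear together."""
        if dev_name in self.devices:
            raise ValueError(f"vdpa device {dev_name} already exists")
        self.devices[dev_name] = dev_type
        return self.channel

    def dev_del(self, dev_name: str) -> None:
        """Model of device removal: the channel goes away with the device."""
        del self.devices[dev_name]


# Channel descriptions are taken from the list above; nothing else is implied.
PARENTS = {
    "netdevsim": ParentDev("netdevsim", ["sw direct calls"]),
    "mlx5": ParentDev("mlx5", ["bar", "irq", "command interface"]),
    "vduse_mgmtdev": ParentDev("vduse_mgmtdev", ["char device", "eventfd"]),
}

# Mirrors: vdpa dev add parentdev vduse_mgmtdev type net name foo2
channel = PARENTS["vduse_mgmtdev"].dev_add("foo2", "net")
print(channel)
```

In this model the vduse case differs from the others only in what the channel consists of, which is the argument being made.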

Does it make sense?

* Re: [External] Re: [PATCH 0/7] Introduce vdpa management tool
  2020-12-01 15:58                 ` Parav Pandit
@ 2020-12-02  3:29                   ` Yongji Xie
  2020-12-02  4:53                     ` Parav Pandit
  0 siblings, 1 reply; 79+ messages in thread
From: Yongji Xie @ 2020-12-02  3:29 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Jason Wang, virtualization, Michael S. Tsirkin, Eli Cohen, netdev

On Tue, Dec 1, 2020 at 11:59 PM Parav Pandit <parav@nvidia.com> wrote:
>
>
>
> > From: Yongji Xie <xieyongji@bytedance.com>
> > Sent: Tuesday, December 1, 2020 7:49 PM
> >
> > On Tue, Dec 1, 2020 at 7:32 PM Parav Pandit <parav@nvidia.com> wrote:
> > >
> > >
> > >
> > > > From: Yongji Xie <xieyongji@bytedance.com>
> > > > Sent: Tuesday, December 1, 2020 3:26 PM
> > > >
> > > > On Tue, Dec 1, 2020 at 2:25 PM Jason Wang <jasowang@redhat.com>
> > wrote:
> > > > >
> > > > >
> > > > > On 2020/11/30 下午3:07, Yongji Xie wrote:
> > > > > >>> Thanks for adding me, Jason!
> > > > > >>>
> > > > > >>> Now I'm working on a v2 patchset for VDUSE (vDPA Device in
> > > > > >>> Userspace) [1]. This tool is very useful for the vduse device.
> > > > > >>> So I'm considering integrating this into my v2 patchset. But
> > > > > >>> there is one problem:
> > > > > >>>
> > > > > >>> In this tool, vdpa device config action and enable action are
> > > > > >>> combined into one netlink msg: VDPA_CMD_DEV_NEW. But in
> > vduse
> > > > > >>> case, it needs to be splitted because a chardev should be
> > > > > >>> created and opened by a userspace process before we enable the
> > > > > >>> vdpa device (call vdpa_register_device()).
> > > > > >>>
> > > > > >>> So I'd like to know whether it's possible (or have some plans)
> > > > > >>> to add two new netlink msgs something like:
> > > > > >>> VDPA_CMD_DEV_ENABLE
> > > > and
> > > > > >>> VDPA_CMD_DEV_DISABLE to make the config path more flexible.
> > > > > >>>
> > > > > >> Actually, we've discussed such intermediate step in some early
> > > > > >> discussion. It looks to me VDUSE could be one of the users of this.
> > > > > >>
> > > > > >> Or I wonder whether we can switch to use anonymous inode(fd)
> > > > > >> for VDUSE then fetching it via an VDUSE_GET_DEVICE_FD ioctl?
> > > > > >>
> > > > > > Yes, we can. Actually the current implementation in VDUSE is
> > > > > > like this.  But seems like this is still a intermediate step.
> > > > > > The fd should be binded to a name or something else which need
> > > > > > to be configured before.
> > > > >
> > > > >
> > > > > The name could be specified via the netlink. It looks to me the
> > > > > real issue is that until the device is connected with a userspace,
> > > > > it can't be used. So we also need to fail the enabling if it doesn't
> > opened.
> > > > >
> > > >
> > > > Yes, that's true. So you mean we can firstly try to fetch the fd
> > > > binded to a name/vduse_id via an VDUSE_GET_DEVICE_FD, then use the
> > > > name/vduse_id as a attribute to create vdpa device? It looks fine to me.
> > >
> > > I probably do not well understand. I tried reading patch [1] and few things
> > do not look correct as below.
> > > Creating the vdpa device on the bus device and destroying the device from
> > the workqueue seems unnecessary and racy.
> > >
> > > It seems vduse driver needs
> > > This is something should be done as part of the vdpa dev add command,
> > instead of connecting two sides separately and ensuring race free access to
> > it.
> > >
> > > So VDUSE_DEV_START and VDUSE_DEV_STOP should possibly be avoided.
> > >
> >
> > Yes, we can avoid these two ioctls with the help of the management tool.
> >
> > > $ vdpa dev add parentdev vduse_mgmtdev type net name foo2
> > >
> > > When above command is executed it creates necessary vdpa device foo2
> > on the bus.
> > > When user binds foo2 device with the vduse driver, in the probe(), it
> > creates respective char device to access it from user space.
> >
> I see. So vduse cannot work with any existing vdpa devices like ifc, mlx5 or netdevsim.
> It has its own implementation similar to fuse with its own backend of choice.
> More below.
>
> > But vduse driver is not a vdpa bus driver. It works like vdpasim driver, but
> > offloads the data plane and control plane to a user space process.
>
> In that case to draw parallel lines,
>
> 1. netdevsim:
> (a) create resources in kernel sw
> (b) datapath simulates in kernel
>
> 2. ifc + mlx5 vdpa dev:
> (a) creates resource in hw
> (b) data path is in hw
>
> 3. vduse:
> (a) creates resources in userspace sw
> (b) data path is in user space.
> hence creates data path resources for user space.
> So char device is created, removed as result of vdpa device creation.
>
> For example,
> $ vdpa dev add parentdev vduse_mgmtdev type net name foo2
>
> Above command will create char device for user space.
>
> Similar command for ifc/mlx5 would have created similar channel for rest of the config commands in hw.
> vduse channel = char device, eventfd etc.
> ifc/mlx5 hw channel = bar, irq, command interface etc
> Netdev sim channel = sw direct calls
>
> Does it make sense?

In my understanding, to make vdpa work, we need a backend (datapath
resources) and a frontend (a vdpa device attached to a vdpa bus). In
the above example, it looks like we use the command "vdpa dev add ..."
to create a backend, so do we need another command to create a
frontend?

Thanks,
Yongji

* RE: [External] Re: [PATCH 0/7] Introduce vdpa management tool
  2020-12-02  3:29                   ` Yongji Xie
@ 2020-12-02  4:53                     ` Parav Pandit
  2020-12-02  5:51                       ` Jason Wang
  2020-12-02  9:21                       ` Yongji Xie
  0 siblings, 2 replies; 79+ messages in thread
From: Parav Pandit @ 2020-12-02  4:53 UTC (permalink / raw)
  To: Yongji Xie
  Cc: Jason Wang, virtualization, Michael S. Tsirkin, Eli Cohen, netdev



> From: Yongji Xie <xieyongji@bytedance.com>
> Sent: Wednesday, December 2, 2020 9:00 AM
> 
> On Tue, Dec 1, 2020 at 11:59 PM Parav Pandit <parav@nvidia.com> wrote:
> >
> >
> >
> > > From: Yongji Xie <xieyongji@bytedance.com>
> > > Sent: Tuesday, December 1, 2020 7:49 PM
> > >
> > > On Tue, Dec 1, 2020 at 7:32 PM Parav Pandit <parav@nvidia.com> wrote:
> > > >
> > > >
> > > >
> > > > > From: Yongji Xie <xieyongji@bytedance.com>
> > > > > Sent: Tuesday, December 1, 2020 3:26 PM
> > > > >
> > > > > On Tue, Dec 1, 2020 at 2:25 PM Jason Wang <jasowang@redhat.com>
> > > wrote:
> > > > > >
> > > > > >
> > > > > > On 2020/11/30 下午3:07, Yongji Xie wrote:
> > > > > > >>> Thanks for adding me, Jason!
> > > > > > >>>
> > > > > > >>> Now I'm working on a v2 patchset for VDUSE (vDPA Device in
> > > > > > >>> Userspace) [1]. This tool is very useful for the vduse device.
> > > > > > >>> So I'm considering integrating this into my v2 patchset.
> > > > > > >>> But there is one problem:
> > > > > > >>>
> > > > > > >>> In this tool, vdpa device config action and enable action
> > > > > > >>> are combined into one netlink msg: VDPA_CMD_DEV_NEW. But
> > > > > > >>> in
> > > vduse
> > > > > > >>> case, it needs to be splitted because a chardev should be
> > > > > > >>> created and opened by a userspace process before we enable
> > > > > > >>> the vdpa device (call vdpa_register_device()).
> > > > > > >>>
> > > > > > >>> So I'd like to know whether it's possible (or have some
> > > > > > >>> plans) to add two new netlink msgs something like:
> > > > > > >>> VDPA_CMD_DEV_ENABLE
> > > > > and
> > > > > > >>> VDPA_CMD_DEV_DISABLE to make the config path more flexible.
> > > > > > >>>
> > > > > > >> Actually, we've discussed such intermediate step in some
> > > > > > >> early discussion. It looks to me VDUSE could be one of the users of
> this.
> > > > > > >>
> > > > > > >> Or I wonder whether we can switch to use anonymous
> > > > > > >> inode(fd) for VDUSE then fetching it via an VDUSE_GET_DEVICE_FD
> ioctl?
> > > > > > >>
> > > > > > > Yes, we can. Actually the current implementation in VDUSE is
> > > > > > > like this.  But seems like this is still a intermediate step.
> > > > > > > The fd should be binded to a name or something else which
> > > > > > > need to be configured before.
> > > > > >
> > > > > >
> > > > > > The name could be specified via the netlink. It looks to me
> > > > > > the real issue is that until the device is connected with a
> > > > > > userspace, it can't be used. So we also need to fail the
> > > > > > enabling if it doesn't
> > > opened.
> > > > > >
> > > > >
> > > > > Yes, that's true. So you mean we can firstly try to fetch the fd
> > > > > binded to a name/vduse_id via an VDUSE_GET_DEVICE_FD, then use
> > > > > the name/vduse_id as a attribute to create vdpa device? It looks fine to
> me.
> > > >
> > > > I probably do not well understand. I tried reading patch [1] and
> > > > few things
> > > do not look correct as below.
> > > > Creating the vdpa device on the bus device and destroying the
> > > > device from
> > > the workqueue seems unnecessary and racy.
> > > >
> > > > It seems vduse driver needs
> > > > This is something should be done as part of the vdpa dev add
> > > > command,
> > > instead of connecting two sides separately and ensuring race free
> > > access to it.
> > > >
> > > > So VDUSE_DEV_START and VDUSE_DEV_STOP should possibly be avoided.
> > > >
> > >
> > > Yes, we can avoid these two ioctls with the help of the management tool.
> > >
> > > > $ vdpa dev add parentdev vduse_mgmtdev type net name foo2
> > > >
> > > > When above command is executed it creates necessary vdpa device
> > > > foo2
> > > on the bus.
> > > > When user binds foo2 device with the vduse driver, in the probe(),
> > > > it
> > > creates respective char device to access it from user space.
> > >
> > I see. So vduse cannot work with any existing vdpa devices like ifc, mlx5 or
> netdevsim.
> > It has its own implementation similar to fuse with its own backend of choice.
> > More below.
> >
> > > But vduse driver is not a vdpa bus driver. It works like vdpasim
> > > driver, but offloads the data plane and control plane to a user space process.
> >
> > In that case to draw parallel lines,
> >
> > 1. netdevsim:
> > (a) create resources in kernel sw
> > (b) datapath simulates in kernel
> >
> > 2. ifc + mlx5 vdpa dev:
> > (a) creates resource in hw
> > (b) data path is in hw
> >
> > 3. vduse:
> > (a) creates resources in userspace sw
> > (b) data path is in user space.
> > hence creates data path resources for user space.
> > So char device is created, removed as result of vdpa device creation.
> >
> > For example,
> > $ vdpa dev add parentdev vduse_mgmtdev type net name foo2
> >
> > Above command will create char device for user space.
> >
> > Similar command for ifc/mlx5 would have created similar channel for rest of
> the config commands in hw.
> > vduse channel = char device, eventfd etc.
> > ifc/mlx5 hw channel = bar, irq, command interface etc Netdev sim
> > channel = sw direct calls
> >
> > Does it make sense?
> 
> In my understanding, to make vdpa work, we need a backend (datapath
> resources) and a frontend (a vdpa device attached to a vdpa bus). In the above
> example, it looks like we use the command "vdpa dev add ..."
>  to create a backend, so do we need another command to create a frontend?
> 
For a block device there is certainly some backend to process the IOs.
Sometimes the backend has to be set up first, before its front end is exposed.
"vdpa dev add" is the front end command, which connects to the backend (implicitly) for a network device.

vhost->vdpa_block_device->backend_io_processor (usr,hw,kernel).

And it needs a way to connect to the backend when one is explicitly specified at creation time.
Something like,
$ vdpa dev add parentdev vdpa_vduse type block name foo3 handle <uuid>
In the above example, a vendor-specific unique handle is passed, based on the backend setup in hardware/user space.

In the 3 examples below, the vdpa block simulator connects to a backend block device or file.

$ vdpa dev add parentdev vdpa_blocksim type block name foo4 blockdev /dev/zero

$ vdpa dev add parentdev vdpa_blocksim type block name foo5 blockdev /dev/sda2 size=100M offset=10M

$ vdpa dev add parentdev vdpa_block filebackend_sim type block name foo6 file /root/file_backend.txt

Or maybe the backend connects when the created vdpa device is bound to the driver.
Can vduse attach to the created vdpa block device through the char device and establish the channel to receive IOs and to set up the block config space?
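
To illustrate the difference between an implicit and an explicitly specified backend, here is a small Python sketch that parses the proposed command lines from this mail. It is only a model: the helper names parse_dev_add() and backend_of() are hypothetical, the attribute names (handle, blockdev, file) are just the ones used in the examples above, and nothing here reflects an existing tool.

```python
from typing import Dict, Optional, Tuple

# Backend attributes used in the examples above (proposal only).
BACKEND_KEYS = ("handle", "blockdev", "file")


def parse_dev_add(cmdline: str) -> Dict[str, str]:
    """Split a proposed 'vdpa dev add' line into attribute/value pairs."""
    tokens = cmdline.split()
    if tokens[:3] != ["vdpa", "dev", "add"]:
        raise ValueError("not a 'vdpa dev add' command")
    attrs: Dict[str, str] = {}
    it = iter(tokens[3:])
    for tok in it:
        if "=" in tok:
            # "size=100M" style attribute
            key, value = tok.split("=", 1)
        else:
            # "name foo5" style attribute: the value is the next token
            key = tok
            nxt = next(it, None)
            if nxt is None:
                raise ValueError(f"attribute {key!r} has no value")
            value = nxt
        attrs[key] = value
    return attrs


def backend_of(attrs: Dict[str, str]) -> Optional[Tuple[str, str]]:
    """Return (kind, value) for an explicit backend, or None if implicit."""
    for key in BACKEND_KEYS:
        if key in attrs:
            return key, attrs[key]
    return None


attrs = parse_dev_add(
    "vdpa dev add parentdev vdpa_blocksim type block name foo5 "
    "blockdev /dev/sda2 size=100M offset=10M"
)
print(attrs["name"], backend_of(attrs))
```

A net device created without any of these attributes yields None from backend_of(), which corresponds to the implicit case.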

> Thanks,
> Yongji

* Re: [External] Re: [PATCH 0/7] Introduce vdpa management tool
  2020-12-01  9:55           ` Yongji Xie
  2020-12-01 11:32             ` Parav Pandit
@ 2020-12-02  5:48             ` Jason Wang
  1 sibling, 0 replies; 79+ messages in thread
From: Jason Wang @ 2020-12-02  5:48 UTC (permalink / raw)
  To: Yongji Xie; +Cc: Parav Pandit, virtualization, Michael S. Tsirkin, elic, netdev


On 2020/12/1 下午5:55, Yongji Xie wrote:
> On Tue, Dec 1, 2020 at 2:25 PM Jason Wang <jasowang@redhat.com> wrote:
>>
>> On 2020/11/30 下午3:07, Yongji Xie wrote:
>>>>> Thanks for adding me, Jason!
>>>>>
>>>>> Now I'm working on a v2 patchset for VDUSE (vDPA Device in Userspace)
>>>>> [1]. This tool is very useful for the vduse device. So I'm considering
>>>>> integrating this into my v2 patchset. But there is one problem:
>>>>>
>>>>> In this tool, vdpa device config action and enable action are combined
>>>>> into one netlink msg: VDPA_CMD_DEV_NEW. But in vduse case, it needs to
>>>>> be splitted because a chardev should be created and opened by a
>>>>> userspace process before we enable the vdpa device (call
>>>>> vdpa_register_device()).
>>>>>
>>>>> So I'd like to know whether it's possible (or have some plans) to add
>>>>> two new netlink msgs something like: VDPA_CMD_DEV_ENABLE and
>>>>> VDPA_CMD_DEV_DISABLE to make the config path more flexible.
>>>>>
>>>> Actually, we've discussed such intermediate step in some early
>>>> discussion. It looks to me VDUSE could be one of the users of this.
>>>>
>>>> Or I wonder whether we can switch to use anonymous inode(fd) for VDUSE
>>>> then fetching it via an VDUSE_GET_DEVICE_FD ioctl?
>>>>
>>> Yes, we can. Actually the current implementation in VDUSE is like
>>> this.  But seems like this is still a intermediate step. The fd should
>>> be binded to a name or something else which need to be configured
>>> before.
>>
>> The name could be specified via the netlink. It looks to me the real
>> issue is that until the device is connected with a userspace, it can't
>> be used. So we also need to fail the enabling if it doesn't opened.
>>
> Yes, that's true. So you mean we can firstly try to fetch the fd
> binded to a name/vduse_id via an VDUSE_GET_DEVICE_FD, then use the
> name/vduse_id as a attribute to create vdpa device? It looks fine to
> me.


Yes, something like this. The anonymous fd will be created during 
dev_add() and the fd will be carried in the msg to userspace.
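
As a rough user-space analogy only, the idea can be sketched in Python. A socketpair stands in for the anonymous fd, and the function name dev_add() and the reply layout are hypothetical. What it tries to show is that the channel exists from the moment the device is created, so there is no window in which the device is present but nothing in userspace holds its fd.

```python
import socket
from typing import Any, Dict, Tuple


def dev_add(name: str) -> Tuple[socket.socket, Dict[str, Any]]:
    """Model of dev_add(): create the device and its anonymous channel at once.

    The "kernel" keeps one end; the other end is carried in the reply
    to the caller, standing in for the fd carried in the msg to userspace.
    """
    kernel_end, user_end = socket.socketpair()
    reply = {"name": name, "fd": user_end}
    return kernel_end, reply


kernel_end, reply = dev_add("foo2")
# The userspace side can talk to the device right away, without a
# separate open() of a named char device.
reply["fd"].sendall(b"ready")
print(reply["name"], kernel_end.recv(16))
```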

Thanks


>
> Thanks,
> Yongji
>


* Re: [External] Re: [PATCH 0/7] Introduce vdpa management tool
  2020-12-02  4:53                     ` Parav Pandit
@ 2020-12-02  5:51                       ` Jason Wang
  2020-12-02  6:24                         ` Parav Pandit
  2020-12-02  9:27                         ` Yongji Xie
  2020-12-02  9:21                       ` Yongji Xie
  1 sibling, 2 replies; 79+ messages in thread
From: Jason Wang @ 2020-12-02  5:51 UTC (permalink / raw)
  To: Parav Pandit, Yongji Xie
  Cc: virtualization, Michael S. Tsirkin, Eli Cohen, netdev


On 2020/12/2 下午12:53, Parav Pandit wrote:
>
>> From: Yongji Xie <xieyongji@bytedance.com>
>> Sent: Wednesday, December 2, 2020 9:00 AM
>>
>> On Tue, Dec 1, 2020 at 11:59 PM Parav Pandit <parav@nvidia.com> wrote:
>>>
>>>
>>>> From: Yongji Xie <xieyongji@bytedance.com>
>>>> Sent: Tuesday, December 1, 2020 7:49 PM
>>>>
>>>> On Tue, Dec 1, 2020 at 7:32 PM Parav Pandit <parav@nvidia.com> wrote:
>>>>>
>>>>>
>>>>>> From: Yongji Xie <xieyongji@bytedance.com>
>>>>>> Sent: Tuesday, December 1, 2020 3:26 PM
>>>>>>
>>>>>> On Tue, Dec 1, 2020 at 2:25 PM Jason Wang <jasowang@redhat.com>
>>>> wrote:
>>>>>>>
>>>>>>> On 2020/11/30 下午3:07, Yongji Xie wrote:
>>>>>>>>>> Thanks for adding me, Jason!
>>>>>>>>>>
>>>>>>>>>> Now I'm working on a v2 patchset for VDUSE (vDPA Device in
>>>>>>>>>> Userspace) [1]. This tool is very useful for the vduse device.
>>>>>>>>>> So I'm considering integrating this into my v2 patchset.
>>>>>>>>>> But there is one problem:
>>>>>>>>>>
>>>>>>>>>> In this tool, vdpa device config action and enable action
>>>>>>>>>> are combined into one netlink msg: VDPA_CMD_DEV_NEW. But
>>>>>>>>>> in
>>>> vduse
>>>>>>>>>> case, it needs to be splitted because a chardev should be
>>>>>>>>>> created and opened by a userspace process before we enable
>>>>>>>>>> the vdpa device (call vdpa_register_device()).
>>>>>>>>>>
>>>>>>>>>> So I'd like to know whether it's possible (or have some
>>>>>>>>>> plans) to add two new netlink msgs something like:
>>>>>>>>>> VDPA_CMD_DEV_ENABLE
>>>>>> and
>>>>>>>>>> VDPA_CMD_DEV_DISABLE to make the config path more flexible.
>>>>>>>>>>
>>>>>>>>> Actually, we've discussed such intermediate step in some
>>>>>>>>> early discussion. It looks to me VDUSE could be one of the users of
>> this.
>>>>>>>>> Or I wonder whether we can switch to use anonymous
>>>>>>>>> inode(fd) for VDUSE then fetching it via an VDUSE_GET_DEVICE_FD
>> ioctl?
>>>>>>>> Yes, we can. Actually the current implementation in VDUSE is
>>>>>>>> like this.  But seems like this is still a intermediate step.
>>>>>>>> The fd should be binded to a name or something else which
>>>>>>>> need to be configured before.
>>>>>>>
>>>>>>> The name could be specified via the netlink. It looks to me
>>>>>>> the real issue is that until the device is connected with a
>>>>>>> userspace, it can't be used. So we also need to fail the
>>>>>>> enabling if it doesn't
>>>> opened.
>>>>>> Yes, that's true. So you mean we can firstly try to fetch the fd
>>>>>> binded to a name/vduse_id via an VDUSE_GET_DEVICE_FD, then use
>>>>>> the name/vduse_id as a attribute to create vdpa device? It looks fine to
>> me.
>>>>> I probably do not well understand. I tried reading patch [1] and
>>>>> few things
>>>> do not look correct as below.
>>>>> Creating the vdpa device on the bus device and destroying the
>>>>> device from
>>>> the workqueue seems unnecessary and racy.
>>>>> It seems vduse driver needs
>>>>> This is something should be done as part of the vdpa dev add
>>>>> command,
>>>> instead of connecting two sides separately and ensuring race free
>>>> access to it.
>>>>> So VDUSE_DEV_START and VDUSE_DEV_STOP should possibly be avoided.
>>>>>
>>>> Yes, we can avoid these two ioctls with the help of the management tool.
>>>>
>>>>> $ vdpa dev add parentdev vduse_mgmtdev type net name foo2
>>>>>
>>>>> When above command is executed it creates necessary vdpa device
>>>>> foo2
>>>> on the bus.
>>>>> When user binds foo2 device with the vduse driver, in the probe(),
>>>>> it
>>>> creates respective char device to access it from user space.
>>>>
>>> I see. So vduse cannot work with any existing vdpa devices like ifc, mlx5 or
>> netdevsim.
>>> It has its own implementation similar to fuse with its own backend of choice.
>>> More below.
>>>
>>>> But vduse driver is not a vdpa bus driver. It works like vdpasim
>>>> driver, but offloads the data plane and control plane to a user space process.
>>> In that case to draw parallel lines,
>>>
>>> 1. netdevsim:
>>> (a) create resources in kernel sw
>>> (b) datapath simulates in kernel
>>>
>>> 2. ifc + mlx5 vdpa dev:
>>> (a) creates resource in hw
>>> (b) data path is in hw
>>>
>>> 3. vduse:
>>> (a) creates resources in userspace sw
>>> (b) data path is in user space.
>>> hence creates data path resources for user space.
>>> So char device is created, removed as result of vdpa device creation.
>>>
>>> For example,
>>> $ vdpa dev add parentdev vduse_mgmtdev type net name foo2
>>>
>>> Above command will create char device for user space.
>>>
>>> Similar command for ifc/mlx5 would have created similar channel for rest of
>> the config commands in hw.
>>> vduse channel = char device, eventfd etc.
>>> ifc/mlx5 hw channel = bar, irq, command interface etc Netdev sim
>>> channel = sw direct calls
>>>
>>> Does it make sense?
>> In my understanding, to make vdpa work, we need a backend (datapath
>> resources) and a frontend (a vdpa device attached to a vdpa bus). In the above
>> example, it looks like we use the command "vdpa dev add ..."
>>   to create a backend, so do we need another command to create a frontend?
>>
> For block device there is certainly some backend to process the IOs.
> Sometimes backend to be setup first, before its front end is exposed.
> "vdpa dev add" is the front end command who connects to the backend (implicitly) for network device.
>
> vhost->vdpa_block_device->backend_io_processor (usr,hw,kernel).
>
> And it needs a way to connect to backend when explicitly specified during creation time.
> Something like,
> $ vdpa dev add parentdev vdpa_vduse type block name foo3 handle <uuid>
> In above example some vendor device specific unique handle is passed based on backend setup in hardware/user space.
>
> In below 3 examples, vdpa block simulator is connecting to backend block or file.
>
> $ vdpa dev add parentdev vdpa_blocksim type block name foo4 blockdev /dev/zero
>
> $ vdpa dev add parentdev vdpa_blocksim type block name foo5 blockdev /dev/sda2 size=100M offset=10M
>
> $ vdpa dev add parentdev vdpa_block filebackend_sim type block name foo6 file /root/file_backend.txt
>
> Or may be backend connects to the created vdpa device is bound to the driver.
> Can vduse attach to the created vdpa block device through the char device and establish the channel to receive IOs, and to setup the block config space?


I think it can work.

Another thing I wonder is whether we should consider more than one VDUSE
parentdev (or management dev). This would allow us to have separate devices
implemented via different processes.

If yes, the VDUSE ioctl needs to be extended to register/unregister a parentdev.
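
A minimal sketch of what such register/unregister bookkeeping could look like, in Python for brevity. Everything here is hypothetical: the class name, the method names, and in particular the rule that only the registering process may unregister its parentdev are assumptions for illustration, not something proposed in the patches.

```python
from typing import Dict, List


class ParentDevRegistry:
    """Hypothetical table of VDUSE parentdevs, one owner process each."""

    def __init__(self) -> None:
        self._owner: Dict[str, int] = {}

    def register(self, name: str, pid: int) -> None:
        """Add a parentdev owned by process `pid`; names must be unique."""
        if name in self._owner:
            raise FileExistsError(name)
        self._owner[name] = pid

    def unregister(self, name: str, pid: int) -> None:
        """Remove a parentdev; assumed to be allowed only for its owner."""
        if name not in self._owner:
            raise KeyError(name)
        if self._owner[name] != pid:
            raise PermissionError(name)
        del self._owner[name]

    def parents(self) -> List[str]:
        """Names of all registered parentdevs, sorted for stable output."""
        return sorted(self._owner)


registry = ParentDevRegistry()
registry.register("vduse_net", pid=100)    # one process implements net
registry.register("vduse_block", pid=200)  # another implements block
print(registry.parents())
```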

Thanks


>
>> Thanks,
>> Yongji


* RE: [External] Re: [PATCH 0/7] Introduce vdpa management tool
  2020-12-02  5:51                       ` Jason Wang
@ 2020-12-02  6:24                         ` Parav Pandit
  2020-12-02  7:55                           ` Jason Wang
  2020-12-02  9:27                         ` Yongji Xie
  1 sibling, 1 reply; 79+ messages in thread
From: Parav Pandit @ 2020-12-02  6:24 UTC (permalink / raw)
  To: Jason Wang, Yongji Xie
  Cc: virtualization, Michael S. Tsirkin, Eli Cohen, netdev



> From: Jason Wang <jasowang@redhat.com>
> Sent: Wednesday, December 2, 2020 11:21 AM
> 
> On 2020/12/2 下午12:53, Parav Pandit wrote:
> >
> >> From: Yongji Xie <xieyongji@bytedance.com>
> >> Sent: Wednesday, December 2, 2020 9:00 AM
> >>
> >> On Tue, Dec 1, 2020 at 11:59 PM Parav Pandit <parav@nvidia.com> wrote:
> >>>
> >>>
> >>>> From: Yongji Xie <xieyongji@bytedance.com>
> >>>> Sent: Tuesday, December 1, 2020 7:49 PM
> >>>>
> >>>> On Tue, Dec 1, 2020 at 7:32 PM Parav Pandit <parav@nvidia.com>
> wrote:
> >>>>>
> >>>>>
> >>>>>> From: Yongji Xie <xieyongji@bytedance.com>
> >>>>>> Sent: Tuesday, December 1, 2020 3:26 PM
> >>>>>>
> >>>>>> On Tue, Dec 1, 2020 at 2:25 PM Jason Wang
> <jasowang@redhat.com>
> >>>> wrote:
> >>>>>>>
> >>>>>>> On 2020/11/30 下午3:07, Yongji Xie wrote:
> >>>>>>>>>> Thanks for adding me, Jason!
> >>>>>>>>>>
> >>>>>>>>>> Now I'm working on a v2 patchset for VDUSE (vDPA Device in
> >>>>>>>>>> Userspace) [1]. This tool is very useful for the vduse device.
> >>>>>>>>>> So I'm considering integrating this into my v2 patchset.
> >>>>>>>>>> But there is one problem:
> >>>>>>>>>>
> >>>>>>>>>> In this tool, vdpa device config action and enable action are
> >>>>>>>>>> combined into one netlink msg: VDPA_CMD_DEV_NEW. But in
> >>>> vduse
> >>>>>>>>>> case, it needs to be splitted because a chardev should be
> >>>>>>>>>> created and opened by a userspace process before we enable
> >>>>>>>>>> the vdpa device (call vdpa_register_device()).
> >>>>>>>>>>
> >>>>>>>>>> So I'd like to know whether it's possible (or have some
> >>>>>>>>>> plans) to add two new netlink msgs something like:
> >>>>>>>>>> VDPA_CMD_DEV_ENABLE
> >>>>>> and
> >>>>>>>>>> VDPA_CMD_DEV_DISABLE to make the config path more
> flexible.
> >>>>>>>>>>
> >>>>>>>>> Actually, we've discussed such intermediate step in some early
> >>>>>>>>> discussion. It looks to me VDUSE could be one of the users of
> >> this.
> >>>>>>>>> Or I wonder whether we can switch to use anonymous
> >>>>>>>>> inode(fd) for VDUSE then fetching it via an
> >>>>>>>>> VDUSE_GET_DEVICE_FD
> >> ioctl?
> >>>>>>>> Yes, we can. Actually the current implementation in VDUSE is
> >>>>>>>> like this.  But seems like this is still a intermediate step.
> >>>>>>>> The fd should be binded to a name or something else which need
> >>>>>>>> to be configured before.
> >>>>>>>
> >>>>>>> The name could be specified via the netlink. It looks to me the
> >>>>>>> real issue is that until the device is connected with a
> >>>>>>> userspace, it can't be used. So we also need to fail the
> >>>>>>> enabling if it doesn't
> >>>> opened.
> >>>>>> Yes, that's true. So you mean we can firstly try to fetch the fd
> >>>>>> binded to a name/vduse_id via an VDUSE_GET_DEVICE_FD, then
> use
> >>>>>> the name/vduse_id as a attribute to create vdpa device? It looks
> >>>>>> fine to
> >> me.
> >>>>> I probably do not well understand. I tried reading patch [1] and
> >>>>> few things
> >>>> do not look correct as below.
> >>>>> Creating the vdpa device on the bus device and destroying the
> >>>>> device from
> >>>> the workqueue seems unnecessary and racy.
> >>>>> It seems vduse driver needs
> >>>>> This is something should be done as part of the vdpa dev add
> >>>>> command,
> >>>> instead of connecting two sides separately and ensuring race free
> >>>> access to it.
> >>>>> So VDUSE_DEV_START and VDUSE_DEV_STOP should possibly be
> avoided.
> >>>>>
> >>>> Yes, we can avoid these two ioctls with the help of the management
> tool.
> >>>>
> >>>>> $ vdpa dev add parentdev vduse_mgmtdev type net name foo2
> >>>>>
> >>>>> When above command is executed it creates necessary vdpa device
> >>>>> foo2
> >>>> on the bus.
> >>>>> When user binds foo2 device with the vduse driver, in the probe(),
> >>>>> it
> >>>> creates respective char device to access it from user space.
> >>>>
> >>> I see. So vduse cannot work with any existing vdpa devices like ifc,
> >>> mlx5 or
> >> netdevsim.
> >>> It has its own implementation similar to fuse with its own backend of
> choice.
> >>> More below.
> >>>
> >>>> But vduse driver is not a vdpa bus driver. It works like vdpasim
> >>>> driver, but offloads the data plane and control plane to a user space
> process.
> >>> In that case to draw parallel lines,
> >>>
> >>> 1. netdevsim:
> >>> (a) create resources in kernel sw
> >>> (b) datapath simulates in kernel
> >>>
> >>> 2. ifc + mlx5 vdpa dev:
> >>> (a) creates resource in hw
> >>> (b) data path is in hw
> >>>
> >>> 3. vduse:
> >>> (a) creates resources in userspace sw
> >>> (b) data path is in user space.
> >>> hence creates data path resources for user space.
> >>> So char device is created, removed as result of vdpa device creation.
> >>>
> >>> For example,
> >>> $ vdpa dev add parentdev vduse_mgmtdev type net name foo2
> >>>
> >>> Above command will create char device for user space.
> >>>
> >>> Similar command for ifc/mlx5 would have created similar channel for
> >>> rest of
> >> the config commands in hw.
> >>> vduse channel = char device, eventfd etc.
> >>> ifc/mlx5 hw channel = bar, irq, command interface etc Netdev sim
> >>> channel = sw direct calls
> >>>
> >>> Does it make sense?
> >> In my understanding, to make vdpa work, we need a backend (datapath
> >> resources) and a frontend (a vdpa device attached to a vdpa bus). In
> >> the above example, it looks like we use the command "vdpa dev add ..."
> >>   to create a backend, so do we need another command to create a
> frontend?
> >>
> > For block device there is certainly some backend to process the IOs.
> > Sometimes backend to be setup first, before its front end is exposed.
> > "vdpa dev add" is the front end command who connects to the backend
> (implicitly) for network device.
> >
> > vhost->vdpa_block_device->backend_io_processor (usr,hw,kernel).
> >
> > And it needs a way to connect to backend when explicitly specified during
> creation time.
> > Something like,
> > $ vdpa dev add parentdev vdpa_vduse type block name foo3 handle
> <uuid>
> > In above example some vendor device specific unique handle is passed
> based on backend setup in hardware/user space.
> >
> > In below 3 examples, vdpa block simulator is connecting to backend block
> or file.
> >
> > $ vdpa dev add parentdev vdpa_blocksim type block name foo4 blockdev
> > /dev/zero
> >
> > $ vdpa dev add parentdev vdpa_blocksim type block name foo5 blockdev
> > /dev/sda2 size=100M offset=10M
> >
> > $ vdpa dev add parentdev vdpa_block filebackend_sim type block name
> > foo6 file /root/file_backend.txt
> >
> > Or may be backend connects to the created vdpa device is bound to the
> driver.
> > Can vduse attach to the created vdpa block device through the char device
> and establish the channel to receive IOs, and to setup the block config space?
> 
> 
> I think it can work.
> 
> Another thing I wonder it that, do we consider more than one VDUSE
> parentdev(or management dev)? This allows us to have separated devices
> implemented via different processes.
Multiple parentdevs should be possible per driver. For example,
mlx5_vdpa.ko will create multiple parent devs, one for each PCI VF or SF.
"vdpa dev add" can certainly use one parent/mgmt dev to create multiple
vdpa devices, so I am not sure why we need multiple parent devs for that.
I guess there is just one parent/mgmt dev for VDUSE. What would each
mgmtdev do differently?
Would the demux of IOs and events be done at the individual char dev level?

> 
> If yes, VDUSE ioctl needs to be extended to register/unregister parentdev.
> 
> Thanks
> 
> 
> >
> >> Thanks,
> >> Yongji



* Re: [External] Re: [PATCH 0/7] Introduce vdpa management tool
  2020-12-02  6:24                         ` Parav Pandit
@ 2020-12-02  7:55                           ` Jason Wang
  0 siblings, 0 replies; 79+ messages in thread
From: Jason Wang @ 2020-12-02  7:55 UTC (permalink / raw)
  To: Parav Pandit, Yongji Xie
  Cc: virtualization, Michael S. Tsirkin, Eli Cohen, netdev


On 2020/12/2 2:24 PM, Parav Pandit wrote:
>
>> From: Jason Wang <jasowang@redhat.com>
>> Sent: Wednesday, December 2, 2020 11:21 AM
>>
>> On 2020/12/2 12:53 PM, Parav Pandit wrote:
>>>> From: Yongji Xie <xieyongji@bytedance.com>
>>>> Sent: Wednesday, December 2, 2020 9:00 AM
>>>>
>>>> On Tue, Dec 1, 2020 at 11:59 PM Parav Pandit <parav@nvidia.com> wrote:
>>>>>
>>>>>> From: Yongji Xie <xieyongji@bytedance.com>
>>>>>> Sent: Tuesday, December 1, 2020 7:49 PM
>>>>>>
>>>>>> On Tue, Dec 1, 2020 at 7:32 PM Parav Pandit <parav@nvidia.com>
>> wrote:
>>>>>>>
>>>>>>>> From: Yongji Xie <xieyongji@bytedance.com>
>>>>>>>> Sent: Tuesday, December 1, 2020 3:26 PM
>>>>>>>>
>>>>>>>> On Tue, Dec 1, 2020 at 2:25 PM Jason Wang
>> <jasowang@redhat.com>
>>>>>> wrote:
>>>>>>>>> On 2020/11/30 3:07 PM, Yongji Xie wrote:
>>>>>>>>>>>> Thanks for adding me, Jason!
>>>>>>>>>>>>
>>>>>>>>>>>> Now I'm working on a v2 patchset for VDUSE (vDPA Device in
>>>>>>>>>>>> Userspace) [1]. This tool is very useful for the vduse device.
>>>>>>>>>>>> So I'm considering integrating this into my v2 patchset.
>>>>>>>>>>>> But there is one problem:
>>>>>>>>>>>>
>>>>>>>>>>>> In this tool, vdpa device config action and enable action are
>>>>>>>>>>>> combined into one netlink msg: VDPA_CMD_DEV_NEW. But in
>>>>>> vduse
>>>>>>>>>>>> case, it needs to be splitted because a chardev should be
>>>>>>>>>>>> created and opened by a userspace process before we enable
>>>>>>>>>>>> the vdpa device (call vdpa_register_device()).
>>>>>>>>>>>>
>>>>>>>>>>>> So I'd like to know whether it's possible (or have some
>>>>>>>>>>>> plans) to add two new netlink msgs something like:
>>>>>>>>>>>> VDPA_CMD_DEV_ENABLE
>>>>>>>> and
>>>>>>>>>>>> VDPA_CMD_DEV_DISABLE to make the config path more
>> flexible.
>>>>>>>>>>> Actually, we've discussed such intermediate step in some early
>>>>>>>>>>> discussion. It looks to me VDUSE could be one of the users of
>>>> this.
>>>>>>>>>>> Or I wonder whether we can switch to use anonymous
>>>>>>>>>>> inode(fd) for VDUSE then fetching it via an
>>>>>>>>>>> VDUSE_GET_DEVICE_FD
>>>> ioctl?
>>>>>>>>>> Yes, we can. Actually the current implementation in VDUSE is
>>>>>>>>>> like this.  But seems like this is still a intermediate step.
>>>>>>>>>> The fd should be binded to a name or something else which need
>>>>>>>>>> to be configured before.
>>>>>>>>> The name could be specified via the netlink. It looks to me the
>>>>>>>>> real issue is that until the device is connected with a
>>>>>>>>> userspace, it can't be used. So we also need to fail the
>>>>>>>>> enabling if it doesn't
>>>>>> opened.
>>>>>>>> Yes, that's true. So you mean we can firstly try to fetch the fd
>>>>>>>> binded to a name/vduse_id via an VDUSE_GET_DEVICE_FD, then
>> use
>>>>>>>> the name/vduse_id as a attribute to create vdpa device? It looks
>>>>>>>> fine to
>>>> me.
>>>>>>> I probably do not well understand. I tried reading patch [1] and
>>>>>>> few things
>>>>>> do not look correct as below.
>>>>>>> Creating the vdpa device on the bus device and destroying the
>>>>>>> device from
>>>>>> the workqueue seems unnecessary and racy.
>>>>>>> It seems vduse driver needs
>>>>>>> This is something should be done as part of the vdpa dev add
>>>>>>> command,
>>>>>> instead of connecting two sides separately and ensuring race free
>>>>>> access to it.
>>>>>>> So VDUSE_DEV_START and VDUSE_DEV_STOP should possibly be
>> avoided.
>>>>>> Yes, we can avoid these two ioctls with the help of the management
>> tool.
>>>>>>> $ vdpa dev add parentdev vduse_mgmtdev type net name foo2
>>>>>>>
>>>>>>> When above command is executed it creates necessary vdpa device
>>>>>>> foo2
>>>>>> on the bus.
>>>>>>> When user binds foo2 device with the vduse driver, in the probe(),
>>>>>>> it
>>>>>> creates respective char device to access it from user space.
>>>>>>
>>>>> I see. So vduse cannot work with any existing vdpa devices like ifc,
>>>>> mlx5 or
>>>> netdevsim.
>>>>> It has its own implementation similar to fuse with its own backend of
>> choice.
>>>>> More below.
>>>>>
>>>>>> But vduse driver is not a vdpa bus driver. It works like vdpasim
>>>>>> driver, but offloads the data plane and control plane to a user space
>> process.
>>>>> In that case to draw parallel lines,
>>>>>
>>>>> 1. netdevsim:
>>>>> (a) create resources in kernel sw
>>>>> (b) datapath simulates in kernel
>>>>>
>>>>> 2. ifc + mlx5 vdpa dev:
>>>>> (a) creates resource in hw
>>>>> (b) data path is in hw
>>>>>
>>>>> 3. vduse:
>>>>> (a) creates resources in userspace sw
>>>>> (b) data path is in user space.
>>>>> hence creates data path resources for user space.
>>>>> So char device is created, removed as result of vdpa device creation.
>>>>>
>>>>> For example,
>>>>> $ vdpa dev add parentdev vduse_mgmtdev type net name foo2
>>>>>
>>>>> Above command will create char device for user space.
>>>>>
>>>>> Similar command for ifc/mlx5 would have created similar channel for
>>>>> rest of
>>>> the config commands in hw.
>>>>> vduse channel = char device, eventfd etc.
>>>>> ifc/mlx5 hw channel = bar, irq, command interface etc Netdev sim
>>>>> channel = sw direct calls
>>>>>
>>>>> Does it make sense?
>>>> In my understanding, to make vdpa work, we need a backend (datapath
>>>> resources) and a frontend (a vdpa device attached to a vdpa bus). In
>>>> the above example, it looks like we use the command "vdpa dev add ..."
>>>>    to create a backend, so do we need another command to create a
>> frontend?
>>> For block device there is certainly some backend to process the IOs.
>>> Sometimes backend to be setup first, before its front end is exposed.
>>> "vdpa dev add" is the front end command who connects to the backend
>> (implicitly) for network device.
>>> vhost->vdpa_block_device->backend_io_processor (usr,hw,kernel).
>>>
>>> And it needs a way to connect to backend when explicitly specified during
>> creation time.
>>> Something like,
>>> $ vdpa dev add parentdev vdpa_vduse type block name foo3 handle
>> <uuid>
>>> In above example some vendor device specific unique handle is passed
>> based on backend setup in hardware/user space.
>>> In below 3 examples, vdpa block simulator is connecting to backend block
>> or file.
>>> $ vdpa dev add parentdev vdpa_blocksim type block name foo4 blockdev
>>> /dev/zero
>>>
>>> $ vdpa dev add parentdev vdpa_blocksim type block name foo5 blockdev
>>> /dev/sda2 size=100M offset=10M
>>>
>>> $ vdpa dev add parentdev vdpa_block filebackend_sim type block name
>>> foo6 file /root/file_backend.txt
>>>
>>> Or may be backend connects to the created vdpa device is bound to the
>> driver.
>>> Can vduse attach to the created vdpa block device through the char device
>> and establish the channel to receive IOs, and to setup the block config space?
>>
>>
>> I think it can work.
>>
>> Another thing I wonder it that, do we consider more than one VDUSE
>> parentdev(or management dev)? This allows us to have separated devices
>> implemented via different processes.
> Multiple parentdevs should be possible per driver. For example,
> mlx5_vdpa.ko will create multiple parent devs, one for each PCI VF or SF.
> "vdpa dev add" can certainly use one parent/mgmt dev to create multiple
> vdpa devices, so I am not sure why we need multiple parent devs for that.
> I guess there is just one parent/mgmt dev for VDUSE. What would each
> mgmtdev do differently?
> Would the demux of IOs and events be done at the individual char dev level?


It could be similar to how it works for different hardware vendors,
e.g. IFCVF and mlx5 register different parentdevs. For userspace, we
need to allow different software vendors to manage their instances
individually.
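
As a rough illustration only (this is not code from the patchset, and
every class, method and device name below is hypothetical), the idea can
be modeled as a registry in which each software vendor registers its own
parentdev and creates vdpa instances under it. The register/unregister
steps stand in for whatever the extended VDUSE ioctl would provide:

```python
class ParentDev:
    """Hypothetical model of one parentdev (management device)."""

    def __init__(self, name, supported_types):
        self.name = name
        self.supported_types = set(supported_types)
        self.devices = {}  # vdpa device name -> device type

    def dev_add(self, dev_name, dev_type):
        # Each parentdev only creates the device classes it supports.
        if dev_type not in self.supported_types:
            raise ValueError(f"{self.name}: unsupported type {dev_type!r}")
        if dev_name in self.devices:
            raise ValueError(f"{self.name}: {dev_name!r} already exists")
        self.devices[dev_name] = dev_type


class ParentDevRegistry:
    """Hypothetical model of parentdev registration and lookup."""

    def __init__(self):
        self.parents = {}  # parentdev name -> ParentDev

    def register(self, parent):
        if parent.name in self.parents:
            raise ValueError(f"parentdev {parent.name!r} already registered")
        self.parents[parent.name] = parent

    def unregister(self, name):
        # In this sketch, removing a parentdev drops its instances too.
        return self.parents.pop(name)

    def dev_add(self, parent_name, dev_name, dev_type):
        # Models: vdpa dev add parentdev <parent> type <type> name <name>
        self.parents[parent_name].dev_add(dev_name, dev_type)


registry = ParentDevRegistry()
registry.register(ParentDev("vduse_vendor_a", ["net"]))
registry.register(ParentDev("vduse_vendor_b", ["net", "block"]))
registry.dev_add("vduse_vendor_a", "foo2", "net")
registry.dev_add("vduse_vendor_b", "foo3", "block")
```

Each vendor's process would then see and manage only the instances
created under its own parentdev.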

Thanks


>
>> If yes, VDUSE ioctl needs to be extended to register/unregister parentdev.
>>
>> Thanks
>>
>>
>>>> Thanks,
>>>> Yongji



* Re: [External] Re: [PATCH 0/7] Introduce vdpa management tool
  2020-12-02  4:53                     ` Parav Pandit
  2020-12-02  5:51                       ` Jason Wang
@ 2020-12-02  9:21                       ` Yongji Xie
  2020-12-02 11:13                         ` Parav Pandit
  1 sibling, 1 reply; 79+ messages in thread
From: Yongji Xie @ 2020-12-02  9:21 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Jason Wang, virtualization, Michael S. Tsirkin, Eli Cohen, netdev

On Wed, Dec 2, 2020 at 12:53 PM Parav Pandit <parav@nvidia.com> wrote:
>
>
>
> > From: Yongji Xie <xieyongji@bytedance.com>
> > Sent: Wednesday, December 2, 2020 9:00 AM
> >
> > On Tue, Dec 1, 2020 at 11:59 PM Parav Pandit <parav@nvidia.com> wrote:
> > >
> > >
> > >
> > > > From: Yongji Xie <xieyongji@bytedance.com>
> > > > Sent: Tuesday, December 1, 2020 7:49 PM
> > > >
> > > > On Tue, Dec 1, 2020 at 7:32 PM Parav Pandit <parav@nvidia.com> wrote:
> > > > >
> > > > >
> > > > >
> > > > > > From: Yongji Xie <xieyongji@bytedance.com>
> > > > > > Sent: Tuesday, December 1, 2020 3:26 PM
> > > > > >
> > > > > > On Tue, Dec 1, 2020 at 2:25 PM Jason Wang <jasowang@redhat.com>
> > > > wrote:
> > > > > > >
> > > > > > >
> > > > > > > > On 2020/11/30 3:07 PM, Yongji Xie wrote:
> > > > > > > >>> Thanks for adding me, Jason!
> > > > > > > >>>
> > > > > > > >>> Now I'm working on a v2 patchset for VDUSE (vDPA Device in
> > > > > > > >>> Userspace) [1]. This tool is very useful for the vduse device.
> > > > > > > >>> So I'm considering integrating this into my v2 patchset.
> > > > > > > >>> But there is one problem:
> > > > > > > >>>
> > > > > > > >>> In this tool, vdpa device config action and enable action
> > > > > > > >>> are combined into one netlink msg: VDPA_CMD_DEV_NEW. But
> > > > > > > >>> in
> > > > vduse
> > > > > > > >>> case, it needs to be splitted because a chardev should be
> > > > > > > >>> created and opened by a userspace process before we enable
> > > > > > > >>> the vdpa device (call vdpa_register_device()).
> > > > > > > >>>
> > > > > > > >>> So I'd like to know whether it's possible (or have some
> > > > > > > >>> plans) to add two new netlink msgs something like:
> > > > > > > >>> VDPA_CMD_DEV_ENABLE
> > > > > > and
> > > > > > > >>> VDPA_CMD_DEV_DISABLE to make the config path more flexible.
> > > > > > > >>>
> > > > > > > >> Actually, we've discussed such intermediate step in some
> > > > > > > >> early discussion. It looks to me VDUSE could be one of the users of
> > this.
> > > > > > > >>
> > > > > > > >> Or I wonder whether we can switch to use anonymous
> > > > > > > >> inode(fd) for VDUSE then fetching it via an VDUSE_GET_DEVICE_FD
> > ioctl?
> > > > > > > >>
> > > > > > > > Yes, we can. Actually the current implementation in VDUSE is
> > > > > > > > like this.  But seems like this is still a intermediate step.
> > > > > > > > The fd should be binded to a name or something else which
> > > > > > > > need to be configured before.
> > > > > > >
> > > > > > >
> > > > > > > The name could be specified via the netlink. It looks to me
> > > > > > > the real issue is that until the device is connected with a
> > > > > > > userspace, it can't be used. So we also need to fail the
> > > > > > > enabling if it doesn't
> > > > opened.
> > > > > > >
> > > > > >
> > > > > > Yes, that's true. So you mean we can firstly try to fetch the fd
> > > > > > binded to a name/vduse_id via an VDUSE_GET_DEVICE_FD, then use
> > > > > > the name/vduse_id as a attribute to create vdpa device? It looks fine to
> > me.
> > > > >
> > > > > I probably do not well understand. I tried reading patch [1] and
> > > > > few things
> > > > do not look correct as below.
> > > > > Creating the vdpa device on the bus device and destroying the
> > > > > device from
> > > > the workqueue seems unnecessary and racy.
> > > > >
> > > > > It seems vduse driver needs
> > > > > This is something should be done as part of the vdpa dev add
> > > > > command,
> > > > instead of connecting two sides separately and ensuring race free
> > > > access to it.
> > > > >
> > > > > So VDUSE_DEV_START and VDUSE_DEV_STOP should possibly be avoided.
> > > > >
> > > >
> > > > Yes, we can avoid these two ioctls with the help of the management tool.
> > > >
> > > > > $ vdpa dev add parentdev vduse_mgmtdev type net name foo2
> > > > >
> > > > > When above command is executed it creates necessary vdpa device
> > > > > foo2
> > > > on the bus.
> > > > > When user binds foo2 device with the vduse driver, in the probe(),
> > > > > it
> > > > creates respective char device to access it from user space.
> > > >
> > > I see. So vduse cannot work with any existing vdpa devices like ifc, mlx5 or
> > netdevsim.
> > > It has its own implementation similar to fuse with its own backend of choice.
> > > More below.
> > >
> > > > But vduse driver is not a vdpa bus driver. It works like vdpasim
> > > > driver, but offloads the data plane and control plane to a user space process.
> > >
> > > In that case to draw parallel lines,
> > >
> > > 1. netdevsim:
> > > (a) create resources in kernel sw
> > > (b) datapath simulates in kernel
> > >
> > > 2. ifc + mlx5 vdpa dev:
> > > (a) creates resource in hw
> > > (b) data path is in hw
> > >
> > > 3. vduse:
> > > (a) creates resources in userspace sw
> > > (b) data path is in user space.
> > > hence creates data path resources for user space.
> > > So char device is created, removed as result of vdpa device creation.
> > >
> > > For example,
> > > $ vdpa dev add parentdev vduse_mgmtdev type net name foo2
> > >
> > > Above command will create char device for user space.
> > >
> > > Similar command for ifc/mlx5 would have created similar channel for rest of
> > the config commands in hw.
> > > vduse channel = char device, eventfd etc.
> > > ifc/mlx5 hw channel = bar, irq, command interface etc Netdev sim
> > > channel = sw direct calls
> > >
> > > Does it make sense?
> >
> > In my understanding, to make vdpa work, we need a backend (datapath
> > resources) and a frontend (a vdpa device attached to a vdpa bus). In the above
> > example, it looks like we use the command "vdpa dev add ..."
> >  to create a backend, so do we need another command to create a frontend?
> >
> For block device there is certainly some backend to process the IOs.
> Sometimes backend to be setup first, before its front end is exposed.

Yes, the backend needs to be set up first. This is vendor-device
specific, not vdpa specific.

> "vdpa dev add" is the front end command who connects to the backend (implicitly) for network device.
>
> vhost->vdpa_block_device->backend_io_processor (usr,hw,kernel).
>
> And it needs a way to connect to backend when explicitly specified during creation time.
> Something like,
> $ vdpa dev add parentdev vdpa_vduse type block name foo3 handle <uuid>
> In above example some vendor device specific unique handle is passed based on backend setup in hardware/user space.
>

Yes, it can work like this. After we set up a backend through an
anonymous inode (fd) obtained from /dev/vduse, we get a unique handle.
We then use that handle to create a frontend, which connects to that
specific backend.
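
To make the ordering concrete, here is a minimal sketch of that flow. It
is a model only: the class, its methods and the use of a UUID as the
handle are assumptions made for illustration, not the VDUSE interface.
The backend is set up first and yields a handle, and creating the
frontend with an unknown handle fails immediately instead of waiting:

```python
import uuid


class VduseModel:
    """Hypothetical model: backend setup first, then frontend creation."""

    def __init__(self):
        self.backends = {}   # handle -> {"type": ..., "frontend": ...}
        self.frontends = {}  # vdpa device name -> handle

    def setup_backend(self, dev_type):
        # Stands in for the userspace process setting up its backend
        # through the fd; the returned handle identifies that backend.
        handle = str(uuid.uuid4())
        self.backends[handle] = {"type": dev_type, "frontend": None}
        return handle

    def dev_add(self, name, dev_type, handle):
        # Models: vdpa dev add ... type <type> name <name> handle <uuid>
        backend = self.backends.get(handle)
        if backend is None:
            # No backend behind this handle: fail right away.
            raise LookupError(f"no backend with handle {handle!r}")
        if backend["type"] != dev_type:
            raise ValueError("backend type does not match requested type")
        if backend["frontend"] is not None:
            raise ValueError("backend already has a frontend")
        if name in self.frontends:
            raise ValueError(f"device {name!r} already exists")
        backend["frontend"] = name
        self.frontends[name] = handle


model = VduseModel()
handle = model.setup_backend("block")
model.dev_add("foo3", "block", handle)
```

Because the handle only exists once the backend is ready, the frontend
creation step never has to wait for a userspace process to show up.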

> In below 3 examples, vdpa block simulator is connecting to backend block or file.
>
> $ vdpa dev add parentdev vdpa_blocksim type block name foo4 blockdev /dev/zero
>
> $ vdpa dev add parentdev vdpa_blocksim type block name foo5 blockdev /dev/sda2 size=100M offset=10M
>
> $ vdpa dev add parentdev vdpa_block filebackend_sim type block name foo6 file /root/file_backend.txt
>
> Or may be backend connects to the created vdpa device is bound to the driver.
> Can vduse attach to the created vdpa block device through the char device and establish the channel to receive IOs, and to setup the block config space?
>

How would the vdpa block device be created? If we use the command
"vdpa dev add ...", the command will hang until a vduse process
attaches to the vdpa block device.

Thanks,
Yongji


* Re: [External] Re: [PATCH 0/7] Introduce vdpa management tool
  2020-12-02  5:51                       ` Jason Wang
  2020-12-02  6:24                         ` Parav Pandit
@ 2020-12-02  9:27                         ` Yongji Xie
  1 sibling, 0 replies; 79+ messages in thread
From: Yongji Xie @ 2020-12-02  9:27 UTC (permalink / raw)
  To: Jason Wang
  Cc: Parav Pandit, virtualization, Michael S. Tsirkin, Eli Cohen, netdev

On Wed, Dec 2, 2020 at 1:51 PM Jason Wang <jasowang@redhat.com> wrote:
>
>
> On 2020/12/2 12:53 PM, Parav Pandit wrote:
> >
> >> From: Yongji Xie <xieyongji@bytedance.com>
> >> Sent: Wednesday, December 2, 2020 9:00 AM
> >>
> >> On Tue, Dec 1, 2020 at 11:59 PM Parav Pandit <parav@nvidia.com> wrote:
> >>>
> >>>
> >>>> From: Yongji Xie <xieyongji@bytedance.com>
> >>>> Sent: Tuesday, December 1, 2020 7:49 PM
> >>>>
> >>>> On Tue, Dec 1, 2020 at 7:32 PM Parav Pandit <parav@nvidia.com> wrote:
> >>>>>
> >>>>>
> >>>>>> From: Yongji Xie <xieyongji@bytedance.com>
> >>>>>> Sent: Tuesday, December 1, 2020 3:26 PM
> >>>>>>
> >>>>>> On Tue, Dec 1, 2020 at 2:25 PM Jason Wang <jasowang@redhat.com>
> >>>> wrote:
> >>>>>>>
> >>>>>>> On 2020/11/30 3:07 PM, Yongji Xie wrote:
> >>>>>>>>>> Thanks for adding me, Jason!
> >>>>>>>>>>
> >>>>>>>>>> Now I'm working on a v2 patchset for VDUSE (vDPA Device in
> >>>>>>>>>> Userspace) [1]. This tool is very useful for the vduse device.
> >>>>>>>>>> So I'm considering integrating this into my v2 patchset.
> >>>>>>>>>> But there is one problem:
> >>>>>>>>>>
> >>>>>>>>>> In this tool, vdpa device config action and enable action
> >>>>>>>>>> are combined into one netlink msg: VDPA_CMD_DEV_NEW. But
> >>>>>>>>>> in
> >>>> vduse
> >>>>>>>>>> case, it needs to be splitted because a chardev should be
> >>>>>>>>>> created and opened by a userspace process before we enable
> >>>>>>>>>> the vdpa device (call vdpa_register_device()).
> >>>>>>>>>>
> >>>>>>>>>> So I'd like to know whether it's possible (or have some
> >>>>>>>>>> plans) to add two new netlink msgs something like:
> >>>>>>>>>> VDPA_CMD_DEV_ENABLE
> >>>>>> and
> >>>>>>>>>> VDPA_CMD_DEV_DISABLE to make the config path more flexible.
> >>>>>>>>>>
> >>>>>>>>> Actually, we've discussed such intermediate step in some
> >>>>>>>>> early discussion. It looks to me VDUSE could be one of the users of
> >> this.
> >>>>>>>>> Or I wonder whether we can switch to use anonymous
> >>>>>>>>> inode(fd) for VDUSE then fetching it via an VDUSE_GET_DEVICE_FD
> >> ioctl?
> >>>>>>>> Yes, we can. Actually the current implementation in VDUSE is
> >>>>>>>> like this.  But seems like this is still a intermediate step.
> >>>>>>>> The fd should be binded to a name or something else which
> >>>>>>>> need to be configured before.
> >>>>>>>
> >>>>>>> The name could be specified via the netlink. It looks to me
> >>>>>>> the real issue is that until the device is connected with a
> >>>>>>> userspace, it can't be used. So we also need to fail the
> >>>>>>> enabling if it doesn't
> >>>> opened.
> >>>>>> Yes, that's true. So you mean we can firstly try to fetch the fd
> >>>>>> binded to a name/vduse_id via an VDUSE_GET_DEVICE_FD, then use
> >>>>>> the name/vduse_id as a attribute to create vdpa device? It looks fine to
> >> me.
> >>>>> I probably do not well understand. I tried reading patch [1] and
> >>>>> few things
> >>>> do not look correct as below.
> >>>>> Creating the vdpa device on the bus device and destroying the
> >>>>> device from
> >>>> the workqueue seems unnecessary and racy.
> >>>>> It seems vduse driver needs
> >>>>> This is something should be done as part of the vdpa dev add
> >>>>> command,
> >>>> instead of connecting two sides separately and ensuring race free
> >>>> access to it.
> >>>>> So VDUSE_DEV_START and VDUSE_DEV_STOP should possibly be avoided.
> >>>>>
> >>>> Yes, we can avoid these two ioctls with the help of the management tool.
> >>>>
> >>>>> $ vdpa dev add parentdev vduse_mgmtdev type net name foo2
> >>>>>
> >>>>> When above command is executed it creates necessary vdpa device
> >>>>> foo2
> >>>> on the bus.
> >>>>> When user binds foo2 device with the vduse driver, in the probe(),
> >>>>> it
> >>>> creates respective char device to access it from user space.
> >>>>
> >>> I see. So vduse cannot work with any existing vdpa devices like ifc, mlx5 or
> >> netdevsim.
> >>> It has its own implementation similar to fuse with its own backend of choice.
> >>> More below.
> >>>
> >>>> But vduse driver is not a vdpa bus driver. It works like vdpasim
> >>>> driver, but offloads the data plane and control plane to a user space process.
> >>> In that case to draw parallel lines,
> >>>
> >>> 1. netdevsim:
> >>> (a) create resources in kernel sw
> >>> (b) datapath simulates in kernel
> >>>
> >>> 2. ifc + mlx5 vdpa dev:
> >>> (a) creates resource in hw
> >>> (b) data path is in hw
> >>>
> >>> 3. vduse:
> >>> (a) creates resources in userspace sw
> >>> (b) data path is in user space.
> >>> hence creates data path resources for user space.
> >>> So char device is created, removed as result of vdpa device creation.
> >>>
> >>> For example,
> >>> $ vdpa dev add parentdev vduse_mgmtdev type net name foo2
> >>>
> >>> Above command will create char device for user space.
> >>>
> >>> Similar command for ifc/mlx5 would have created similar channel for rest of
> >> the config commands in hw.
> >>> vduse channel = char device, eventfd etc.
> >>> ifc/mlx5 hw channel = bar, irq, command interface etc Netdev sim
> >>> channel = sw direct calls
> >>>
> >>> Does it make sense?
> >> In my understanding, to make vdpa work, we need a backend (datapath
> >> resources) and a frontend (a vdpa device attached to a vdpa bus). In the above
> >> example, it looks like we use the command "vdpa dev add ..."
> >>   to create a backend, so do we need another command to create a frontend?
> >>
> > For block device there is certainly some backend to process the IOs.
> > Sometimes backend to be setup first, before its front end is exposed.
> > "vdpa dev add" is the front end command who connects to the backend (implicitly) for network device.
> >
> > vhost->vdpa_block_device->backend_io_processor (usr,hw,kernel).
> >
> > And it needs a way to connect to backend when explicitly specified during creation time.
> > Something like,
> > $ vdpa dev add parentdev vdpa_vduse type block name foo3 handle <uuid>
> > In above example some vendor device specific unique handle is passed based on backend setup in hardware/user space.
> >
> > In below 3 examples, vdpa block simulator is connecting to backend block or file.
> >
> > $ vdpa dev add parentdev vdpa_blocksim type block name foo4 blockdev /dev/zero
> >
> > $ vdpa dev add parentdev vdpa_blocksim type block name foo5 blockdev /dev/sda2 size=100M offset=10M
> >
> > $ vdpa dev add parentdev vdpa_block filebackend_sim type block name foo6 file /root/file_backend.txt
> >
> > Or may be backend connects to the created vdpa device is bound to the driver.
> > Can vduse attach to the created vdpa block device through the char device and establish the channel to receive IOs, and to setup the block config space?
>
>
> I think it can work.
>
> Another thing I wonder it that, do we consider more than one VDUSE
> parentdev(or management dev)? This allows us to have separated devices
> implemented via different processes.
>
> If yes, VDUSE ioctl needs to be extended to register/unregister parentdev.
>

Yes, we need to extend the ioctl to support that. Currently we have
only one parentdev, represented by /dev/vduse.

Thanks,
Yongji


* RE: [External] Re: [PATCH 0/7] Introduce vdpa management tool
  2020-12-02  9:21                       ` Yongji Xie
@ 2020-12-02 11:13                         ` Parav Pandit
  2020-12-02 13:18                           ` Yongji Xie
  0 siblings, 1 reply; 79+ messages in thread
From: Parav Pandit @ 2020-12-02 11:13 UTC (permalink / raw)
  To: Yongji Xie
  Cc: Jason Wang, virtualization, Michael S. Tsirkin, Eli Cohen, netdev



> From: Yongji Xie <xieyongji@bytedance.com>
> Sent: Wednesday, December 2, 2020 2:52 PM
> 
> On Wed, Dec 2, 2020 at 12:53 PM Parav Pandit <parav@nvidia.com> wrote:
> >
> >
> >
> > > From: Yongji Xie <xieyongji@bytedance.com>
> > > Sent: Wednesday, December 2, 2020 9:00 AM
> > >
> > > On Tue, Dec 1, 2020 at 11:59 PM Parav Pandit <parav@nvidia.com> wrote:
> > > >
> > > >
> > > >
> > > > > From: Yongji Xie <xieyongji@bytedance.com>
> > > > > Sent: Tuesday, December 1, 2020 7:49 PM
> > > > >
> > > > > On Tue, Dec 1, 2020 at 7:32 PM Parav Pandit <parav@nvidia.com>
> wrote:
> > > > > >
> > > > > >
> > > > > >
> > > > > > > From: Yongji Xie <xieyongji@bytedance.com>
> > > > > > > Sent: Tuesday, December 1, 2020 3:26 PM
> > > > > > >
> > > > > > > On Tue, Dec 1, 2020 at 2:25 PM Jason Wang
> > > > > > > <jasowang@redhat.com>
> > > > > wrote:
> > > > > > > >
> > > > > > > >
> > > > > > > > > On 2020/11/30 3:07 PM, Yongji Xie wrote:
> > > > > > > > >>> Thanks for adding me, Jason!
> > > > > > > > >>>
> > > > > > > > >>> Now I'm working on a v2 patchset for VDUSE (vDPA
> > > > > > > > >>> Device in
> > > > > > > > >>> Userspace) [1]. This tool is very useful for the vduse device.
> > > > > > > > >>> So I'm considering integrating this into my v2 patchset.
> > > > > > > > >>> But there is one problem:
> > > > > > > > >>>
> > > > > > > > >>> In this tool, vdpa device config action and enable
> > > > > > > > >>> action are combined into one netlink msg:
> > > > > > > > >>> VDPA_CMD_DEV_NEW. But in
> > > > > vduse
> > > > > > > > >>> case, it needs to be splitted because a chardev should
> > > > > > > > >>> be created and opened by a userspace process before we
> > > > > > > > >>> enable the vdpa device (call vdpa_register_device()).
> > > > > > > > >>>
> > > > > > > > >>> So I'd like to know whether it's possible (or have
> > > > > > > > >>> some
> > > > > > > > >>> plans) to add two new netlink msgs something like:
> > > > > > > > >>> VDPA_CMD_DEV_ENABLE
> > > > > > > and
> > > > > > > > >>> VDPA_CMD_DEV_DISABLE to make the config path more
> flexible.
> > > > > > > > >>>
> > > > > > > > >> Actually, we've discussed such intermediate step in
> > > > > > > > >> some early discussion. It looks to me VDUSE could be
> > > > > > > > >> one of the users of
> > > this.
> > > > > > > > >>
> > > > > > > > >> Or I wonder whether we can switch to use anonymous
> > > > > > > > >> inode(fd) for VDUSE then fetching it via an
> > > > > > > > >> VDUSE_GET_DEVICE_FD
> > > ioctl?
> > > > > > > > >>
> > > > > > > > > Yes, we can. Actually the current implementation in
> > > > > > > > > VDUSE is like this.  But seems like this is still a intermediate
> step.
> > > > > > > > > The fd should be binded to a name or something else
> > > > > > > > > which need to be configured before.
> > > > > > > >
> > > > > > > >
> > > > > > > > The name could be specified via the netlink. It looks to
> > > > > > > > me the real issue is that until the device is connected
> > > > > > > > with a userspace, it can't be used. So we also need to
> > > > > > > > fail the enabling if it doesn't
> > > > > opened.
> > > > > > > >
> > > > > > >
> > > > > > > Yes, that's true. So you mean we can firstly try to fetch
> > > > > > > the fd binded to a name/vduse_id via an VDUSE_GET_DEVICE_FD,
> > > > > > > then use the name/vduse_id as a attribute to create vdpa
> > > > > > > device? It looks fine to
> > > me.
> > > > > >
> > > > > > I probably do not well understand. I tried reading patch [1]
> > > > > > and few things
> > > > > do not look correct as below.
> > > > > > Creating the vdpa device on the bus device and destroying the
> > > > > > device from
> > > > > the workqueue seems unnecessary and racy.
> > > > > >
> > > > > > It seems vduse driver needs
> > > > > > This is something should be done as part of the vdpa dev add
> > > > > > command,
> > > > > instead of connecting two sides separately and ensuring race
> > > > > free access to it.
> > > > > >
> > > > > > So VDUSE_DEV_START and VDUSE_DEV_STOP should possibly be
> avoided.
> > > > > >
> > > > >
> > > > > Yes, we can avoid these two ioctls with the help of the management
> tool.
> > > > >
> > > > > > $ vdpa dev add parentdev vduse_mgmtdev type net name foo2
> > > > > >
> > > > > > When above command is executed it creates necessary vdpa
> > > > > > device
> > > > > > foo2
> > > > > on the bus.
> > > > > > When user binds foo2 device with the vduse driver, in the
> > > > > > probe(), it
> > > > > creates respective char device to access it from user space.
> > > > >
> > > > I see. So vduse cannot work with any existing vdpa devices like
> > > > ifc, mlx5 or
> > > netdevsim.
> > > > It has its own implementation similar to fuse with its own backend of
> choice.
> > > > More below.
> > > >
> > > > > But vduse driver is not a vdpa bus driver. It works like vdpasim
> > > > > driver, but offloads the data plane and control plane to a user space
> process.
> > > >
> > > > In that case to draw parallel lines,
> > > >
> > > > 1. netdevsim:
> > > > (a) create resources in kernel sw
> > > > (b) datapath simulates in kernel
> > > >
> > > > 2. ifc + mlx5 vdpa dev:
> > > > (a) creates resource in hw
> > > > (b) data path is in hw
> > > >
> > > > 3. vduse:
> > > > (a) creates resources in userspace sw
> > > > (b) data path is in user space.
> > > > hence creates data path resources for user space.
> > > > So char device is created, removed as result of vdpa device creation.
> > > >
> > > > For example,
> > > > $ vdpa dev add parentdev vduse_mgmtdev type net name foo2
> > > >
> > > > Above command will create char device for user space.
> > > >
> > > > Similar command for ifc/mlx5 would have created similar channel
> > > > for rest of
> > > the config commands in hw.
> > > > vduse channel = char device, eventfd etc.
> > > > ifc/mlx5 hw channel = bar, irq, command interface etc Netdev sim
> > > > channel = sw direct calls
> > > >
> > > > Does it make sense?
> > >
> > > In my understanding, to make vdpa work, we need a backend (datapath
> > > resources) and a frontend (a vdpa device attached to a vdpa bus). In
> > > the above example, it looks like we use the command "vdpa dev add ..."
> > >  to create a backend, so do we need another command to create a
> frontend?
> > >
> > For block device there is certainly some backend to process the IOs.
> > Sometimes backend to be setup first, before its front end is exposed.
> 
> Yes, the backend need to be setup firstly, this is vendor device specific, not
> vdpa specific.
> 
> > "vdpa dev add" is the front end command who connects to the backend
> (implicitly) for network device.
> >
> > vhost->vdpa_block_device->backend_io_processor (usr,hw,kernel).
> >
> > And it needs a way to connect to backend when explicitly specified during
> creation time.
> > Something like,
> > $ vdpa dev add parentdev vdpa_vduse type block name foo3 handle
> <uuid>
> > In above example some vendor device specific unique handle is passed
> based on backend setup in hardware/user space.
> >
> 
> Yes, we can work like this. After we setup a backend through an anonymous
> inode(fd) from /dev/vduse, we can get a unique handle. Then use it to
> create a frontend which will connect to the specific backend.

I do not fully understand the inode part. But I assume this is some unique handle, say a uuid or something similar, that both sides (the backend and the vdpa device) understand.
It cannot be some kernel-internal handle exposed to user space.

> 
> > In below 3 examples, vdpa block simulator is connecting to backend block
> or file.
> >
> > $ vdpa dev add parentdev vdpa_blocksim type block name foo4 blockdev
> > /dev/zero
> >
> > $ vdpa dev add parentdev vdpa_blocksim type block name foo5 blockdev
> > /dev/sda2 size=100M offset=10M
> >
> > $ vdpa dev add parentdev vdpa_block filebackend_sim type block name
> > foo6 file /root/file_backend.txt
> >
> > Or may be backend connects to the created vdpa device is bound to the
> driver.
> > Can vduse attach to the created vdpa block device through the char device
> and establish the channel to receive IOs, and to setup the block config space?
> >
> 
> How to create the vdpa block device? If we use the command "vdpa dev
> add..", the command will hang there until a vduse process attaches to the
> vdpa block device.
I was suggesting that the vdpa device is created, but it doesn’t have a backend attached to it.
It is attached to the backend when the ioctl() side has done enough setup. This state is handled internally by the vduse driver.

But the above method of preparing the backend first looks more sane.

Regardless of which method is preferred, the vduse driver needs a state in which the vdpa bus device queues etc. are detached from user space.
This is needed because the user space process can terminate at any time, resulting in detaching a vdpa bus device that is in use by the vhost side.
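
To make the state I have in mind concrete, here is a minimal sketch in Python. It is only a model of the idea, not code from the VDUSE patches; the state names and the class are hypothetical.

```python
from enum import Enum, auto


class BackendState(Enum):
    """Hypothetical states; the names are illustrative only."""
    NO_BACKEND = auto()  # vdpa device exists, nothing attached yet
    ATTACHED = auto()    # user space process has completed its setup
    DETACHED = auto()    # user space process went away, device still exists


class VdpaDevModel:
    """Toy model of a vdpa device whose backend lives in user space."""

    def __init__(self, name):
        self.name = name
        self.state = BackendState.NO_BACKEND
        self.queues_enabled = False

    def attach_backend(self):
        # Valid for the first attach and for a later re-attach.
        if self.state is BackendState.ATTACHED:
            raise RuntimeError("backend already attached")
        self.state = BackendState.ATTACHED
        self.queues_enabled = True

    def backend_gone(self):
        # The user space process terminated: disable the queues before
        # recording the detach, so the vhost side never sees enabled
        # queues without a backend behind them.
        if self.state is not BackendState.ATTACHED:
            return
        self.queues_enabled = False
        self.state = BackendState.DETACHED
```

The point is only that the device object outlives the user space process, and that the driver tracks whether a backend is currently behind it.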


* Re: [External] Re: [PATCH 0/7] Introduce vdpa management tool
  2020-12-02 11:13                         ` Parav Pandit
@ 2020-12-02 13:18                           ` Yongji Xie
  0 siblings, 0 replies; 79+ messages in thread
From: Yongji Xie @ 2020-12-02 13:18 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Jason Wang, virtualization, Michael S. Tsirkin, Eli Cohen, netdev

On Wed, Dec 2, 2020 at 7:13 PM Parav Pandit <parav@nvidia.com> wrote:
>
>
>
> > From: Yongji Xie <xieyongji@bytedance.com>
> > Sent: Wednesday, December 2, 2020 2:52 PM
> >
> > On Wed, Dec 2, 2020 at 12:53 PM Parav Pandit <parav@nvidia.com> wrote:
> > >
> > >
> > >
> > > > From: Yongji Xie <xieyongji@bytedance.com>
> > > > Sent: Wednesday, December 2, 2020 9:00 AM
> > > >
> > > > On Tue, Dec 1, 2020 at 11:59 PM Parav Pandit <parav@nvidia.com> wrote:
> > > > >
> > > > >
> > > > >
> > > > > > From: Yongji Xie <xieyongji@bytedance.com>
> > > > > > Sent: Tuesday, December 1, 2020 7:49 PM
> > > > > >
> > > > > > On Tue, Dec 1, 2020 at 7:32 PM Parav Pandit <parav@nvidia.com>
> > wrote:
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > > From: Yongji Xie <xieyongji@bytedance.com>
> > > > > > > > Sent: Tuesday, December 1, 2020 3:26 PM
> > > > > > > >
> > > > > > > > On Tue, Dec 1, 2020 at 2:25 PM Jason Wang
> > > > > > > > <jasowang@redhat.com>
> > > > > > wrote:
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On 2020/11/30 下午3:07, Yongji Xie wrote:
> > > > > > > > > >>> Thanks for adding me, Jason!
> > > > > > > > > >>>
> > > > > > > > > >>> Now I'm working on a v2 patchset for VDUSE (vDPA
> > > > > > > > > >>> Device in
> > > > > > > > > >>> Userspace) [1]. This tool is very useful for the vduse device.
> > > > > > > > > >>> So I'm considering integrating this into my v2 patchset.
> > > > > > > > > >>> But there is one problem:
> > > > > > > > > >>>
> > > > > > > > > >>> In this tool, vdpa device config action and enable
> > > > > > > > > >>> action are combined into one netlink msg:
> > > > > > > > > >>> VDPA_CMD_DEV_NEW. But in
> > > > > > vduse
> > > > > > > > > >>> case, it needs to be splitted because a chardev should
> > > > > > > > > >>> be created and opened by a userspace process before we
> > > > > > > > > >>> enable the vdpa device (call vdpa_register_device()).
> > > > > > > > > >>>
> > > > > > > > > >>> So I'd like to know whether it's possible (or have
> > > > > > > > > >>> some
> > > > > > > > > >>> plans) to add two new netlink msgs something like:
> > > > > > > > > >>> VDPA_CMD_DEV_ENABLE
> > > > > > > > and
> > > > > > > > > >>> VDPA_CMD_DEV_DISABLE to make the config path more
> > flexible.
> > > > > > > > > >>>
> > > > > > > > > >> Actually, we've discussed such intermediate step in
> > > > > > > > > >> some early discussion. It looks to me VDUSE could be
> > > > > > > > > >> one of the users of
> > > > this.
> > > > > > > > > >>
> > > > > > > > > >> Or I wonder whether we can switch to use anonymous
> > > > > > > > > >> inode(fd) for VDUSE then fetching it via an
> > > > > > > > > >> VDUSE_GET_DEVICE_FD
> > > > ioctl?
> > > > > > > > > >>
> > > > > > > > > > Yes, we can. Actually the current implementation in
> > > > > > > > > > VDUSE is like this.  But seems like this is still a intermediate
> > step.
> > > > > > > > > > The fd should be binded to a name or something else
> > > > > > > > > > which need to be configured before.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > The name could be specified via the netlink. It looks to
> > > > > > > > > me the real issue is that until the device is connected
> > > > > > > > > with a userspace, it can't be used. So we also need to
> > > > > > > > > fail the enabling if it doesn't
> > > > > > opened.
> > > > > > > > >
> > > > > > > >
> > > > > > > > Yes, that's true. So you mean we can firstly try to fetch
> > > > > > > > the fd binded to a name/vduse_id via an VDUSE_GET_DEVICE_FD,
> > > > > > > > then use the name/vduse_id as a attribute to create vdpa
> > > > > > > > device? It looks fine to
> > > > me.
> > > > > > >
> > > > > > > I probably do not well understand. I tried reading patch [1]
> > > > > > > and few things
> > > > > > do not look correct as below.
> > > > > > > Creating the vdpa device on the bus device and destroying the
> > > > > > > device from
> > > > > > the workqueue seems unnecessary and racy.
> > > > > > >
> > > > > > > It seems vduse driver needs
> > > > > > > This is something should be done as part of the vdpa dev add
> > > > > > > command,
> > > > > > instead of connecting two sides separately and ensuring race
> > > > > > free access to it.
> > > > > > >
> > > > > > > So VDUSE_DEV_START and VDUSE_DEV_STOP should possibly be
> > avoided.
> > > > > > >
> > > > > >
> > > > > > Yes, we can avoid these two ioctls with the help of the management
> > tool.
> > > > > >
> > > > > > > $ vdpa dev add parentdev vduse_mgmtdev type net name foo2
> > > > > > >
> > > > > > > When above command is executed it creates necessary vdpa
> > > > > > > device
> > > > > > > foo2
> > > > > > on the bus.
> > > > > > > When user binds foo2 device with the vduse driver, in the
> > > > > > > probe(), it
> > > > > > creates respective char device to access it from user space.
> > > > > >
> > > > > I see. So vduse cannot work with any existing vdpa devices like
> > > > > ifc, mlx5 or
> > > > netdevsim.
> > > > > It has its own implementation similar to fuse with its own backend of
> > choice.
> > > > > More below.
> > > > >
> > > > > > But vduse driver is not a vdpa bus driver. It works like vdpasim
> > > > > > driver, but offloads the data plane and control plane to a user space
> > process.
> > > > >
> > > > > In that case to draw parallel lines,
> > > > >
> > > > > 1. netdevsim:
> > > > > (a) create resources in kernel sw
> > > > > (b) datapath simulates in kernel
> > > > >
> > > > > 2. ifc + mlx5 vdpa dev:
> > > > > (a) creates resource in hw
> > > > > (b) data path is in hw
> > > > >
> > > > > 3. vduse:
> > > > > (a) creates resources in userspace sw
> > > > > (b) data path is in user space.
> > > > > hence creates data path resources for user space.
> > > > > So char device is created, removed as result of vdpa device creation.
> > > > >
> > > > > For example,
> > > > > $ vdpa dev add parentdev vduse_mgmtdev type net name foo2
> > > > >
> > > > > Above command will create char device for user space.
> > > > >
> > > > > Similar command for ifc/mlx5 would have created similar channel
> > > > > for rest of
> > > > the config commands in hw.
> > > > > vduse channel = char device, eventfd etc.
> > > > > ifc/mlx5 hw channel = bar, irq, command interface etc Netdev sim
> > > > > channel = sw direct calls
> > > > >
> > > > > Does it make sense?
> > > >
> > > > In my understanding, to make vdpa work, we need a backend (datapath
> > > > resources) and a frontend (a vdpa device attached to a vdpa bus). In
> > > > the above example, it looks like we use the command "vdpa dev add ..."
> > > >  to create a backend, so do we need another command to create a
> > frontend?
> > > >
> > > For block device there is certainly some backend to process the IOs.
> > > Sometimes backend to be setup first, before its front end is exposed.
> >
> > Yes, the backend need to be setup firstly, this is vendor device specific, not
> > vdpa specific.
> >
> > > "vdpa dev add" is the front end command who connects to the backend
> > (implicitly) for network device.
> > >
> > > vhost->vdpa_block_device->backend_io_processor (usr,hw,kernel).
> > >
> > > And it needs a way to connect to backend when explicitly specified during
> > creation time.
> > > Something like,
> > > $ vdpa dev add parentdev vdpa_vduse type block name foo3 handle
> > <uuid>
> > > In above example some vendor device specific unique handle is passed
> > based on backend setup in hardware/user space.
> > >
> >
> > Yes, we can work like this. After we setup a backend through an anonymous
> > inode(fd) from /dev/vduse, we can get a unique handle. Then use it to
> > create a frontend which will connect to the specific backend.
>
> I do not fully understand the inode. But I assume this is some unique handle say uuid or something that both sides backend and vdpa device understand.
> It cannot be some kernel internal handle expose to user space.
>

Yes, the unique handle should be user-defined.

> >
> > > In below 3 examples, vdpa block simulator is connecting to backend block
> > or file.
> > >
> > > $ vdpa dev add parentdev vdpa_blocksim type block name foo4 blockdev
> > > /dev/zero
> > >
> > > $ vdpa dev add parentdev vdpa_blocksim type block name foo5 blockdev
> > > /dev/sda2 size=100M offset=10M
> > >
> > > $ vdpa dev add parentdev vdpa_block filebackend_sim type block name
> > > foo6 file /root/file_backend.txt
> > >
> > > Or may be backend connects to the created vdpa device is bound to the
> > driver.
> > > Can vduse attach to the created vdpa block device through the char device
> > and establish the channel to receive IOs, and to setup the block config space?
> > >
> >
> > How to create the vdpa block device? If we use the command "vdpa dev
> > add..", the command will hang there until a vduse process attaches to the
> > vdpa block device.
> I was suggesting that vdpa device is created, but it doesn’t have backend attached to it.
> It is attached to the backend when ioctl() side does enough setup. This state is handled internally the vduse driver.
>
> But the above method of preparing backend looks more sane.
>
> Regardless of which method is preferred, vduse driver must need a state to detach the vdpa bus device queues etc from the user space.
> This is needed because user space process can terminate anytime resulting in detaching dpa bus device in_use by the vhost side.

I think the vdpa device should only be detached by the command "vdpa
dev del...". The vduse driver can support reconnecting after the user
space process is terminated.
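
A rough sketch of the lifetime I mean, in Python. This is a toy model under my own assumptions, not the VDUSE implementation: the class and method names are made up, and "handle" is the user-defined handle from the "vdpa dev add ... handle <uuid>" proposal above.

```python
class VduseModel:
    """Toy model only: backends keyed by a user-defined handle."""

    def __init__(self):
        self.backends = {}   # handle -> True while the process is alive
        self.frontends = {}  # vdpa device name -> handle

    def backend_setup(self, handle):
        # The backend is prepared first and identified by a user-chosen handle.
        self.backends[handle] = True

    def dev_add(self, name, handle):
        # Models "vdpa dev add ... name <name> handle <handle>".
        if handle not in self.backends:
            raise KeyError(f"no backend with handle {handle!r}")
        if name in self.frontends:
            raise ValueError(f"device {name!r} already exists")
        self.frontends[name] = handle

    def backend_exit(self, handle):
        # Process terminated: mark it dead, keep the frontend on the bus.
        if handle in self.backends:
            self.backends[handle] = False

    def backend_reconnect(self, handle):
        # A restarted process re-attaches under the same handle.
        if handle not in self.backends:
            raise KeyError(f"no backend with handle {handle!r}")
        self.backends[handle] = True

    def dev_del(self, name):
        # Models "vdpa dev del <name>": the only path that removes the frontend.
        del self.frontends[name]

    def usable(self, name):
        # A frontend is usable only while its backend process is alive.
        return self.backends.get(self.frontends[name], False)
```

So a crash of the user space process only makes the device temporarily unusable; the device itself goes away only on an explicit delete.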

Thanks,
Yongji


* Re: [PATCH 0/7] Introduce vdpa management tool
  2020-11-27  3:53 ` Jason Wang
       [not found]   ` <CACycT3sYScObb9nN3g7L3cesjE7sCZWxZ5_5R1usGU9ePZEeqA@mail.gmail.com>
@ 2020-12-08 22:47   ` David Ahern
  2021-01-19  4:21     ` Parav Pandit
  1 sibling, 1 reply; 79+ messages in thread
From: David Ahern @ 2020-12-08 22:47 UTC (permalink / raw)
  To: Jason Wang, Parav Pandit, virtualization, Stephen Hemminger
  Cc: mst, elic, netdev, 谢永吉

On 11/26/20 8:53 PM, Jason Wang wrote:
> 1. Where does userspace vdpa tool reside which users can use?
> Ans: vdpa tool can possibly reside in iproute2 [1] as it enables user to
> create vdpa net devices.

iproute2 package is fine with us, but there are some expectations:
syntax, command options and documentation need to be consistent with
other iproute2 commands (this thread suggests it will be but just being
clear), and it needs to re-use code as much as possible (e.g., json
functions). If there is overlap with other tools (devlink, dcb, etc),
you should refactor into common code used by all. Petr Machata has done
this quite a bit for dcb and is a good example to follow.


* Re: [PATCH 0/7] Introduce vdpa management tool
  2020-11-17 19:51   ` Parav Pandit
@ 2020-12-16  9:13     ` Michael S. Tsirkin
       [not found]       ` <20201216080610.08541f44@kicinski-fedora-pc1c0hjn.dhcp.thefacebook.com>
  0 siblings, 1 reply; 79+ messages in thread
From: Michael S. Tsirkin @ 2020-12-16  9:13 UTC (permalink / raw)
  To: Parav Pandit; +Cc: Jakub Kicinski, virtualization, jasowang, Eli Cohen, netdev

On Tue, Nov 17, 2020 at 07:51:56PM +0000, Parav Pandit wrote:
> 
> 
> > From: Jakub Kicinski <kuba@kernel.org>
> > Sent: Tuesday, November 17, 2020 3:53 AM
> > 
> > On Thu, 12 Nov 2020 08:39:58 +0200 Parav Pandit wrote:
> > > FAQs:
> > > -----
> > > 1. Where does userspace vdpa tool reside which users can use?
> > > Ans: vdpa tool can possibly reside in iproute2 [1] as it enables user
> > > to create vdpa net devices.
> > >
> > > 2. Why not create and delete vdpa device using sysfs/configfs?
> > > Ans:
> > 
> > > 3. Why not use ioctl() interface?
> > 
> > Obviously I'm gonna ask you - why can't you use devlink?
> > 
> This was considered.
> However it seems that extending devlink for vdpa specific stats, devices, config sounds overloading devlink beyond its defined scope.

kuba what's your thinking here? Should I merge this as is?

> > > Next steps:
> > > -----------
> > > (a) Post this patchset and iproute2/vdpa inclusion, remaining two
> > > drivers will be coverted to support vdpa tool instead of creating
> > > unmanaged default device on driver load.
> > > (b) More net specific parameters such as mac, mtu will be added.
> > 
> > How does MAC and MTU belong in this new VDPA thing?
> MAC only make sense when user wants to run VF/SF Netdev and vdpa together with different mac address.
> Otherwise existing devlink well defined API to have one MAC per function is fine.
> Same for MTU, if queues of vdpa vs VF/SF Netdev queues wants have different MTU it make sense to add configure per vdpa device.



* Re: [PATCH 0/7] Introduce vdpa management tool
  2020-11-12  6:39 [PATCH 0/7] Introduce vdpa management tool Parav Pandit
                   ` (9 preceding siblings ...)
  2020-11-27  3:53 ` Jason Wang
@ 2020-12-16  9:16 ` Michael S. Tsirkin
  2021-01-04  3:31 ` [PATCH linux-next v2 " Parav Pandit
  2021-01-05 10:31 ` [PATCH linux-next v3 0/6] Introduce vdpa management tool Parav Pandit
  12 siblings, 0 replies; 79+ messages in thread
From: Michael S. Tsirkin @ 2020-12-16  9:16 UTC (permalink / raw)
  To: Parav Pandit; +Cc: virtualization, jasowang, elic, netdev

On Thu, Nov 12, 2020 at 08:39:58AM +0200, Parav Pandit wrote:
> This patchset covers user requirements for managing existing vdpa devices,
> using a tool and its internal design notes for kernel drivers.


I applied bugfix patches 1 and 2.
Others conflict with vdpa sim block support, pls rebase.


> Background and user requirements:
> ----------------------------------
> (1) Currently VDPA device is created by driver when driver is loaded.
> However, user should have a choice when to create or not create a vdpa device
> for the underlying parent device.
> 
> For example, mlx5 PCI VF and subfunction device supports multiple classes of
> device such netdev, vdpa, rdma. Howevever it is not required to always created
> vdpa device for such device.
> 
> (2) In another use case, a device may support creating one or multiple vdpa
> device of same or different class such as net and block.
> Creating vdpa devices at driver load time further limits this use case.
> 
> (3) A user should be able to monitor and query vdpa queue level or device level
> statistics for a given vdpa device.
> 
> (4) A user should be able to query what class of vdpa devices are supported
> by its parent device.
> 
> (5) A user should be able to view supported features and negotiated features
> of the vdpa device.
> 
> (6) A user should be able to create a vdpa device in vendor agnostic manner
> using single tool.
> 
> Hence, it is required to have a tool through which user can create one or more
> vdpa devices from a parent device which addresses above user requirements.
> 
> Example devices:
> ----------------
>  +-----------+ +-----------+ +---------+ +--------+ +-----------+ 
>  |vdpa dev 0 | |vdpa dev 1 | |rdma dev | |netdev  | |vdpa dev 3 |
>  |type=net   | |type=block | |mlx5_0   | |ens3f0  | |type=net   |
>  +----+------+ +-----+-----+ +----+----+ +-----+--+ +----+------+
>       |              |            |            |         |
>       |              |            |            |         |
>  +----+-----+        |       +----+----+       |    +----+----+
>  |  mlx5    +--------+       |mlx5     +-------+    |mlx5     |
>  |pci vf 2  |                |pci vf 4 |            |pci sf 8 |
>  |03:00:2   |                |03:00.4  |            |mlx5_sf.8|
>  +----+-----+                +----+----+            +----+----+
>       |                           |                      |
>       |                      +----+-----+                |
>       +----------------------+mlx5      +----------------+
>                              |pci pf 0  |
>                              |03:00.0   |
>                              +----------+
> 
> vdpa tool:
> ----------
> vdpa tool is a tool to create, delete vdpa devices from a parent device. It is a
> tool that enables user to query statistics, features and may be more attributes
> in future.
> 
> vdpa tool command draft:
> ------------------------
> (a) List parent devices which supports creating vdpa devices.
> It also shows which class types supported by this parent device.
> In below command example two parent devices support vdpa device creation.
> First is PCI VF whose bdf is 03.00:2.
> Second is PCI VF whose name is 03:00.4.
> Third is PCI SF whose name is mlx5_core.sf.8
> 
> $ vdpa parentdev list
> vdpasim
>   supported_classes
>     net
> pci/0000:03.00:3
>   supported_classes
>     net block
> pci/0000:03.00:4
>   supported_classes
>     net block
> auxiliary/mlx5_core.sf.8
>   supported_classes
>     net
> 
> (b) Now add a vdpa device of networking class and show the device.
> $ vdpa dev add parentdev pci/0000:03.00:2 type net name foo0
> $ vdpa dev show foo0
> foo0: parentdev pci/0000:03.00:2 type network parentdev vdpasim vendor_id 0 max_vqs 2 max_vq_size 256
> 
> (c) Show features of a vdpa device
> $ vdpa dev features show foo0
> supported
>   iommu platform
>   version 1
> 
> (d) Dump vdpa device statistics
> $ vdpa dev stats show foo0
> kickdoorbells 10
> wqes 100
> 
> (e) Now delete a vdpa device previously created.
> $ vdpa dev del foo0
> 
> vdpa tool support in this patchset:
> -----------------------------------
> vdpa tool is created to create, delete and query vdpa devices.
> examples:
> Show vdpa parent device that supports creating, deleting vdpa devices.
> 
> $ vdpa parentdev show
> vdpasim:
>   supported_classes
>     net
> 
> $ vdpa parentdev show -jp
> {
>     "show": {
>         "vdpasim": {
>             "supported_classes": [ "net" ]
>         }
>     }
> }
> 
> Create a vdpa device of type networking named as "foo2" from the parent device vdpasim:
> 
> $ vdpa dev add parentdev vdpasim type net name foo2
> 
> Show the newly created vdpa device by its name:
> $ vdpa dev show foo2
> foo2: type network parentdev vdpasim vendor_id 0 max_vqs 2 max_vq_size 256
> 
> $ vdpa dev show foo2 -jp
> {
>     "dev": {
>         "foo2": {
>             "type": "network",
>             "parentdev": "vdpasim",
>             "vendor_id": 0,
>             "max_vqs": 2,
>             "max_vq_size": 256
>         }
>     }
> }
> 
> Delete the vdpa device after its use:
> $ vdpa dev del foo2
> 
> vdpa tool support by kernel:
> ----------------------------
> vdpa tool user interface will be supported by existing vdpa kernel framework,
> i.e. drivers/vdpa/vdpa.c It services user command through a netlink interface.
> 
> Each parent device registers supported callback operations with vdpa subsystem
> through which vdpa device(s) can be managed.
> 
> FAQs:
> -----
> 1. Where does userspace vdpa tool reside which users can use?
> Ans: vdpa tool can possibly reside in iproute2 [1] as it enables user to
> create vdpa net devices.
> 
> 2. Why not create and delete vdpa device using sysfs/configfs?
> Ans:
> (a) A device creation may involve passing one or more attributes.
> Passing multiple attributes and returning error code and more verbose
> information for invalid attributes cannot be handled by sysfs/configfs.
> 
> (b) netlink framework is rich that enables user space and kernel driver to
> provide nested attributes.
> 
> (c) Exposing device specific file under sysfs without net namespace
> awareness exposes details to multiple containers. Instead exposing
> attributes via a netlink socket secures the communication channel with kernel.
> 
> (d) netlink socket interface enables to run syscaller kernel tests.
> 
> 3. Why not use ioctl() interface?
> Ans: ioctl() interface replicates the necessary plumbing which already
> exists through netlink socket.
> 
> 4. What happens when one or more user created vdpa devices exist for a
> parent PCI VF or SF and such parent device is removed?
> Ans: All user created vdpa devices are removed that belong to a parent.
> 
> [1] git://git.kernel.org/pub/scm/network/iproute2/iproute2-next.git
> 
> Next steps:
> -----------
> (a) Post this patchset and iproute2/vdpa inclusion, remaining two drivers
> will be coverted to support vdpa tool instead of creating unmanaged default
> device on driver load.
> (b) More net specific parameters such as mac, mtu will be added.
> (c) Features bits get and set interface will be added.
> 
> Parav Pandit (7):
>   vdpa: Add missing comment for virtqueue count
>   vdpa: Use simpler version of ida allocation
>   vdpa: Extend routine to accept vdpa device name
>   vdpa: Define vdpa parent device, ops and a netlink interface
>   vdpa: Enable a user to add and delete a vdpa device
>   vdpa: Enable user to query vdpa device info
>   vdpa/vdpa_sim: Enable user to create vdpasim net devices
> 
>  drivers/vdpa/Kconfig              |   1 +
>  drivers/vdpa/ifcvf/ifcvf_main.c   |   2 +-
>  drivers/vdpa/mlx5/net/mlx5_vnet.c |   2 +-
>  drivers/vdpa/vdpa.c               | 511 +++++++++++++++++++++++++++++-
>  drivers/vdpa/vdpa_sim/vdpa_sim.c  |  81 ++++-
>  include/linux/vdpa.h              |  46 ++-
>  include/uapi/linux/vdpa.h         |  41 +++
>  7 files changed, 660 insertions(+), 24 deletions(-)
>  create mode 100644 include/uapi/linux/vdpa.h
> 
> -- 
> 2.26.2
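
For anyone scripting against the tool, the JSON form of "vdpa dev show foo2 -jp" quoted above can be consumed directly. A small sketch in Python, assuming the output keeps the shape shown in the cover letter; the helper name is made up, and the "parentdev" key may be renamed in later revisions.

```python
import json

# Sample copied from the "vdpa dev show foo2 -jp" output in the cover letter.
SAMPLE = """
{
    "dev": {
        "foo2": {
            "type": "network",
            "parentdev": "vdpasim",
            "vendor_id": 0,
            "max_vqs": 2,
            "max_vq_size": 256
        }
    }
}
"""


def summarize(text):
    """Map each device name to (type, max_vqs, max_vq_size)."""
    devs = json.loads(text)["dev"]
    return {
        name: (attrs["type"], attrs["max_vqs"], attrs["max_vq_size"])
        for name, attrs in devs.items()
    }


if __name__ == "__main__":
    for name, info in summarize(SAMPLE).items():
        print(name, *info)
```

In practice the text would come from running the tool rather than from a string constant.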



* RE: [PATCH 0/7] Introduce vdpa management tool
       [not found]       ` <20201216080610.08541f44@kicinski-fedora-pc1c0hjn.dhcp.thefacebook.com>
@ 2020-12-16 16:54         ` Parav Pandit
  2020-12-16 19:57           ` Michael S. Tsirkin
  0 siblings, 1 reply; 79+ messages in thread
From: Parav Pandit @ 2020-12-16 16:54 UTC (permalink / raw)
  To: Jakub Kicinski, Michael S. Tsirkin
  Cc: virtualization, Eli Cohen, Jakub Kicinski, netdev

> From: Jakub Kicinski <kuba@kernel.org>
> Sent: Wednesday, December 16, 2020 9:36 PM
> 
> On Wed, 16 Dec 2020 04:13:51 -0500 Michael S. Tsirkin wrote:
> > > > > 3. Why not use ioctl() interface?
> > > >
> > > > Obviously I'm gonna ask you - why can't you use devlink?
> > > >
> > > This was considered.
> > > However it seems that extending devlink for vdpa specific stats, devices,
> config sounds overloading devlink beyond its defined scope.
> >
> > kuba what's your thinking here? Should I merge this as is?
> 
> No objections from me if people familiar with VDPA like it.

I was too occupied with the recent work on the subfunction series.
I wanted to change "parentdev" to "mgmtdev" to make it a little clearer that the vdpa management tool sees a vdpa mgmt device and operates on it.
What do you think? Should I revise this in v2, or is it too late?


* Re: [PATCH 0/7] Introduce vdpa management tool
  2020-12-16 16:54         ` Parav Pandit
@ 2020-12-16 19:57           ` Michael S. Tsirkin
  2020-12-17 12:13             ` Parav Pandit
  0 siblings, 1 reply; 79+ messages in thread
From: Michael S. Tsirkin @ 2020-12-16 19:57 UTC (permalink / raw)
  To: Parav Pandit; +Cc: Jakub Kicinski, virtualization, Eli Cohen, netdev

On Wed, Dec 16, 2020 at 04:54:37PM +0000, Parav Pandit wrote:
> > From: Jakub Kicinski <kuba@kernel.org>
> > Sent: Wednesday, December 16, 2020 9:36 PM
> > 
> > On Wed, 16 Dec 2020 04:13:51 -0500 Michael S. Tsirkin wrote:
> > > > > > 3. Why not use ioctl() interface?
> > > > >
> > > > > Obviously I'm gonna ask you - why can't you use devlink?
> > > > >
> > > > This was considered.
> > > > However it seems that extending devlink for vdpa specific stats, devices,
> > config sounds overloading devlink beyond its defined scope.
> > >
> > > kuba what's your thinking here? Should I merge this as is?
> > 
> > No objections from me if people familiar with VDPA like it.
> 
> I was too occupied with the recent work on subfunction series.
> I wanted to change the "parentdev" to "mgmtdev" to make it little more clear for vdpa management tool to see vdpa mgmt device and operate on it.
> What do you think? Should I revise v2 or its late?

I need a rebase anyway, so sure.

-- 
MST



* RE: [PATCH 0/7] Introduce vdpa management tool
  2020-12-16 19:57           ` Michael S. Tsirkin
@ 2020-12-17 12:13             ` Parav Pandit
  0 siblings, 0 replies; 79+ messages in thread
From: Parav Pandit @ 2020-12-17 12:13 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: Jakub Kicinski, virtualization, Eli Cohen, netdev



> From: Michael S. Tsirkin <mst@redhat.com>
> Sent: Thursday, December 17, 2020 1:28 AM
> 
> On Wed, Dec 16, 2020 at 04:54:37PM +0000, Parav Pandit wrote:
> > > From: Jakub Kicinski <kuba@kernel.org>
> > > Sent: Wednesday, December 16, 2020 9:36 PM
> > >
> > > On Wed, 16 Dec 2020 04:13:51 -0500 Michael S. Tsirkin wrote:
> > > > > > > 3. Why not use ioctl() interface?
> > > > > >
> > > > > > Obviously I'm gonna ask you - why can't you use devlink?
> > > > > >
> > > > > This was considered.
> > > > > However it seems that extending devlink for vdpa specific stats,
> devices,
> > > config sounds overloading devlink beyond its defined scope.
> > > >
> > > > kuba what's your thinking here? Should I merge this as is?
> > >
> > > No objections from me if people familiar with VDPA like it.
> >
> > I was too occupied with the recent work on subfunction series.
> > I wanted to change the "parentdev" to "mgmtdev" to make it little more
> clear for vdpa management tool to see vdpa mgmt device and operate on it.
> > What do you think? Should I revise v2 or its late?
> 
> I need a rebase anyway, so sure.
ok. Thanks.



* [PATCH linux-next v2 0/7] Introduce vdpa management tool
  2020-11-12  6:39 [PATCH 0/7] Introduce vdpa management tool Parav Pandit
                   ` (10 preceding siblings ...)
  2020-12-16  9:16 ` Michael S. Tsirkin
@ 2021-01-04  3:31 ` Parav Pandit
  2021-01-04  3:31   ` [PATCH linux-next v2 1/7] vdpa_sim_net: Make mac address array static Parav Pandit
                     ` (6 more replies)
  2021-01-05 10:31 ` [PATCH linux-next v3 0/6] Introduce vdpa management tool Parav Pandit
  12 siblings, 7 replies; 79+ messages in thread
From: Parav Pandit @ 2021-01-04  3:31 UTC (permalink / raw)
  To: virtualization; +Cc: mst, jasowang, parav, elic, netdev

This patchset covers user requirements for managing existing vdpa devices,
using a tool and its internal design notes for kernel drivers.

Background and user requirements:
----------------------------------
(1) Currently VDPA device is created by driver when driver is loaded.
However, user should have a choice when to create or not create a vdpa device
for the underlying management device.

For example, mlx5 PCI VF and subfunction devices support multiple classes of
device such as netdev, vdpa, rdma. However, it is not required to always create
a vdpa device for such a device.

(2) In another use case, a device may support creating one or multiple vdpa
devices of the same or different classes such as net and block.
Creating vdpa devices at driver load time further limits this use case.

(3) A user should be able to monitor and query vdpa queue level or device level
statistics for a given vdpa device.

(4) A user should be able to query what class of vdpa devices are supported
by its management device.

(5) A user should be able to view supported features and negotiated features
of the vdpa device.

(6) A user should be able to create a vdpa device in vendor agnostic manner
using single tool.

Hence, it is required to have a tool through which user can create one or more
vdpa devices from a management device which addresses above user requirements.

Example devices:
----------------
 +-----------+ +-----------+ +---------+ +--------+ +-----------+ 
 |vdpa dev 0 | |vdpa dev 1 | |rdma dev | |netdev  | |vdpa dev 3 |
 |type=net   | |type=net   | |mlx5_0   | |ens3f0  | |type=net   |
 +----+------+ +-----+-----+ +----+----+ +-----+--+ +----+------+
      |              |            |            |         |
      |              |            |            |         |
 +----+-----+        |       +----+----+       |    +----+----+
 |  mlx5    +--------+       |mlx5     +-------+    |mlx5     |
 |pci vf 2  |                |pci vf 4 |            |pci sf 8 |
 |03:00:2   |                |03:00.4  |            |mlx5_sf.8|
 +----+-----+                +----+----+            +----+----+
      |                           |                      |
      |                      +----+-----+                |
      +----------------------+mlx5      +----------------+
                             |pci pf 0  |
                             |03:00.0   |
                             +----------+

vdpa tool:
----------
The vdpa tool creates and deletes vdpa devices on a management device.
It also enables a user to query statistics and features, and possibly more
attributes in the future.

vdpa tool command draft:
------------------------
(a) List management devices which support creating vdpa devices.
It also shows which class types each management device supports.
In the command example below, four management devices support vdpa device creation.
First is PCI VF whose bdf is 03.00:2.
Second is PCI VF whose name is 03:00.4.
Third is PCI SF whose name is mlx5_core.sf.8

$ vdpa mgmtdev list
vdpasim_net
  supported_classes
    net
pci/0000:03.00:0
  supported_classes
    net
pci/0000:03.00:4
  supported_classes
    net
auxiliary/mlx5_core.sf.8
  supported_classes
    net

(b) Now add a vdpa device of networking class and show the device.
$ vdpa dev add mgmtdev pci/0000:03.00:0 name foo0

$ vdpa dev show foo0
foo0: mgmtdev pci/0000:03.00:2 type network vendor_id 0 max_vqs 2 max_vq_size 256

(c) Show features of a vdpa device
$ vdpa dev features show foo0
supported
  iommu platform
  version 1

(d) Dump vdpa device statistics
$ vdpa dev stats show foo0
kickdoorbells 10
wqes 100

(e) Now delete a vdpa device previously created.
$ vdpa dev del foo0

vdpa tool support in this patchset:
-----------------------------------
The vdpa tool can create, delete and query vdpa devices.
Examples:
Show the vdpa management device that supports creating and deleting vdpa devices.

$ vdpa mgmtdev show
vdpasim_net:
  supported_classes
    net

$ vdpa mgmtdev show -jp
{
    "show": {
        "vdpasim_net": {
            "supported_classes": [ "net" ]
        }
    }
}

Create a vdpa device of type networking named "foo2" from the
management device vdpasim_net:

$ vdpa dev add mgmtdev vdpasim_net name foo2

Show the newly created vdpa device by its name:
$ vdpa dev show foo2
foo2: type network mgmtdev vdpasim_net vendor_id 0 max_vqs 2 max_vq_size 256

$ vdpa dev show foo2 -jp
{
    "dev": {
        "foo2": {
            "type": "network",
            "mgmtdev": "vdpasim_net",
            "vendor_id": 0,
            "max_vqs": 2,
            "max_vq_size": 256
        }
    }
}

Delete the vdpa device after its use:
$ vdpa dev del foo2

vdpa tool support by kernel:
----------------------------
The vdpa tool user interface will be supported by the existing vdpa kernel
framework, i.e. drivers/vdpa/vdpa.c. It services user commands through a
netlink interface.

Each management device registers supported callback operations with vdpa
subsystem through which vdpa device(s) can be managed.
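
The registration step can be pictured with a small userspace model. This is
only a sketch: every model_* and sim_* name below is hypothetical, and a plain
linked list stands in for the kernel list and mutex. It mirrors one rule of
this series, namely that a management device without both dev_add and dev_del
callbacks is rejected with -EINVAL.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

struct model_mgmt_dev;

/* Callbacks a management device must supply (hypothetical userspace model). */
struct model_mgmtdev_ops {
	int (*dev_add)(struct model_mgmt_dev *mdev, const char *name);
	void (*dev_del)(struct model_mgmt_dev *mdev, const char *name);
};

struct model_mgmt_dev {
	const char *name;
	const struct model_mgmtdev_ops *ops;
	struct model_mgmt_dev *next;
};

static struct model_mgmt_dev *model_mdev_head;

/* Reject a management device that lacks a name or either callback. */
static int model_mgmtdev_register(struct model_mgmt_dev *mdev)
{
	if (!mdev || !mdev->name || !mdev->ops ||
	    !mdev->ops->dev_add || !mdev->ops->dev_del)
		return -EINVAL;
	mdev->next = model_mdev_head;
	model_mdev_head = mdev;
	return 0;
}

/* Unlink the management device if it is on the list; otherwise do nothing. */
static void model_mgmtdev_unregister(struct model_mgmt_dev *mdev)
{
	struct model_mgmt_dev **pp;

	assert(mdev != NULL);
	for (pp = &model_mdev_head; *pp; pp = &(*pp)->next) {
		if (*pp == mdev) {
			*pp = mdev->next;
			mdev->next = NULL;
			return;
		}
	}
}

/* Look up a registered management device by name; NULL if absent. */
static struct model_mgmt_dev *model_mgmtdev_find(const char *name)
{
	struct model_mgmt_dev *mdev;

	for (mdev = model_mdev_head; mdev; mdev = mdev->next)
		if (strcmp(mdev->name, name) == 0)
			return mdev;
	return NULL;
}

/* A toy "simulator" management device that only counts its devices. */
static int sim_dev_count;

static int sim_dev_add(struct model_mgmt_dev *mdev, const char *name)
{
	(void)mdev;
	(void)name;
	sim_dev_count++;
	return 0;
}

static void sim_dev_del(struct model_mgmt_dev *mdev, const char *name)
{
	(void)mdev;
	(void)name;
	sim_dev_count--;
}

static const struct model_mgmtdev_ops sim_ops = {
	.dev_add = sim_dev_add,
	.dev_del = sim_dev_del,
};

static struct model_mgmt_dev sim_mdev = {
	.name = "vdpasim_net",
	.ops = &sim_ops,
};
```

In this model a "vdpa dev add" request would look up the management device
with model_mgmtdev_find() and then invoke its dev_add callback with the
user-chosen name.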

Patch summary:
--------------
Patch-1 Makes mac address array static
Patch-2 adds module parameter to avoid breaking backward compatibility
        for default device
Patch-3 Extends API to accept vdpa device name
Patch-4 Defines management device interface
Patch-5 Extends netlink interface to add, delete vdpa devices
Patch-6 Extends netlink interface to query vdpa device attributes
Patch-7 Extends vdpa_sim_net driver to add/delete simulated vdpa devices

FAQs:
-----
1. Where does userspace vdpa tool reside which users can use?
Ans: vdpa tool can possibly reside in iproute2 [1] as it enables user to
handle vdpa network devices.

2. Why not create and delete vdpa device using sysfs/configfs?
Ans:
(a) A device creation may involve passing one or more attributes.
Passing multiple attributes and returning error code and more verbose
information for invalid attributes cannot be handled by sysfs/configfs.

(b) The netlink framework is rich: it enables user space and the kernel driver
to provide nested attributes.

(c) Exposing device specific file under sysfs without net namespace
awareness exposes details to multiple containers. Instead exposing
attributes via a netlink socket secures the communication channel with kernel.

(d) The netlink socket interface makes it possible to run syzkaller kernel tests.

3. Why not use ioctl() interface?
Ans: ioctl() interface replicates the necessary plumbing which already
exists through netlink socket.

4. What happens when one or more user created vdpa devices exist for a
management PCI VF or SF and such management device is removed?
Ans: All user created vdpa devices that belong to that management device
are removed.

[1] git://git.kernel.org/pub/scm/network/iproute2/iproute2-next.git

Next steps:
-----------
(a) After this patchset and the iproute2/vdpa tool are merged, the remaining
two drivers will be converted to support the vdpa tool instead of creating an
unmanaged default device on driver load.
(b) More net specific parameters such as mac, mtu will be added.
(c) Feature bits get and set interface will be added.

Parav Pandit (7):
  vdpa_sim_net: Make mac address array static
  vdpa_sim_net: Add module param to disable default vdpa net device
  vdpa: Extend routine to accept vdpa device name
  vdpa: Define vdpa mgmt device, ops and a netlink interface
  vdpa: Enable a user to add and delete a vdpa device
  vdpa: Enable user to query vdpa device info
  vdpa_sim_net: Add support for user supported devices

 drivers/vdpa/Kconfig                 |   1 +
 drivers/vdpa/ifcvf/ifcvf_main.c      |   2 +-
 drivers/vdpa/mlx5/net/mlx5_vnet.c    |   2 +-
 drivers/vdpa/vdpa.c                  | 503 ++++++++++++++++++++++++++-
 drivers/vdpa/vdpa_sim/vdpa_sim.c     |   3 +-
 drivers/vdpa/vdpa_sim/vdpa_sim.h     |   2 +
 drivers/vdpa/vdpa_sim/vdpa_sim_net.c | 135 ++++++-
 include/linux/vdpa.h                 |  44 ++-
 include/uapi/linux/vdpa.h            |  40 +++
 9 files changed, 707 insertions(+), 25 deletions(-)
 create mode 100644 include/uapi/linux/vdpa.h

-- 
2.26.2


^ permalink raw reply	[flat|nested] 79+ messages in thread

* [PATCH linux-next v2 1/7] vdpa_sim_net: Make mac address array static
  2021-01-04  3:31 ` [PATCH linux-next v2 " Parav Pandit
@ 2021-01-04  3:31   ` Parav Pandit
  2021-01-04  7:00     ` Jason Wang
  2021-01-04  3:31   ` [PATCH linux-next v2 2/7] vdpa_sim_net: Add module param to disable default vdpa net device Parav Pandit
                     ` (5 subsequent siblings)
  6 siblings, 1 reply; 79+ messages in thread
From: Parav Pandit @ 2021-01-04  3:31 UTC (permalink / raw)
  To: virtualization; +Cc: mst, jasowang, parav, elic, netdev

MAC address array is used only in vdpa_sim_net.c.
Hence, keep it static.

Signed-off-by: Parav Pandit <parav@nvidia.com>
---
Changelog:
v1->v2:
 - new patch
---
 drivers/vdpa/vdpa_sim/vdpa_sim_net.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim_net.c b/drivers/vdpa/vdpa_sim/vdpa_sim_net.c
index c10b6981fdab..f0482427186b 100644
--- a/drivers/vdpa/vdpa_sim/vdpa_sim_net.c
+++ b/drivers/vdpa/vdpa_sim/vdpa_sim_net.c
@@ -33,7 +33,7 @@ static char *macaddr;
 module_param(macaddr, charp, 0);
 MODULE_PARM_DESC(macaddr, "Ethernet MAC address");
 
-u8 macaddr_buf[ETH_ALEN];
+static u8 macaddr_buf[ETH_ALEN];
 
 static struct vdpasim *vdpasim_net_dev;
 
-- 
2.26.2


^ permalink raw reply	[flat|nested] 79+ messages in thread

* [PATCH linux-next v2 2/7] vdpa_sim_net: Add module param to disable default vdpa net device
  2021-01-04  3:31 ` [PATCH linux-next v2 " Parav Pandit
  2021-01-04  3:31   ` [PATCH linux-next v2 1/7] vdpa_sim_net: Make mac address array static Parav Pandit
@ 2021-01-04  3:31   ` Parav Pandit
  2021-01-04  3:31   ` [PATCH linux-next v2 3/7] vdpa: Extend routine to accept vdpa device name Parav Pandit
                     ` (4 subsequent siblings)
  6 siblings, 0 replies; 79+ messages in thread
From: Parav Pandit @ 2021-01-04  3:31 UTC (permalink / raw)
  To: virtualization; +Cc: mst, jasowang, parav, elic, netdev

To support creating multiple vdpa devices and to allow user to
manage them, add a knob to disable a default vdpa net device.

Signed-off-by: Parav Pandit <parav@nvidia.com>
---
Changelog:
v1->v2:
 - new patch
---
 drivers/vdpa/vdpa_sim/vdpa_sim_net.c | 41 ++++++++++++++++++++--------
 1 file changed, 29 insertions(+), 12 deletions(-)

diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim_net.c b/drivers/vdpa/vdpa_sim/vdpa_sim_net.c
index f0482427186b..34155831538c 100644
--- a/drivers/vdpa/vdpa_sim/vdpa_sim_net.c
+++ b/drivers/vdpa/vdpa_sim/vdpa_sim_net.c
@@ -33,6 +33,10 @@ static char *macaddr;
 module_param(macaddr, charp, 0);
 MODULE_PARM_DESC(macaddr, "Ethernet MAC address");
 
+static bool default_device = true;
+module_param(default_device, bool, 0);
+MODULE_PARM_DESC(default_device, "Support single default VDPA device");
+
 static u8 macaddr_buf[ETH_ALEN];
 
 static struct vdpasim *vdpasim_net_dev;
@@ -120,21 +124,11 @@ static void vdpasim_net_get_config(struct vdpasim *vdpasim, void *config)
 	memcpy(net_config->mac, macaddr_buf, ETH_ALEN);
 }
 
-static int __init vdpasim_net_init(void)
+static int vdpasim_net_default_dev_register(void)
 {
 	struct vdpasim_dev_attr dev_attr = {};
 	int ret;
 
-	if (macaddr) {
-		mac_pton(macaddr, macaddr_buf);
-		if (!is_valid_ether_addr(macaddr_buf)) {
-			ret = -EADDRNOTAVAIL;
-			goto out;
-		}
-	} else {
-		eth_random_addr(macaddr_buf);
-	}
-
 	dev_attr.id = VIRTIO_ID_NET;
 	dev_attr.supported_features = VDPASIM_NET_FEATURES;
 	dev_attr.nvqs = VDPASIM_NET_VQ_NUM;
@@ -161,13 +155,36 @@ static int __init vdpasim_net_init(void)
 	return ret;
 }
 
-static void __exit vdpasim_net_exit(void)
+static void vdpasim_net_default_dev_unregister(void)
 {
 	struct vdpa_device *vdpa = &vdpasim_net_dev->vdpa;
 
 	vdpa_unregister_device(vdpa);
 }
 
+static int __init vdpasim_net_init(void)
+{
+	int ret = 0;
+
+	if (macaddr) {
+		mac_pton(macaddr, macaddr_buf);
+		if (!is_valid_ether_addr(macaddr_buf))
+			return -EADDRNOTAVAIL;
+	} else {
+		eth_random_addr(macaddr_buf);
+	}
+
+	if (default_device)
+		ret = vdpasim_net_default_dev_register();
+	return ret;
+}
+
+static void __exit vdpasim_net_exit(void)
+{
+	if (default_device)
+		vdpasim_net_default_dev_unregister();
+}
+
 module_init(vdpasim_net_init);
 module_exit(vdpasim_net_exit);
 
-- 
2.26.2


^ permalink raw reply	[flat|nested] 79+ messages in thread

* [PATCH linux-next v2 3/7] vdpa: Extend routine to accept vdpa device name
  2021-01-04  3:31 ` [PATCH linux-next v2 " Parav Pandit
  2021-01-04  3:31   ` [PATCH linux-next v2 1/7] vdpa_sim_net: Make mac address array static Parav Pandit
  2021-01-04  3:31   ` [PATCH linux-next v2 2/7] vdpa_sim_net: Add module param to disable default vdpa net device Parav Pandit
@ 2021-01-04  3:31   ` Parav Pandit
  2021-01-04  3:31   ` [PATCH linux-next v2 4/7] vdpa: Define vdpa mgmt device, ops and a netlink interface Parav Pandit
                     ` (3 subsequent siblings)
  6 siblings, 0 replies; 79+ messages in thread
From: Parav Pandit @ 2021-01-04  3:31 UTC (permalink / raw)
  To: virtualization; +Cc: mst, jasowang, parav, elic, netdev

In a subsequent patch, when user initiated command creates a vdpa device,
the user chooses the name of the vdpa device.
To support it, extend the device allocation API to consider this name
specified by the caller driver.
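
As a rough userspace illustration of the naming rule (not the kernel code in
this patch), the sketch below uses the caller-supplied name when there is one,
falls back to "vdpa<index>" otherwise, and rejects a name that is already
registered with -EEXIST. All model_* identifiers and the fixed-size table are
hypothetical; in the kernel the index comes from an IDA and the lookup goes
through the vdpa bus.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

#define MODEL_MAX_DEVS 8
#define MODEL_NAME_LEN 32

/* Hypothetical stand-in for the set of devices on the vdpa bus. */
static char model_names[MODEL_MAX_DEVS][MODEL_NAME_LEN];
static unsigned int model_ndevs;
static unsigned int model_next_index;

/* Use the caller-supplied name if given, else fall back to "vdpa<index>". */
static void model_set_name(char *out, size_t len, const char *name,
			   unsigned int index)
{
	assert(out != NULL && len > 0);
	if (name)
		snprintf(out, len, "%s", name);
	else
		snprintf(out, len, "vdpa%u", index);
}

/* Register a device; a name that is already in use yields -EEXIST. */
static int model_register(const char *name)
{
	char candidate[MODEL_NAME_LEN];
	unsigned int i;

	model_set_name(candidate, sizeof(candidate), name, model_next_index);
	for (i = 0; i < model_ndevs; i++)
		if (strcmp(model_names[i], candidate) == 0)
			return -EEXIST;
	if (model_ndevs == MODEL_MAX_DEVS)
		return -ENOSPC;
	strcpy(model_names[model_ndevs++], candidate);
	model_next_index++;
	return 0;
}
```

Drivers that keep passing NULL, as ifcvf, mlx5 and vdpa_sim do in this patch,
continue to get the index-based default name.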

Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Eli Cohen <elic@nvidia.com>
Acked-by: Jason Wang <jasowang@redhat.com>
---
Changelog:
v1->v2:
 - rebased
---
 drivers/vdpa/ifcvf/ifcvf_main.c   |  2 +-
 drivers/vdpa/mlx5/net/mlx5_vnet.c |  2 +-
 drivers/vdpa/vdpa.c               | 36 +++++++++++++++++++++++++++----
 drivers/vdpa/vdpa_sim/vdpa_sim.c  |  2 +-
 include/linux/vdpa.h              |  7 +++---
 5 files changed, 38 insertions(+), 11 deletions(-)

diff --git a/drivers/vdpa/ifcvf/ifcvf_main.c b/drivers/vdpa/ifcvf/ifcvf_main.c
index fa1af301cf55..7c8bbfcf6c3e 100644
--- a/drivers/vdpa/ifcvf/ifcvf_main.c
+++ b/drivers/vdpa/ifcvf/ifcvf_main.c
@@ -432,7 +432,7 @@ static int ifcvf_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 
 	adapter = vdpa_alloc_device(struct ifcvf_adapter, vdpa,
 				    dev, &ifc_vdpa_ops,
-				    IFCVF_MAX_QUEUE_PAIRS * 2);
+				    IFCVF_MAX_QUEUE_PAIRS * 2, NULL);
 	if (adapter == NULL) {
 		IFCVF_ERR(pdev, "Failed to allocate vDPA structure");
 		return -ENOMEM;
diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c
index 81b932f72e10..5920290521cf 100644
--- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
+++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
@@ -1946,7 +1946,7 @@ void *mlx5_vdpa_add_dev(struct mlx5_core_dev *mdev)
 	max_vqs = min_t(u32, max_vqs, MLX5_MAX_SUPPORTED_VQS);
 
 	ndev = vdpa_alloc_device(struct mlx5_vdpa_net, mvdev.vdev, mdev->device, &mlx5_vdpa_ops,
-				 2 * mlx5_vdpa_max_qps(max_vqs));
+				 2 * mlx5_vdpa_max_qps(max_vqs), NULL);
 	if (IS_ERR(ndev))
 		return ndev;
 
diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c
index c0825650c055..7414bbd9057c 100644
--- a/drivers/vdpa/vdpa.c
+++ b/drivers/vdpa/vdpa.c
@@ -12,6 +12,8 @@
 #include <linux/slab.h>
 #include <linux/vdpa.h>
 
+/* A global mutex that protects vdpa management device and device level operations. */
+static DEFINE_MUTEX(vdpa_dev_mutex);
 static DEFINE_IDA(vdpa_index_ida);
 
 static int vdpa_dev_probe(struct device *d)
@@ -63,6 +65,7 @@ static void vdpa_release_dev(struct device *d)
  * @config: the bus operations that is supported by this device
  * @nvqs: number of virtqueues supported by this device
  * @size: size of the parent structure that contains private data
+ * @name: name of the vdpa device; optional.
  *
  * Driver should use vdpa_alloc_device() wrapper macro instead of
  * using this directly.
@@ -72,8 +75,7 @@ static void vdpa_release_dev(struct device *d)
  */
 struct vdpa_device *__vdpa_alloc_device(struct device *parent,
 					const struct vdpa_config_ops *config,
-					int nvqs,
-					size_t size)
+					int nvqs, size_t size, const char *name)
 {
 	struct vdpa_device *vdev;
 	int err = -EINVAL;
@@ -101,7 +103,10 @@ struct vdpa_device *__vdpa_alloc_device(struct device *parent,
 	vdev->features_valid = false;
 	vdev->nvqs = nvqs;
 
-	err = dev_set_name(&vdev->dev, "vdpa%u", vdev->index);
+	if (name)
+		err = dev_set_name(&vdev->dev, "%s", name);
+	else
+		err = dev_set_name(&vdev->dev, "vdpa%u", vdev->index);
 	if (err)
 		goto err_name;
 
@@ -118,6 +123,13 @@ struct vdpa_device *__vdpa_alloc_device(struct device *parent,
 }
 EXPORT_SYMBOL_GPL(__vdpa_alloc_device);
 
+static int vdpa_name_match(struct device *dev, const void *data)
+{
+	struct vdpa_device *vdev = container_of(dev, struct vdpa_device, dev);
+
+	return (strcmp(dev_name(&vdev->dev), data) == 0);
+}
+
 /**
  * vdpa_register_device - register a vDPA device
  * Callers must have a succeed call of vdpa_alloc_device() before.
@@ -127,7 +139,21 @@ EXPORT_SYMBOL_GPL(__vdpa_alloc_device);
  */
 int vdpa_register_device(struct vdpa_device *vdev)
 {
-	return device_add(&vdev->dev);
+	struct device *dev;
+	int err;
+
+	mutex_lock(&vdpa_dev_mutex);
+	dev = bus_find_device(&vdpa_bus, NULL, dev_name(&vdev->dev), vdpa_name_match);
+	if (dev) {
+		put_device(dev);
+		err = -EEXIST;
+		goto name_err;
+	}
+
+	err = device_add(&vdev->dev);
+name_err:
+	mutex_unlock(&vdpa_dev_mutex);
+	return err;
 }
 EXPORT_SYMBOL_GPL(vdpa_register_device);
 
@@ -137,7 +163,9 @@ EXPORT_SYMBOL_GPL(vdpa_register_device);
  */
 void vdpa_unregister_device(struct vdpa_device *vdev)
 {
+	mutex_lock(&vdpa_dev_mutex);
 	device_unregister(&vdev->dev);
+	mutex_unlock(&vdpa_dev_mutex);
 }
 EXPORT_SYMBOL_GPL(vdpa_unregister_device);
 
diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.c b/drivers/vdpa/vdpa_sim/vdpa_sim.c
index b3fcc67bfdf0..db1636a99ba4 100644
--- a/drivers/vdpa/vdpa_sim/vdpa_sim.c
+++ b/drivers/vdpa/vdpa_sim/vdpa_sim.c
@@ -235,7 +235,7 @@ struct vdpasim *vdpasim_create(struct vdpasim_dev_attr *dev_attr)
 		ops = &vdpasim_config_ops;
 
 	vdpasim = vdpa_alloc_device(struct vdpasim, vdpa, NULL, ops,
-				    dev_attr->nvqs);
+				    dev_attr->nvqs, NULL);
 	if (!vdpasim)
 		goto err_alloc;
 
diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h
index 0fefeb976877..5700baa22356 100644
--- a/include/linux/vdpa.h
+++ b/include/linux/vdpa.h
@@ -245,15 +245,14 @@ struct vdpa_config_ops {
 
 struct vdpa_device *__vdpa_alloc_device(struct device *parent,
 					const struct vdpa_config_ops *config,
-					int nvqs,
-					size_t size);
+					int nvqs, size_t size, const char *name);
 
-#define vdpa_alloc_device(dev_struct, member, parent, config, nvqs)   \
+#define vdpa_alloc_device(dev_struct, member, parent, config, nvqs, name)   \
 			  container_of(__vdpa_alloc_device( \
 				       parent, config, nvqs, \
 				       sizeof(dev_struct) + \
 				       BUILD_BUG_ON_ZERO(offsetof( \
-				       dev_struct, member))), \
+				       dev_struct, member)), name), \
 				       dev_struct, member)
 
 int vdpa_register_device(struct vdpa_device *vdev);
-- 
2.26.2


^ permalink raw reply	[flat|nested] 79+ messages in thread

* [PATCH linux-next v2 4/7] vdpa: Define vdpa mgmt device, ops and a netlink interface
  2021-01-04  3:31 ` [PATCH linux-next v2 " Parav Pandit
                     ` (2 preceding siblings ...)
  2021-01-04  3:31   ` [PATCH linux-next v2 3/7] vdpa: Extend routine to accept vdpa device name Parav Pandit
@ 2021-01-04  3:31   ` Parav Pandit
  2021-01-04  7:03     ` Jason Wang
  2021-01-04  3:31   ` [PATCH linux-next v2 5/7] vdpa: Enable a user to add and delete a vdpa device Parav Pandit
                     ` (2 subsequent siblings)
  6 siblings, 1 reply; 79+ messages in thread
From: Parav Pandit @ 2021-01-04  3:31 UTC (permalink / raw)
  To: virtualization; +Cc: mst, jasowang, parav, elic, netdev

To add one or more VDPA devices, define a management device which
allows adding or removing a vdpa device. A management device defines
a set of callbacks to manage vdpa devices.

To begin with, it defines add and remove callbacks through which a user
defined vdpa device can be added or removed.

A unique management device is identified by its unique handle identified
by management device name and optionally the bus name.
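
A minimal userspace sketch of such a handle follows. It assumes, based only on
the example output in the cover letter, that user space writes a handle as
either "bus/dev" or a bare "dev" for a bus-less simulator; the model_* names
are hypothetical and are not part of the vdpa tool or of this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace stand-in for a management device handle. */
struct model_handle {
	const char *bus;	/* NULL when the device has no bus (simulator) */
	const char *dev;	/* device name, always set */
};

/*
 * Split "bus/dev" or plain "dev" into its two parts. An empty bus string
 * on return means no bus name was given. Returns 0 on success, -1 on error.
 */
static int model_handle_parse(const char *str, char *bus, size_t bus_len,
			      char *dev, size_t dev_len)
{
	const char *slash = strchr(str, '/');
	const char *devpart = slash ? slash + 1 : str;
	size_t n = slash ? (size_t)(slash - str) : 0;

	if (bus_len == 0 || n >= bus_len)
		return -1;
	if ((slash && n == 0) || devpart[0] == '\0' ||
	    strlen(devpart) >= dev_len)
		return -1;
	memcpy(bus, str, n);
	bus[n] = '\0';
	strcpy(dev, devpart);
	return 0;
}

/* Match a handle against an optional bus name and a mandatory device name. */
static bool model_handle_match(const struct model_handle *h,
			       const char *busname, const char *devname)
{
	assert(h && h->dev && devname);
	/* A bus name must be given if and only if the device sits on a bus. */
	if ((busname != NULL) != (h->bus != NULL))
		return false;
	if (busname && strcmp(h->bus, busname) != 0)
		return false;
	return strcmp(h->dev, devname) == 0;
}
```

With this rule a simulator such as vdpasim_net matches only when no bus name
is supplied, while a PCI or auxiliary device matches only when both parts
agree.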

Hence, introduce a routine through which a driver can register a
management device and its callback operations for adding and removing
a vdpa device.

Introduce vdpa netlink socket family so that user can query management
device and its attributes.

Example of showing a vdpa management device which allows creating a vdpa
device of networking class (device id = 0x1) of virtio specification 1.1
section 5.1.1.

$ vdpa mgmtdev show
vdpasim_net:
  supported_classes:
    net

Example of showing vdpa management device in JSON format.

$ vdpa mgmtdev show -jp
{
    "show": {
        "vdpasim_net": {
            "supported_classes": [ "net" ]
        }
    }
}
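
On the wire, this patch reports the supported classes as a u64 bitmask in
VDPA_ATTR_MGMTDEV_SUPPORTED_CLASSES, with one bit set per virtio device id.
The following is a minimal sketch of how a userspace consumer might turn that
mask into the class names shown above. The helper name and the name table are
assumptions for illustration and are not taken from the vdpa tool.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Virtio device ids as defined by the virtio specification. */
#define VIRTIO_ID_NET	1
#define VIRTIO_ID_BLOCK	2

/*
 * Hypothetical helper: write a space-separated list of class names for
 * every known bit set in mask. Returns the number of classes found.
 */
static int model_classes_to_str(uint64_t mask, char *buf, size_t len)
{
	static const struct {
		unsigned int id;
		const char *name;
	} names[] = {
		{ VIRTIO_ID_NET, "net" },
		{ VIRTIO_ID_BLOCK, "block" },
	};
	int count = 0;
	size_t i;

	assert(buf != NULL && len > 0);
	buf[0] = '\0';
	for (i = 0; i < sizeof(names) / sizeof(names[0]); i++) {
		size_t used;

		if (!(mask & (UINT64_C(1) << names[i].id)))
			continue;
		used = strlen(buf);
		snprintf(buf + used, len - used, "%s%s",
			 count ? " " : "", names[i].name);
		count++;
	}
	return count;
}
```

For vdpasim_net only the bit for device id 1 is set, which this sketch decodes
to the single class "net".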

Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Eli Cohen <elic@nvidia.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
---
Changelog:
v1->v2:
 - rebased
 - updated commit log example for management device name from
   "vdpasim" to "vdpasim_net"
 - removed device_id as net and block management devices are separated
 - dev_add() return type is changed from struct vdpa_device to int
---
 drivers/vdpa/Kconfig      |   1 +
 drivers/vdpa/vdpa.c       | 213 +++++++++++++++++++++++++++++++++++++-
 include/linux/vdpa.h      |  31 ++++++
 include/uapi/linux/vdpa.h |  31 ++++++
 4 files changed, 275 insertions(+), 1 deletion(-)
 create mode 100644 include/uapi/linux/vdpa.h

diff --git a/drivers/vdpa/Kconfig b/drivers/vdpa/Kconfig
index 92a6396f8a73..ffd1e098bfd2 100644
--- a/drivers/vdpa/Kconfig
+++ b/drivers/vdpa/Kconfig
@@ -1,6 +1,7 @@
 # SPDX-License-Identifier: GPL-2.0-only
 menuconfig VDPA
 	tristate "vDPA drivers"
+	depends on NET
 	help
 	  Enable this module to support vDPA device that uses a
 	  datapath which complies with virtio specifications with
diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c
index 7414bbd9057c..319d09709dfc 100644
--- a/drivers/vdpa/vdpa.c
+++ b/drivers/vdpa/vdpa.c
@@ -11,11 +11,17 @@
 #include <linux/idr.h>
 #include <linux/slab.h>
 #include <linux/vdpa.h>
+#include <uapi/linux/vdpa.h>
+#include <net/genetlink.h>
+#include <linux/mod_devicetable.h>
 
+static LIST_HEAD(mdev_head);
 /* A global mutex that protects vdpa management device and device level operations. */
 static DEFINE_MUTEX(vdpa_dev_mutex);
 static DEFINE_IDA(vdpa_index_ida);
 
+static struct genl_family vdpa_nl_family;
+
 static int vdpa_dev_probe(struct device *d)
 {
 	struct vdpa_device *vdev = dev_to_vdpa(d);
@@ -195,13 +201,218 @@ void vdpa_unregister_driver(struct vdpa_driver *drv)
 }
 EXPORT_SYMBOL_GPL(vdpa_unregister_driver);
 
+/**
+ * vdpa_mgmtdev_register - register a vdpa management device
+ *
+ * @mdev: Pointer to vdpa management device
+ * vdpa_mgmtdev_register() register a vdpa management device which supports
+ * vdpa device management.
+ */
+int vdpa_mgmtdev_register(struct vdpa_mgmt_dev *mdev)
+{
+	if (!mdev->device || !mdev->ops || !mdev->ops->dev_add || !mdev->ops->dev_del)
+		return -EINVAL;
+
+	INIT_LIST_HEAD(&mdev->list);
+	mutex_lock(&vdpa_dev_mutex);
+	list_add_tail(&mdev->list, &mdev_head);
+	mutex_unlock(&vdpa_dev_mutex);
+	return 0;
+}
+EXPORT_SYMBOL_GPL(vdpa_mgmtdev_register);
+
+void vdpa_mgmtdev_unregister(struct vdpa_mgmt_dev *mdev)
+{
+	mutex_lock(&vdpa_dev_mutex);
+	list_del(&mdev->list);
+	mutex_unlock(&vdpa_dev_mutex);
+}
+EXPORT_SYMBOL_GPL(vdpa_mgmtdev_unregister);
+
+static bool mgmtdev_handle_match(const struct vdpa_mgmt_dev *mdev,
+				 const char *busname, const char *devname)
+{
+	/* Bus name is optional for simulated management device, so ignore the
+	 * device with bus if bus attribute is provided.
+	 */
+	if ((busname && !mdev->device->bus) || (!busname && mdev->device->bus))
+		return false;
+
+	if (!busname && strcmp(dev_name(mdev->device), devname) == 0)
+		return true;
+
+	if (busname && (strcmp(mdev->device->bus->name, busname) == 0) &&
+	    (strcmp(dev_name(mdev->device), devname) == 0))
+		return true;
+
+	return false;
+}
+
+static struct vdpa_mgmt_dev *vdpa_mgmtdev_get_from_attr(struct nlattr **attrs)
+{
+	struct vdpa_mgmt_dev *mdev;
+	const char *busname = NULL;
+	const char *devname;
+
+	if (!attrs[VDPA_ATTR_MGMTDEV_DEV_NAME])
+		return ERR_PTR(-EINVAL);
+	devname = nla_data(attrs[VDPA_ATTR_MGMTDEV_DEV_NAME]);
+	if (attrs[VDPA_ATTR_MGMTDEV_BUS_NAME])
+		busname = nla_data(attrs[VDPA_ATTR_MGMTDEV_BUS_NAME]);
+
+	list_for_each_entry(mdev, &mdev_head, list) {
+		if (mgmtdev_handle_match(mdev, busname, devname))
+			return mdev;
+	}
+	return ERR_PTR(-ENODEV);
+}
+
+static int vdpa_nl_mgmtdev_handle_fill(struct sk_buff *msg, const struct vdpa_mgmt_dev *mdev)
+{
+	if (mdev->device->bus &&
+	    nla_put_string(msg, VDPA_ATTR_MGMTDEV_BUS_NAME, mdev->device->bus->name))
+		return -EMSGSIZE;
+	if (nla_put_string(msg, VDPA_ATTR_MGMTDEV_DEV_NAME, dev_name(mdev->device)))
+		return -EMSGSIZE;
+	return 0;
+}
+
+static int vdpa_mgmtdev_fill(const struct vdpa_mgmt_dev *mdev, struct sk_buff *msg,
+			     u32 portid, u32 seq, int flags)
+{
+	u64 supported_classes = 0;
+	void *hdr;
+	int i = 0;
+	int err;
+
+	hdr = genlmsg_put(msg, portid, seq, &vdpa_nl_family, flags, VDPA_CMD_MGMTDEV_NEW);
+	if (!hdr)
+		return -EMSGSIZE;
+	err = vdpa_nl_mgmtdev_handle_fill(msg, mdev);
+	if (err)
+		goto msg_err;
+
+	while (mdev->id_table[i].device) {
+		supported_classes |= BIT(mdev->id_table[i].device);
+		i++;
+	}
+
+	if (nla_put_u64_64bit(msg, VDPA_ATTR_MGMTDEV_SUPPORTED_CLASSES,
+			      supported_classes, VDPA_ATTR_UNSPEC)) {
+		err = -EMSGSIZE;
+		goto msg_err;
+	}
+
+	genlmsg_end(msg, hdr);
+	return 0;
+
+msg_err:
+	genlmsg_cancel(msg, hdr);
+	return err;
+}
+
+static int vdpa_nl_cmd_mgmtdev_get_doit(struct sk_buff *skb, struct genl_info *info)
+{
+	struct vdpa_mgmt_dev *mdev;
+	struct sk_buff *msg;
+	int err;
+
+	msg = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL);
+	if (!msg)
+		return -ENOMEM;
+
+	mutex_lock(&vdpa_dev_mutex);
+	mdev = vdpa_mgmtdev_get_from_attr(info->attrs);
+	if (IS_ERR(mdev)) {
+		mutex_unlock(&vdpa_dev_mutex);
+		NL_SET_ERR_MSG_MOD(info->extack, "Fail to find the specified mgmt device");
+		err = PTR_ERR(mdev);
+		goto out;
+	}
+
+	err = vdpa_mgmtdev_fill(mdev, msg, info->snd_portid, info->snd_seq, 0);
+	mutex_unlock(&vdpa_dev_mutex);
+	if (err)
+		goto out;
+	err = genlmsg_reply(msg, info);
+	return err;
+
+out:
+	nlmsg_free(msg);
+	return err;
+}
+
+static int
+vdpa_nl_cmd_mgmtdev_get_dumpit(struct sk_buff *msg, struct netlink_callback *cb)
+{
+	struct vdpa_mgmt_dev *mdev;
+	int start = cb->args[0];
+	int idx = 0;
+	int err;
+
+	mutex_lock(&vdpa_dev_mutex);
+	list_for_each_entry(mdev, &mdev_head, list) {
+		if (idx < start) {
+			idx++;
+			continue;
+		}
+		err = vdpa_mgmtdev_fill(mdev, msg, NETLINK_CB(cb->skb).portid,
+					cb->nlh->nlmsg_seq, NLM_F_MULTI);
+		if (err)
+			goto out;
+		idx++;
+	}
+out:
+	mutex_unlock(&vdpa_dev_mutex);
+	cb->args[0] = idx;
+	return msg->len;
+}
+
+static const struct nla_policy vdpa_nl_policy[VDPA_ATTR_MAX] = {
+	[VDPA_ATTR_MGMTDEV_BUS_NAME] = { .type = NLA_NUL_STRING },
+	[VDPA_ATTR_MGMTDEV_DEV_NAME] = { .type = NLA_STRING },
+};
+
+static const struct genl_ops vdpa_nl_ops[] = {
+	{
+		.cmd = VDPA_CMD_MGMTDEV_GET,
+		.validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP,
+		.doit = vdpa_nl_cmd_mgmtdev_get_doit,
+		.dumpit = vdpa_nl_cmd_mgmtdev_get_dumpit,
+	},
+};
+
+static struct genl_family vdpa_nl_family __ro_after_init = {
+	.name = VDPA_GENL_NAME,
+	.version = VDPA_GENL_VERSION,
+	.maxattr = VDPA_ATTR_MAX,
+	.policy = vdpa_nl_policy,
+	.netnsok = false,
+	.module = THIS_MODULE,
+	.ops = vdpa_nl_ops,
+	.n_ops = ARRAY_SIZE(vdpa_nl_ops),
+};
+
 static int vdpa_init(void)
 {
-	return bus_register(&vdpa_bus);
+	int err;
+
+	err = bus_register(&vdpa_bus);
+	if (err)
+		return err;
+	err = genl_register_family(&vdpa_nl_family);
+	if (err)
+		goto err;
+	return 0;
+
+err:
+	bus_unregister(&vdpa_bus);
+	return err;
 }
 
 static void __exit vdpa_exit(void)
 {
+	genl_unregister_family(&vdpa_nl_family);
 	bus_unregister(&vdpa_bus);
 	ida_destroy(&vdpa_index_ida);
 }
diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h
index 5700baa22356..6b8b4222bca6 100644
--- a/include/linux/vdpa.h
+++ b/include/linux/vdpa.h
@@ -35,6 +35,8 @@ struct vdpa_vq_state {
 	u16	avail_index;
 };
 
+struct vdpa_mgmt_dev;
+
 /**
  * vDPA device - representation of a vDPA device
  * @dev: underlying device
@@ -335,4 +337,33 @@ static inline void vdpa_get_config(struct vdpa_device *vdev, unsigned offset,
 	ops->get_config(vdev, offset, buf, len);
 }
 
+/**
+ * vdpa_mgmtdev_ops - vdpa device ops
+ * @dev_add:	Add a vdpa device using alloc and register
+ *		@mdev: parent device to use for device addition
+ *		@name: name of the new vdpa device
+ *		Driver need to add a new device using _vdpa_register_device()
+ *		after fully initializing the vdpa device. Driver must return 0
+ *		on success or appropriate error code.
+ * @dev_del:	Remove a vdpa device using unregister
+ *		@mdev: parent device to use for device removal
+ *		@dev: vdpa device to remove
+ *		Driver need to remove the specified device by calling
+ *		_vdpa_unregister_device().
+ */
+struct vdpa_mgmtdev_ops {
+	int (*dev_add)(struct vdpa_mgmt_dev *mdev, const char *name);
+	void (*dev_del)(struct vdpa_mgmt_dev *mdev, struct vdpa_device *dev);
+};
+
+struct vdpa_mgmt_dev {
+	struct device *device;
+	const struct vdpa_mgmtdev_ops *ops;
+	const struct virtio_device_id *id_table; /* supported ids */
+	struct list_head list;
+};
+
+int vdpa_mgmtdev_register(struct vdpa_mgmt_dev *mdev);
+void vdpa_mgmtdev_unregister(struct vdpa_mgmt_dev *mdev);
+
 #endif /* _LINUX_VDPA_H */
diff --git a/include/uapi/linux/vdpa.h b/include/uapi/linux/vdpa.h
new file mode 100644
index 000000000000..d44d82e567b1
--- /dev/null
+++ b/include/uapi/linux/vdpa.h
@@ -0,0 +1,31 @@
+/* SPDX-License-Identifier: GPL-2.0+ WITH Linux-syscall-note */
+/*
+ * vdpa device management interface
+ * Copyright (c) 2020 Mellanox Technologies Ltd. All rights reserved.
+ */
+
+#ifndef _UAPI_LINUX_VDPA_H_
+#define _UAPI_LINUX_VDPA_H_
+
+#define VDPA_GENL_NAME "vdpa"
+#define VDPA_GENL_VERSION 0x1
+
+enum vdpa_command {
+	VDPA_CMD_UNSPEC,
+	VDPA_CMD_MGMTDEV_NEW,
+	VDPA_CMD_MGMTDEV_GET,		/* can dump */
+};
+
+enum vdpa_attr {
+	VDPA_ATTR_UNSPEC,
+
+	/* bus name (optional) + dev name together make the parent device handle */
+	VDPA_ATTR_MGMTDEV_BUS_NAME,		/* string */
+	VDPA_ATTR_MGMTDEV_DEV_NAME,		/* string */
+	VDPA_ATTR_MGMTDEV_SUPPORTED_CLASSES,	/* u64 */
+
+	/* new attributes must be added above here */
+	VDPA_ATTR_MAX,
+};
+
+#endif
-- 
2.26.2


^ permalink raw reply	[flat|nested] 79+ messages in thread

* [PATCH linux-next v2 5/7] vdpa: Enable a user to add and delete a vdpa device
  2021-01-04  3:31 ` [PATCH linux-next v2 " Parav Pandit
                     ` (3 preceding siblings ...)
  2021-01-04  3:31   ` [PATCH linux-next v2 4/7] vdpa: Define vdpa mgmt device, ops and a netlink interface Parav Pandit
@ 2021-01-04  3:31   ` Parav Pandit
  2021-01-04  3:31   ` [PATCH linux-next v2 6/7] vdpa: Enable user to query vdpa device info Parav Pandit
  2021-01-04  3:31   ` [PATCH linux-next v2 7/7] vdpa_sim_net: Add support for user supported devices Parav Pandit
  6 siblings, 0 replies; 79+ messages in thread
From: Parav Pandit @ 2021-01-04  3:31 UTC (permalink / raw)
  To: virtualization; +Cc: mst, jasowang, parav, elic, netdev

Add the ability to add and delete a vdpa device.

Examples:
Create a vdpa device of type network named "foo2" from
the management device vdpasim_net:

$ vdpa dev add mgmtdev vdpasim_net name foo2

Delete the vdpa device after its use:
$ vdpa dev del foo2

Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Eli Cohen <elic@nvidia.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>

---
Changelog:
v1->v2:
 - using int return type for dev_add callback
 - removed device_id (type) as current drivers support only a single type
---
 drivers/vdpa/vdpa.c       | 143 +++++++++++++++++++++++++++++++++++---
 include/linux/vdpa.h      |   6 ++
 include/uapi/linux/vdpa.h |   4 ++
 3 files changed, 143 insertions(+), 10 deletions(-)

diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c
index 319d09709dfc..dca67e4d32e5 100644
--- a/drivers/vdpa/vdpa.c
+++ b/drivers/vdpa/vdpa.c
@@ -136,6 +136,37 @@ static int vdpa_name_match(struct device *dev, const void *data)
 	return (strcmp(dev_name(&vdev->dev), data) == 0);
 }
 
+static int __vdpa_register_device(struct vdpa_device *vdev)
+{
+	struct device *dev;
+
+	lockdep_assert_held(&vdpa_dev_mutex);
+	dev = bus_find_device(&vdpa_bus, NULL, dev_name(&vdev->dev), vdpa_name_match);
+	if (dev) {
+		put_device(dev);
+		return -EEXIST;
+	}
+	return device_add(&vdev->dev);
+}
+
+/**
+ * _vdpa_register_device - register a vDPA device with vdpa lock held
+ * Caller must have made a successful call to vdpa_alloc_device() before.
+ * Caller must invoke this routine in the management device dev_add()
+ * callback after setting up a valid mgmtdev for this vdpa device.
+ * @vdev: the vdpa device to be registered to vDPA bus
+ *
+ * Returns an error when it fails to add the device to the vDPA bus
+ */
+int _vdpa_register_device(struct vdpa_device *vdev)
+{
+	if (!vdev->mdev)
+		return -EINVAL;
+
+	return __vdpa_register_device(vdev);
+}
+EXPORT_SYMBOL_GPL(_vdpa_register_device);
+
 /**
  * vdpa_register_device - register a vDPA device
  * Callers must have a succeed call of vdpa_alloc_device() before.
@@ -145,24 +176,29 @@ static int vdpa_name_match(struct device *dev, const void *data)
  */
 int vdpa_register_device(struct vdpa_device *vdev)
 {
-	struct device *dev;
 	int err;
 
 	mutex_lock(&vdpa_dev_mutex);
-	dev = bus_find_device(&vdpa_bus, NULL, dev_name(&vdev->dev), vdpa_name_match);
-	if (dev) {
-		put_device(dev);
-		err = -EEXIST;
-		goto name_err;
-	}
-
-	err = device_add(&vdev->dev);
-name_err:
+	err = __vdpa_register_device(vdev);
 	mutex_unlock(&vdpa_dev_mutex);
 	return err;
 }
 EXPORT_SYMBOL_GPL(vdpa_register_device);
 
+/**
+ * _vdpa_unregister_device - unregister a vDPA device
+ * Caller must invoke this routine as part of management device dev_del()
+ * callback.
+ * @vdev: the vdpa device to be unregistered from vDPA bus
+ */
+void _vdpa_unregister_device(struct vdpa_device *vdev)
+{
+	lockdep_assert_held(&vdpa_dev_mutex);
+	WARN_ON(!vdev->mdev);
+	device_unregister(&vdev->dev);
+}
+EXPORT_SYMBOL_GPL(_vdpa_unregister_device);
+
 /**
  * vdpa_unregister_device - unregister a vDPA device
  * @vdev: the vdpa device to be unregisted from vDPA bus
@@ -221,10 +257,25 @@ int vdpa_mgmtdev_register(struct vdpa_mgmt_dev *mdev)
 }
 EXPORT_SYMBOL_GPL(vdpa_mgmtdev_register);
 
+static int vdpa_match_remove(struct device *dev, void *data)
+{
+	struct vdpa_device *vdev = container_of(dev, struct vdpa_device, dev);
+	struct vdpa_mgmt_dev *mdev = vdev->mdev;
+
+	if (mdev == data)
+		mdev->ops->dev_del(mdev, vdev);
+	return 0;
+}
+
 void vdpa_mgmtdev_unregister(struct vdpa_mgmt_dev *mdev)
 {
 	mutex_lock(&vdpa_dev_mutex);
+
 	list_del(&mdev->list);
+
+	/* Delete every vdpa device that belongs to this management device. */
+	bus_for_each_dev(&vdpa_bus, NULL, mdev, vdpa_match_remove);
+
 	mutex_unlock(&vdpa_dev_mutex);
 }
 EXPORT_SYMBOL_GPL(vdpa_mgmtdev_unregister);
@@ -368,9 +419,69 @@ vdpa_nl_cmd_mgmtdev_get_dumpit(struct sk_buff *msg, struct netlink_callback *cb)
 	return msg->len;
 }
 
+static int vdpa_nl_cmd_dev_add_set_doit(struct sk_buff *skb, struct genl_info *info)
+{
+	struct vdpa_mgmt_dev *mdev;
+	const char *name;
+	int err = 0;
+
+	if (!info->attrs[VDPA_ATTR_DEV_NAME])
+		return -EINVAL;
+
+	name = nla_data(info->attrs[VDPA_ATTR_DEV_NAME]);
+
+	mutex_lock(&vdpa_dev_mutex);
+	mdev = vdpa_mgmtdev_get_from_attr(info->attrs);
+	if (IS_ERR(mdev)) {
+		NL_SET_ERR_MSG_MOD(info->extack, "Fail to find the specified management device");
+		err = PTR_ERR(mdev);
+		goto err;
+	}
+
+	err = mdev->ops->dev_add(mdev, name);
+err:
+	mutex_unlock(&vdpa_dev_mutex);
+	return err;
+}
+
+static int vdpa_nl_cmd_dev_del_set_doit(struct sk_buff *skb, struct genl_info *info)
+{
+	struct vdpa_mgmt_dev *mdev;
+	struct vdpa_device *vdev;
+	struct device *dev;
+	const char *name;
+	int err = 0;
+
+	if (!info->attrs[VDPA_ATTR_DEV_NAME])
+		return -EINVAL;
+	name = nla_data(info->attrs[VDPA_ATTR_DEV_NAME]);
+
+	mutex_lock(&vdpa_dev_mutex);
+	dev = bus_find_device(&vdpa_bus, NULL, name, vdpa_name_match);
+	if (!dev) {
+		NL_SET_ERR_MSG_MOD(info->extack, "device not found");
+		err = -ENODEV;
+		goto dev_err;
+	}
+	vdev = container_of(dev, struct vdpa_device, dev);
+	if (!vdev->mdev) {
+		NL_SET_ERR_MSG_MOD(info->extack, "Only user created device can be deleted by user");
+		err = -EINVAL;
+		goto mdev_err;
+	}
+	mdev = vdev->mdev;
+	mdev->ops->dev_del(mdev, vdev);
+mdev_err:
+	put_device(dev);
+dev_err:
+	mutex_unlock(&vdpa_dev_mutex);
+	return err;
+}
+
 static const struct nla_policy vdpa_nl_policy[VDPA_ATTR_MAX] = {
 	[VDPA_ATTR_MGMTDEV_BUS_NAME] = { .type = NLA_NUL_STRING },
 	[VDPA_ATTR_MGMTDEV_DEV_NAME] = { .type = NLA_STRING },
+	[VDPA_ATTR_DEV_NAME] = { .type = NLA_STRING },
 };
 
 static const struct genl_ops vdpa_nl_ops[] = {
@@ -380,6 +491,18 @@ static const struct genl_ops vdpa_nl_ops[] = {
 		.doit = vdpa_nl_cmd_mgmtdev_get_doit,
 		.dumpit = vdpa_nl_cmd_mgmtdev_get_dumpit,
 	},
+	{
+		.cmd = VDPA_CMD_DEV_NEW,
+		.validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP,
+		.doit = vdpa_nl_cmd_dev_add_set_doit,
+		.flags = GENL_ADMIN_PERM,
+	},
+	{
+		.cmd = VDPA_CMD_DEV_DEL,
+		.validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP,
+		.doit = vdpa_nl_cmd_dev_del_set_doit,
+		.flags = GENL_ADMIN_PERM,
+	},
 };
 
 static struct genl_family vdpa_nl_family __ro_after_init = {
diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h
index 6b8b4222bca6..4ab5494503a8 100644
--- a/include/linux/vdpa.h
+++ b/include/linux/vdpa.h
@@ -45,6 +45,8 @@ struct vdpa_mgmt_dev;
  * @index: device index
  * @features_valid: were features initialized? for legacy guests
  * @nvqs: maximum number of supported virtqueues
+ * @mdev: management device pointer; caller must setup when registering device as part
+ *	  of dev_add() mgmtdev ops callback before invoking _vdpa_register_device().
  */
 struct vdpa_device {
 	struct device dev;
@@ -53,6 +55,7 @@ struct vdpa_device {
 	unsigned int index;
 	bool features_valid;
 	int nvqs;
+	struct vdpa_mgmt_dev *mdev;
 };
 
 /**
@@ -260,6 +263,9 @@ struct vdpa_device *__vdpa_alloc_device(struct device *parent,
 int vdpa_register_device(struct vdpa_device *vdev);
 void vdpa_unregister_device(struct vdpa_device *vdev);
 
+int _vdpa_register_device(struct vdpa_device *vdev);
+void _vdpa_unregister_device(struct vdpa_device *vdev);
+
 /**
  * vdpa_driver - operations for a vDPA driver
  * @driver: underlying device driver
diff --git a/include/uapi/linux/vdpa.h b/include/uapi/linux/vdpa.h
index d44d82e567b1..bb4a1f00eb1c 100644
--- a/include/uapi/linux/vdpa.h
+++ b/include/uapi/linux/vdpa.h
@@ -14,6 +14,8 @@ enum vdpa_command {
 	VDPA_CMD_UNSPEC,
 	VDPA_CMD_MGMTDEV_NEW,
 	VDPA_CMD_MGMTDEV_GET,		/* can dump */
+	VDPA_CMD_DEV_NEW,
+	VDPA_CMD_DEV_DEL,
 };
 
 enum vdpa_attr {
@@ -24,6 +26,8 @@ enum vdpa_attr {
 	VDPA_ATTR_MGMTDEV_DEV_NAME,		/* string */
 	VDPA_ATTR_MGMTDEV_SUPPORTED_CLASSES,	/* u64 */
 
+	VDPA_ATTR_DEV_NAME,			/* string */
+
 	/* new attributes must be added above here */
 	VDPA_ATTR_MAX,
 };
-- 
2.26.2



* [PATCH linux-next v2 6/7] vdpa: Enable user to query vdpa device info
  2021-01-04  3:31 ` [PATCH linux-next v2 " Parav Pandit
                     ` (4 preceding siblings ...)
  2021-01-04  3:31   ` [PATCH linux-next v2 5/7] vdpa: Enable a user to add and delete a vdpa device Parav Pandit
@ 2021-01-04  3:31   ` Parav Pandit
  2021-01-04  3:31   ` [PATCH linux-next v2 7/7] vdpa_sim_net: Add support for user supported devices Parav Pandit
  6 siblings, 0 replies; 79+ messages in thread
From: Parav Pandit @ 2021-01-04  3:31 UTC (permalink / raw)
  To: virtualization; +Cc: mst, jasowang, parav, elic, netdev

Enable user to query vdpa device information.

$ vdpa dev add mgmtdev vdpasim_net name foo2

Show the newly created vdpa device by its name:
$ vdpa dev show foo2
foo2: type network mgmtdev vdpasim_net vendor_id 0 max_vqs 2 max_vq_size 256

$ vdpa dev show foo2 -jp
{
    "dev": {
        "foo2": {
            "type": "network",
            "mgmtdev": "vdpasim_net",
            "vendor_id": 0,
            "max_vqs": 2,
            "max_vq_size": 256
        }
    }
}

Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Eli Cohen <elic@nvidia.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
---
 drivers/vdpa/vdpa.c       | 131 ++++++++++++++++++++++++++++++++++++++
 include/uapi/linux/vdpa.h |   5 ++
 2 files changed, 136 insertions(+)

diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c
index dca67e4d32e5..9700a0adcca0 100644
--- a/drivers/vdpa/vdpa.c
+++ b/drivers/vdpa/vdpa.c
@@ -478,6 +478,131 @@ static int vdpa_nl_cmd_dev_del_set_doit(struct sk_buff *skb, struct genl_info *i
 	return err;
 }
 
+static int
+vdpa_dev_fill(struct vdpa_device *vdev, struct sk_buff *msg, u32 portid, u32 seq,
+	      int flags, struct netlink_ext_ack *extack)
+{
+	u16 max_vq_size;
+	u32 device_id;
+	u32 vendor_id;
+	void *hdr;
+	int err;
+
+	hdr = genlmsg_put(msg, portid, seq, &vdpa_nl_family, flags, VDPA_CMD_DEV_NEW);
+	if (!hdr)
+		return -EMSGSIZE;
+
+	err = vdpa_nl_mgmtdev_handle_fill(msg, vdev->mdev);
+	if (err)
+		goto msg_err;
+
+	device_id = vdev->config->get_device_id(vdev);
+	vendor_id = vdev->config->get_vendor_id(vdev);
+	max_vq_size = vdev->config->get_vq_num_max(vdev);
+
+	err = -EMSGSIZE;
+	if (nla_put_string(msg, VDPA_ATTR_DEV_NAME, dev_name(&vdev->dev)))
+		goto msg_err;
+	if (nla_put_u32(msg, VDPA_ATTR_DEV_ID, device_id))
+		goto msg_err;
+	if (nla_put_u32(msg, VDPA_ATTR_DEV_VENDOR_ID, vendor_id))
+		goto msg_err;
+	if (nla_put_u32(msg, VDPA_ATTR_DEV_MAX_VQS, vdev->nvqs))
+		goto msg_err;
+	if (nla_put_u16(msg, VDPA_ATTR_DEV_MAX_VQ_SIZE, max_vq_size))
+		goto msg_err;
+
+	genlmsg_end(msg, hdr);
+	return 0;
+
+msg_err:
+	genlmsg_cancel(msg, hdr);
+	return err;
+}
+
+static int vdpa_nl_cmd_dev_get_doit(struct sk_buff *skb, struct genl_info *info)
+{
+	struct vdpa_device *vdev;
+	struct sk_buff *msg;
+	const char *devname;
+	struct device *dev;
+	int err;
+
+	if (!info->attrs[VDPA_ATTR_DEV_NAME])
+		return -EINVAL;
+	devname = nla_data(info->attrs[VDPA_ATTR_DEV_NAME]);
+	msg = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL);
+	if (!msg)
+		return -ENOMEM;
+
+	mutex_lock(&vdpa_dev_mutex);
+	dev = bus_find_device(&vdpa_bus, NULL, devname, vdpa_name_match);
+	if (!dev) {
+		mutex_unlock(&vdpa_dev_mutex);
+		NL_SET_ERR_MSG_MOD(info->extack, "device not found");
+		return -ENODEV;
+	}
+	vdev = container_of(dev, struct vdpa_device, dev);
+	if (!vdev->mdev) {
+		mutex_unlock(&vdpa_dev_mutex);
+		put_device(dev);
+		return -EINVAL;
+	}
+	err = vdpa_dev_fill(vdev, msg, info->snd_portid, info->snd_seq, 0, info->extack);
+	if (!err)
+		err = genlmsg_reply(msg, info);
+	put_device(dev);
+	mutex_unlock(&vdpa_dev_mutex);
+
+	if (err)
+		nlmsg_free(msg);
+	return err;
+}
+
+struct vdpa_dev_dump_info {
+	struct sk_buff *msg;
+	struct netlink_callback *cb;
+	int start_idx;
+	int idx;
+};
+
+static int vdpa_dev_dump(struct device *dev, void *data)
+{
+	struct vdpa_device *vdev = container_of(dev, struct vdpa_device, dev);
+	struct vdpa_dev_dump_info *info = data;
+	int err;
+
+	if (!vdev->mdev)
+		return 0;
+	if (info->idx < info->start_idx) {
+		info->idx++;
+		return 0;
+	}
+	err = vdpa_dev_fill(vdev, info->msg, NETLINK_CB(info->cb->skb).portid,
+			    info->cb->nlh->nlmsg_seq, NLM_F_MULTI, info->cb->extack);
+	if (err)
+		return err;
+
+	info->idx++;
+	return 0;
+}
+
+static int vdpa_nl_cmd_dev_get_dumpit(struct sk_buff *msg, struct netlink_callback *cb)
+{
+	struct vdpa_dev_dump_info info;
+
+	info.msg = msg;
+	info.cb = cb;
+	info.start_idx = cb->args[0];
+	info.idx = 0;
+
+	mutex_lock(&vdpa_dev_mutex);
+	bus_for_each_dev(&vdpa_bus, NULL, &info, vdpa_dev_dump);
+	mutex_unlock(&vdpa_dev_mutex);
+	cb->args[0] = info.idx;
+	return msg->len;
+}
+
 static const struct nla_policy vdpa_nl_policy[VDPA_ATTR_MAX] = {
 	[VDPA_ATTR_MGMTDEV_BUS_NAME] = { .type = NLA_NUL_STRING },
 	[VDPA_ATTR_MGMTDEV_DEV_NAME] = { .type = NLA_STRING },
@@ -503,6 +628,12 @@ static const struct genl_ops vdpa_nl_ops[] = {
 		.doit = vdpa_nl_cmd_dev_del_set_doit,
 		.flags = GENL_ADMIN_PERM,
 	},
+	{
+		.cmd = VDPA_CMD_DEV_GET,
+		.validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP,
+		.doit = vdpa_nl_cmd_dev_get_doit,
+		.dumpit = vdpa_nl_cmd_dev_get_dumpit,
+	},
 };
 
 static struct genl_family vdpa_nl_family __ro_after_init = {
diff --git a/include/uapi/linux/vdpa.h b/include/uapi/linux/vdpa.h
index bb4a1f00eb1c..66a41e4ec163 100644
--- a/include/uapi/linux/vdpa.h
+++ b/include/uapi/linux/vdpa.h
@@ -16,6 +16,7 @@ enum vdpa_command {
 	VDPA_CMD_MGMTDEV_GET,		/* can dump */
 	VDPA_CMD_DEV_NEW,
 	VDPA_CMD_DEV_DEL,
+	VDPA_CMD_DEV_GET,		/* can dump */
 };
 
 enum vdpa_attr {
@@ -27,6 +28,10 @@ enum vdpa_attr {
 	VDPA_ATTR_MGMTDEV_SUPPORTED_CLASSES,	/* u64 */
 
 	VDPA_ATTR_DEV_NAME,			/* string */
+	VDPA_ATTR_DEV_ID,			/* u32 */
+	VDPA_ATTR_DEV_VENDOR_ID,		/* u32 */
+	VDPA_ATTR_DEV_MAX_VQS,			/* u32 */
+	VDPA_ATTR_DEV_MAX_VQ_SIZE,		/* u16 */
 
 	/* new attributes must be added above here */
 	VDPA_ATTR_MAX,
-- 
2.26.2



* [PATCH linux-next v2 7/7] vdpa_sim_net: Add support for user supported devices
  2021-01-04  3:31 ` [PATCH linux-next v2 " Parav Pandit
                     ` (5 preceding siblings ...)
  2021-01-04  3:31   ` [PATCH linux-next v2 6/7] vdpa: Enable user to query vdpa device info Parav Pandit
@ 2021-01-04  3:31   ` Parav Pandit
  2021-01-04  7:05     ` Jason Wang
  6 siblings, 1 reply; 79+ messages in thread
From: Parav Pandit @ 2021-01-04  3:31 UTC (permalink / raw)
  To: virtualization; +Cc: mst, jasowang, parav, elic, netdev

Enable a user to create simulated vdpasim net devices.

Show the vdpa management device that supports creating and deleting vdpa devices.

$ vdpa mgmtdev show
vdpasim_net:
  supported_classes
    net

$ vdpa mgmtdev show -jp
{
    "show": {
        "vdpasim_net": {
            "supported_classes": [ "net" ]
        }
    }
}

Create a vdpa device of type network named "foo2" from
the management device vdpasim_net:

$ vdpa dev add mgmtdev vdpasim_net name foo2

Show the newly created vdpa device by its name:
$ vdpa dev show foo2
foo2: type network mgmtdev vdpasim_net vendor_id 0 max_vqs 2 max_vq_size 256

$ vdpa dev show foo2 -jp
{
    "dev": {
        "foo2": {
            "type": "network",
            "mgmtdev": "vdpasim_net",
            "vendor_id": 0,
            "max_vqs": 2,
            "max_vq_size": 256
        }
    }
}

Delete the vdpa device after its use:
$ vdpa dev del foo2

Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Eli Cohen <elic@nvidia.com>
Acked-by: Jason Wang <jasowang@redhat.com>
---
Changelog:
v1->v2:
 - rebased
---
 drivers/vdpa/vdpa_sim/vdpa_sim.c     |  3 +-
 drivers/vdpa/vdpa_sim/vdpa_sim.h     |  2 +
 drivers/vdpa/vdpa_sim/vdpa_sim_net.c | 92 ++++++++++++++++++++++++++++
 3 files changed, 96 insertions(+), 1 deletion(-)

diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.c b/drivers/vdpa/vdpa_sim/vdpa_sim.c
index db1636a99ba4..d5942842432d 100644
--- a/drivers/vdpa/vdpa_sim/vdpa_sim.c
+++ b/drivers/vdpa/vdpa_sim/vdpa_sim.c
@@ -235,7 +235,7 @@ struct vdpasim *vdpasim_create(struct vdpasim_dev_attr *dev_attr)
 		ops = &vdpasim_config_ops;
 
 	vdpasim = vdpa_alloc_device(struct vdpasim, vdpa, NULL, ops,
-				    dev_attr->nvqs, NULL);
+				    dev_attr->nvqs, dev_attr->name);
 	if (!vdpasim)
 		goto err_alloc;
 
@@ -249,6 +249,7 @@ struct vdpasim *vdpasim_create(struct vdpasim_dev_attr *dev_attr)
 	if (dma_set_mask_and_coherent(dev, DMA_BIT_MASK(64)))
 		goto err_iommu;
 	set_dma_ops(dev, &vdpasim_dma_ops);
+	vdpasim->vdpa.mdev = dev_attr->mgmt_dev;
 
 	vdpasim->config = kzalloc(dev_attr->config_size, GFP_KERNEL);
 	if (!vdpasim->config)
diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.h b/drivers/vdpa/vdpa_sim/vdpa_sim.h
index b02142293d5b..6d75444f9948 100644
--- a/drivers/vdpa/vdpa_sim/vdpa_sim.h
+++ b/drivers/vdpa/vdpa_sim/vdpa_sim.h
@@ -33,6 +33,8 @@ struct vdpasim_virtqueue {
 };
 
 struct vdpasim_dev_attr {
+	struct vdpa_mgmt_dev *mgmt_dev;
+	const char *name;
 	u64 supported_features;
 	size_t config_size;
 	size_t buffer_size;
diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim_net.c b/drivers/vdpa/vdpa_sim/vdpa_sim_net.c
index 34155831538c..b795e02bdad0 100644
--- a/drivers/vdpa/vdpa_sim/vdpa_sim_net.c
+++ b/drivers/vdpa/vdpa_sim/vdpa_sim_net.c
@@ -162,6 +162,94 @@ static void vdpasim_net_default_dev_unregister(void)
 	vdpa_unregister_device(vdpa);
 }
 
+static void vdpasim_net_mgmtdev_release(struct device *dev)
+{
+}
+
+static struct device vdpasim_net_mgmtdev = {
+	.init_name = "vdpasim_net",
+	.release = vdpasim_net_mgmtdev_release,
+};
+
+static int vdpasim_net_dev_add(struct vdpa_mgmt_dev *mdev, const char *name)
+{
+	struct vdpasim_dev_attr dev_attr = {};
+	struct vdpasim *simdev;
+	int ret;
+
+	dev_attr.mgmt_dev = mdev;
+	dev_attr.name = name;
+	dev_attr.id = VIRTIO_ID_NET;
+	dev_attr.supported_features = VDPASIM_NET_FEATURES;
+	dev_attr.nvqs = VDPASIM_NET_VQ_NUM;
+	dev_attr.config_size = sizeof(struct virtio_net_config);
+	dev_attr.get_config = vdpasim_net_get_config;
+	dev_attr.work_fn = vdpasim_net_work;
+	dev_attr.buffer_size = PAGE_SIZE;
+
+	simdev = vdpasim_create(&dev_attr);
+	if (IS_ERR(simdev))
+		return PTR_ERR(simdev);
+
+	ret = _vdpa_register_device(&simdev->vdpa);
+	if (ret)
+		goto reg_err;
+
+	return 0;
+
+reg_err:
+	put_device(&simdev->vdpa.dev);
+	return ret;
+}
+
+static void vdpasim_net_dev_del(struct vdpa_mgmt_dev *mdev,
+				struct vdpa_device *dev)
+{
+	struct vdpasim *simdev = container_of(dev, struct vdpasim, vdpa);
+
+	_vdpa_unregister_device(&simdev->vdpa);
+}
+
+static const struct vdpa_mgmtdev_ops vdpasim_net_mgmtdev_ops = {
+	.dev_add = vdpasim_net_dev_add,
+	.dev_del = vdpasim_net_dev_del
+};
+
+static struct virtio_device_id id_table[] = {
+	{ VIRTIO_ID_NET, VIRTIO_DEV_ANY_ID },
+	{ 0 },
+};
+
+static struct vdpa_mgmt_dev mgmt_dev = {
+	.device = &vdpasim_net_mgmtdev,
+	.id_table = id_table,
+	.ops = &vdpasim_net_mgmtdev_ops,
+};
+
+static int vdpasim_net_mgmtdev_init(void)
+{
+	int ret;
+
+	ret = device_register(&vdpasim_net_mgmtdev);
+	if (ret)
+		return ret;
+
+	ret = vdpa_mgmtdev_register(&mgmt_dev);
+	if (ret)
+		goto parent_err;
+	return 0;
+
+parent_err:
+	device_unregister(&vdpasim_net_mgmtdev);
+	return ret;
+}
+
+static void vdpasim_net_mgmtdev_cleanup(void)
+{
+	vdpa_mgmtdev_unregister(&mgmt_dev);
+	device_unregister(&vdpasim_net_mgmtdev);
+}
+
 static int __init vdpasim_net_init(void)
 {
 	int ret = 0;
@@ -176,6 +264,8 @@ static int __init vdpasim_net_init(void)
 
 	if (default_device)
 		ret = vdpasim_net_default_dev_register();
+	else
+		ret = vdpasim_net_mgmtdev_init();
 	return ret;
 }
 
@@ -183,6 +273,8 @@ static void __exit vdpasim_net_exit(void)
 {
 	if (default_device)
 		vdpasim_net_default_dev_unregister();
+	else
+		vdpasim_net_mgmtdev_cleanup();
 }
 
 module_init(vdpasim_net_init);
-- 
2.26.2



* Re: [PATCH linux-next v2 1/7] vdpa_sim_net: Make mac address array static
  2021-01-04  3:31   ` [PATCH linux-next v2 1/7] vdpa_sim_net: Make mac address array static Parav Pandit
@ 2021-01-04  7:00     ` Jason Wang
  0 siblings, 0 replies; 79+ messages in thread
From: Jason Wang @ 2021-01-04  7:00 UTC (permalink / raw)
  To: Parav Pandit, virtualization; +Cc: mst, elic, netdev


On 2021/1/4 上午11:31, Parav Pandit wrote:
> MAC address array is used only in vdpa_sim_net.c.
> Hence, keep it static.
>
> Signed-off-by: Parav Pandit <parav@nvidia.com>
> ---
> Changelog:
> v1->v2:
>   - new patch
> ---
>   drivers/vdpa/vdpa_sim/vdpa_sim_net.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim_net.c b/drivers/vdpa/vdpa_sim/vdpa_sim_net.c
> index c10b6981fdab..f0482427186b 100644
> --- a/drivers/vdpa/vdpa_sim/vdpa_sim_net.c
> +++ b/drivers/vdpa/vdpa_sim/vdpa_sim_net.c
> @@ -33,7 +33,7 @@ static char *macaddr;
>   module_param(macaddr, charp, 0);
>   MODULE_PARM_DESC(macaddr, "Ethernet MAC address");
>   
> -u8 macaddr_buf[ETH_ALEN];
> +static u8 macaddr_buf[ETH_ALEN];
>   
>   static struct vdpasim *vdpasim_net_dev;
>   


Acked-by: Jason Wang <jasowang@redhat.com>




* Re: [PATCH linux-next v2 4/7] vdpa: Define vdpa mgmt device, ops and a netlink interface
  2021-01-04  3:31   ` [PATCH linux-next v2 4/7] vdpa: Define vdpa mgmt device, ops and a netlink interface Parav Pandit
@ 2021-01-04  7:03     ` Jason Wang
  2021-01-04  7:24       ` Parav Pandit
  0 siblings, 1 reply; 79+ messages in thread
From: Jason Wang @ 2021-01-04  7:03 UTC (permalink / raw)
  To: Parav Pandit, virtualization; +Cc: mst, elic, netdev


On 2021/1/4 上午11:31, Parav Pandit wrote:
> To add one or more VDPA devices, define a management device which
> allows adding or removing vdpa device. A management device defines
> set of callbacks to manage vdpa devices.
>
> To begin with, it defines add and remove callbacks through which a user
> defined vdpa device can be added or removed.
>
> A unique management device is identified by its unique handle identified
> by management device name and optionally the bus name.
>
> Hence, introduce routine through which driver can register a
> management device and its callback operations for adding and remove
> a vdpa device.
>
> Introduce vdpa netlink socket family so that user can query management
> device and its attributes.
>
> Example of show vdpa management device which allows creating vdpa device of
> networking class (device id = 0x1) of virtio specification 1.1
> section 5.1.1.
>
> $ vdpa mgmtdev show
> vdpasim_net:
>    supported_classes:
>      net
>
> Example of showing vdpa management device in JSON format.
>
> $ vdpa mgmtdev show -jp
> {
>      "show": {
>          "vdpasim_net": {
>              "supported_classes": [ "net" ]
>          }
>      }
> }
>
> Signed-off-by: Parav Pandit<parav@nvidia.com>
> Reviewed-by: Eli Cohen<elic@nvidia.com>
> Reviewed-by: Jason Wang<jasowang@redhat.com>
> ---
> Changelog:
> v1->v2:
>   - rebased
>   - updated commit log example for management device name from
>     "vdpasim" to "vdpasim_net"
>   - removed device_id as net and block management devices are separated


So I wonder whether there could be a type of management devices that can 
deal with multiple types of virtio devices. If yes, we probably need to 
add device id back.

Thanks




* Re: [PATCH linux-next v2 7/7] vdpa_sim_net: Add support for user supported devices
  2021-01-04  3:31   ` [PATCH linux-next v2 7/7] vdpa_sim_net: Add support for user supported devices Parav Pandit
@ 2021-01-04  7:05     ` Jason Wang
  2021-01-04  7:21       ` Parav Pandit
  0 siblings, 1 reply; 79+ messages in thread
From: Jason Wang @ 2021-01-04  7:05 UTC (permalink / raw)
  To: Parav Pandit, virtualization; +Cc: mst, elic, netdev


On 2021/1/4 上午11:31, Parav Pandit wrote:
>   static int __init vdpasim_net_init(void)
>   {
>   	int ret = 0;
> @@ -176,6 +264,8 @@ static int __init vdpasim_net_init(void)
>   
>   	if (default_device)
>   		ret = vdpasim_net_default_dev_register();
> +	else
> +		ret = vdpasim_net_mgmtdev_init();
>   	return ret;
>   }
>   
> @@ -183,6 +273,8 @@ static void __exit vdpasim_net_exit(void)
>   {
>   	if (default_device)
>   		vdpasim_net_default_dev_unregister();
> +	else
> +		vdpasim_net_mgmtdev_cleanup();
>   }
>   
>   module_init(vdpasim_net_init);
> -- 2.26.2


I wonder what's the value of keeping the default device that is out of 
the control of management API.

Thanks



* RE: [PATCH linux-next v2 7/7] vdpa_sim_net: Add support for user supported devices
  2021-01-04  7:05     ` Jason Wang
@ 2021-01-04  7:21       ` Parav Pandit
  2021-01-05  4:06         ` Jason Wang
  0 siblings, 1 reply; 79+ messages in thread
From: Parav Pandit @ 2021-01-04  7:21 UTC (permalink / raw)
  To: Jason Wang, virtualization; +Cc: mst, Eli Cohen, netdev



> From: Jason Wang <jasowang@redhat.com>
> Sent: Monday, January 4, 2021 12:35 PM
> 
> On 2021/1/4 上午11:31, Parav Pandit wrote:
> >   static int __init vdpasim_net_init(void)
> >   {
> >   	int ret = 0;
> > @@ -176,6 +264,8 @@ static int __init vdpasim_net_init(void)
> >
> >   	if (default_device)
> >   		ret = vdpasim_net_default_dev_register();
> > +	else
> > +		ret = vdpasim_net_mgmtdev_init();
> >   	return ret;
> >   }
> >
> > @@ -183,6 +273,8 @@ static void __exit vdpasim_net_exit(void)
> >   {
> >   	if (default_device)
> >   		vdpasim_net_default_dev_unregister();
> > +	else
> > +		vdpasim_net_mgmtdev_cleanup();
> >   }
> >
> >   module_init(vdpasim_net_init);
> > -- 2.26.2
> 
> 
> I wonder what's the value of keeping the default device that is out of the
> control of management API.

I think we can remove it, as I did in v1. Actual vendor drivers like mlx5_vdpa will likely support only user-created devices.
I added it only for backward compatibility, but we can remove the default simulated vdpa net device.
What do you recommend?


* RE: [PATCH linux-next v2 4/7] vdpa: Define vdpa mgmt device, ops and a netlink interface
  2021-01-04  7:03     ` Jason Wang
@ 2021-01-04  7:24       ` Parav Pandit
  2021-01-05  4:10         ` Jason Wang
  0 siblings, 1 reply; 79+ messages in thread
From: Parav Pandit @ 2021-01-04  7:24 UTC (permalink / raw)
  To: Jason Wang, virtualization; +Cc: mst, Eli Cohen, netdev



> From: Jason Wang <jasowang@redhat.com>
> Sent: Monday, January 4, 2021 12:33 PM
> 
> On 2021/1/4 上午11:31, Parav Pandit wrote:
> > To add one or more VDPA devices, define a management device which
> > allows adding or removing vdpa device. A management device defines set
> > of callbacks to manage vdpa devices.
> >
> > To begin with, it defines add and remove callbacks through which a
> > user defined vdpa device can be added or removed.
> >
> > A unique management device is identified by its unique handle
> > identified by management device name and optionally the bus name.
> >
> > Hence, introduce routine through which driver can register a
> > management device and its callback operations for adding and remove a
> > vdpa device.
> >
> > Introduce vdpa netlink socket family so that user can query management
> > device and its attributes.
> >
> > Example of show vdpa management device which allows creating vdpa
> > device of networking class (device id = 0x1) of virtio specification
> > 1.1 section 5.1.1.
> >
> > $ vdpa mgmtdev show
> > vdpasim_net:
> >    supported_classes:
> >      net
> >
> > Example of showing vdpa management device in JSON format.
> >
> > $ vdpa mgmtdev show -jp
> > {
> >      "show": {
> >          "vdpasim_net": {
> >              "supported_classes": [ "net" ]
> >          }
> >      }
> > }
> >
> > Signed-off-by: Parav Pandit<parav@nvidia.com>
> > Reviewed-by: Eli Cohen<elic@nvidia.com>
> > Reviewed-by: Jason Wang<jasowang@redhat.com>
> > ---
> > Changelog:
> > v1->v2:
> >   - rebased
> >   - updated commit log example for management device name from
> >     "vdpasim" to "vdpasim_net"
> >   - removed device_id as net and block management devices are
> > separated
> 
> 
> So I wonder whether there could be a type of management devices that can
> deal with multiple types of virtio devices. If yes, we probably need to add
> device id back.
At this point mlx5 plans to support only net.
It is useful to see what type of vdpa device is supported by a management device.

In the future, if a mgmt dev supports multiple types, the user will need to choose the desired type.
I guess we can defer this optional type to the future, when such a mgmt device may become available.
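
To make the trade-off concrete, here is a small user-space sketch in C of how a supported-classes bitmask and an optional device type could interact. It assumes one bit per virtio device ID in the u64 mask, and the selection rule in pick_device_id() is purely hypothetical: nothing in this series defines it. VIRTIO_ID_NET (1) and VIRTIO_ID_BLOCK (2) are the device IDs from the virtio specification.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Device IDs from the virtio specification. */
#define VIRTIO_ID_NET   1
#define VIRTIO_ID_BLOCK 2

/* Build a supported-classes bitmask: bit N set means device ID N is supported. */
static uint64_t supported_classes(const uint32_t *ids, size_t n)
{
	uint64_t mask = 0;

	for (size_t i = 0; i < n; i++) {
		assert(ids[i] < 64);	/* a u64 mask can only describe IDs 0..63 */
		mask |= UINT64_C(1) << ids[i];
	}
	return mask;
}

/*
 * Hypothetical type selection for a management device that supports several
 * classes: device_id 0 means "not given", which is accepted only when the
 * choice is unambiguous.
 */
static int pick_device_id(uint64_t mask, uint32_t device_id, uint32_t *out)
{
	if (device_id) {
		if (device_id >= 64 || !(mask & (UINT64_C(1) << device_id)))
			return -EOPNOTSUPP;
		*out = device_id;
		return 0;
	}
	if (!mask || (mask & (mask - 1)))
		return -EINVAL;	/* no class or several classes: user must choose */
	for (uint32_t id = 0; id < 64; id++) {
		if (mask & (UINT64_C(1) << id)) {
			*out = id;
			return 0;
		}
	}
	return -EINVAL;	/* not reached: mask has exactly one bit set */
}
```

With a net-only mask the type can stay optional, which matches the current single-class drivers; once a mask carries more than one bit, an explicit type becomes necessary.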


* Re: [PATCH linux-next v2 7/7] vdpa_sim_net: Add support for user supported devices
  2021-01-04  7:21       ` Parav Pandit
@ 2021-01-05  4:06         ` Jason Wang
  2021-01-05  6:22           ` Parav Pandit
  0 siblings, 1 reply; 79+ messages in thread
From: Jason Wang @ 2021-01-05  4:06 UTC (permalink / raw)
  To: Parav Pandit, virtualization; +Cc: mst, Eli Cohen, netdev


On 2021/1/4 下午3:21, Parav Pandit wrote:
>
>> From: Jason Wang <jasowang@redhat.com>
>> Sent: Monday, January 4, 2021 12:35 PM
>>
>> On 2021/1/4 上午11:31, Parav Pandit wrote:
>>>    static int __init vdpasim_net_init(void)
>>>    {
>>>    	int ret = 0;
>>> @@ -176,6 +264,8 @@ static int __init vdpasim_net_init(void)
>>>
>>>    	if (default_device)
>>>    		ret = vdpasim_net_default_dev_register();
>>> +	else
>>> +		ret = vdpasim_net_mgmtdev_init();
>>>    	return ret;
>>>    }
>>>
>>> @@ -183,6 +273,8 @@ static void __exit vdpasim_net_exit(void)
>>>    {
>>>    	if (default_device)
>>>    		vdpasim_net_default_dev_unregister();
>>> +	else
>>> +		vdpasim_net_mgmtdev_cleanup();
>>>    }
>>>
>>>    module_init(vdpasim_net_init);
>>> -- 2.26.2
>>
>> I wonder what's the value of keeping the default device that is out of the
>> control of management API.
> I think we can remove it, as I did in v1. Actual vendor drivers like mlx5_vdpa will likely support only user-created devices.
> I added it only for backward compatibility, but we can remove the default simulated vdpa net device.
> What do you recommend?


I think we'd better mandate this management API. This avoids
vendor-specific configuration that may complicate the management layer.

Thanks




* Re: [PATCH linux-next v2 4/7] vdpa: Define vdpa mgmt device, ops and a netlink interface
  2021-01-04  7:24       ` Parav Pandit
@ 2021-01-05  4:10         ` Jason Wang
  2021-01-05  6:33           ` Parav Pandit
  0 siblings, 1 reply; 79+ messages in thread
From: Jason Wang @ 2021-01-05  4:10 UTC (permalink / raw)
  To: Parav Pandit, virtualization; +Cc: mst, Eli Cohen, netdev


On 2021/1/4 下午3:24, Parav Pandit wrote:
>
>> From: Jason Wang <jasowang@redhat.com>
>> Sent: Monday, January 4, 2021 12:33 PM
>>
>> On 2021/1/4 上午11:31, Parav Pandit wrote:
>>> To add one or more VDPA devices, define a management device which
>>> allows adding or removing vdpa device. A management device defines set
>>> of callbacks to manage vdpa devices.
>>>
>>> To begin with, it defines add and remove callbacks through which a
>>> user defined vdpa device can be added or removed.
>>>
>>> A unique management device is identified by its unique handle
>>> identified by management device name and optionally the bus name.
>>>
>>> Hence, introduce routine through which driver can register a
>>> management device and its callback operations for adding and remove a
>>> vdpa device.
>>>
>>> Introduce vdpa netlink socket family so that user can query management
>>> device and its attributes.
>>>
>>> Example of show vdpa management device which allows creating vdpa
>>> device of networking class (device id = 0x1) of virtio specification
>>> 1.1 section 5.1.1.
>>>
>>> $ vdpa mgmtdev show
>>> vdpasim_net:
>>>     supported_classes:
>>>       net
>>>
>>> Example of showing vdpa management device in JSON format.
>>>
>>> $ vdpa mgmtdev show -jp
>>> {
>>>       "show": {
>>>           "vdpasim_net": {
>>>               "supported_classes": [ "net" ]
>>>           }
>>>       }
>>> }
>>>
>>> Signed-off-by: Parav Pandit<parav@nvidia.com>
>>> Reviewed-by: Eli Cohen<elic@nvidia.com>
>>> Reviewed-by: Jason Wang<jasowang@redhat.com>
>>> ---
>>> Changelog:
>>> v1->v2:
>>>    - rebased
>>>    - updated commit log example for management device name from
>>>      "vdpasim" to "vdpasim_net"
>>>    - removed device_id as net and block management devices are
>>> separated
>>
>> So I wonder whether there could be a type of management devices that can
>> deal with multiple types of virtio devices. If yes, we probably need to add
>> device id back.
> At this point mlx5 plans to support only net.
> It is useful to see what type of vdpa device is supported by a management device.
>
> In the future, if a mgmt dev supports multiple types, the user needs to choose the desired type.
> I guess we can defer this optional type to the future, when such a mgmt. device will/may be available.


I worry that if we remove device_id, it may give a hint that multiple mgmt
devices need to be registered if it supports multiple types.

So if possible I would like to keep the device_id here.

Thanks




* RE: [PATCH linux-next v2 7/7] vdpa_sim_net: Add support for user supported devices
  2021-01-05  4:06         ` Jason Wang
@ 2021-01-05  6:22           ` Parav Pandit
  0 siblings, 0 replies; 79+ messages in thread
From: Parav Pandit @ 2021-01-05  6:22 UTC (permalink / raw)
  To: Jason Wang, virtualization; +Cc: mst, Eli Cohen, netdev



> From: Jason Wang <jasowang@redhat.com>
> Sent: Tuesday, January 5, 2021 9:36 AM
> 
> 
> On 2021/1/4 3:21 PM, Parav Pandit wrote:
> >
> >> From: Jason Wang <jasowang@redhat.com>
> >> Sent: Monday, January 4, 2021 12:35 PM
> >>
> >> On 2021/1/4 11:31 AM, Parav Pandit wrote:
> >>>    static int __init vdpasim_net_init(void)
> >>>    {
> >>>    	int ret = 0;
> >>> @@ -176,6 +264,8 @@ static int __init vdpasim_net_init(void)
> >>>
> >>>    	if (default_device)
> >>>    		ret = vdpasim_net_default_dev_register();
> >>> +	else
> >>> +		ret = vdpasim_net_mgmtdev_init();
> >>>    	return ret;
> >>>    }
> >>>
> >>> @@ -183,6 +273,8 @@ static void __exit vdpasim_net_exit(void)
> >>>    {
> >>>    	if (default_device)
> >>>    		vdpasim_net_default_dev_unregister();
> >>> +	else
> >>> +		vdpasim_net_mgmtdev_cleanup();
> >>>    }
> >>>
> >>>    module_init(vdpasim_net_init);
> >>> -- 2.26.2
> >>
> >> I wonder what's the value of keeping the default device that is out
> >> of the control of management API.
> > I think we can remove it, as I did in the v1 version. Actual vendor
> > drivers like mlx5_vdpa will likely support only user-created devices.
> > I added it only for backward compatibility, but we can remove the
> > default simulated vdpa net device.
> > What do you recommend?
> 
> 
> I think we'd better mandate this management API. This avoids
> vendor-specific configuration that may complicate the management layer.
> 
Sounds good.
I will drop the patch that allows vdpasim_net default device via module parameter. Will post v3 with that removal.


* RE: [PATCH linux-next v2 4/7] vdpa: Define vdpa mgmt device, ops and a netlink interface
  2021-01-05  4:10         ` Jason Wang
@ 2021-01-05  6:33           ` Parav Pandit
  2021-01-05  8:36             ` Jason Wang
  0 siblings, 1 reply; 79+ messages in thread
From: Parav Pandit @ 2021-01-05  6:33 UTC (permalink / raw)
  To: Jason Wang, virtualization; +Cc: mst, Eli Cohen, netdev



> From: Jason Wang <jasowang@redhat.com>
> Sent: Tuesday, January 5, 2021 9:40 AM
> 
> On 2021/1/4 3:24 PM, Parav Pandit wrote:
> >
> >> From: Jason Wang <jasowang@redhat.com>
> >> Sent: Monday, January 4, 2021 12:33 PM
> >>
> >> On 2021/1/4 11:31 AM, Parav Pandit wrote:
> >>> To add one or more VDPA devices, define a management device which
> >>> allows adding or removing vdpa device. A management device defines
> >>> set of callbacks to manage vdpa devices.
> >>>
> >>> To begin with, it defines add and remove callbacks through which a
> >>> user defined vdpa device can be added or removed.
> >>>
> >>> A unique management device is identified by its unique handle
> >>> identified by management device name and optionally the bus name.
> >>>
> >>> Hence, introduce routine through which driver can register a
> >>> management device and its callback operations for adding and remove
> >>> a vdpa device.
> >>>
> >>> Introduce vdpa netlink socket family so that user can query
> >>> management device and its attributes.
> >>>
> >>> Example of show vdpa management device which allows creating vdpa
> >>> device of networking class (device id = 0x1) of virtio specification
> >>> 1.1 section 5.1.1.
> >>>
> >>> $ vdpa mgmtdev show
> >>> vdpasim_net:
> >>>     supported_classes:
> >>>       net
> >>>
> >>> Example of showing vdpa management device in JSON format.
> >>>
> >>> $ vdpa mgmtdev show -jp
> >>> {
> >>>       "show": {
> >>>           "vdpasim_net": {
> >>>               "supported_classes": [ "net" ]
> >>>           }
> >>>       }
> >>> }
> >>>
> >>> Signed-off-by: Parav Pandit<parav@nvidia.com>
> >>> Reviewed-by: Eli Cohen<elic@nvidia.com>
> >>> Reviewed-by: Jason Wang<jasowang@redhat.com>
> >>> ---
> >>> Changelog:
> >>> v1->v2:
> >>>    - rebased
> >>>    - updated commit log example for management device name from
> >>>      "vdpasim" to "vdpasim_net"
> >>>    - removed device_id as net and block management devices are
> >>> separated
> >>
> >> So I wonder whether there could be a type of management devices that
> >> can deal with multiple types of virtio devices. If yes, we probably
> >> need to add device id back.
> > At this point mlx5 plans to support only net.
> > It is useful to see what type of vdpa device is supported by a management
> > device.
> >
> > In the future, if a mgmt dev supports multiple types, the user needs to
> > choose the desired type.
> > I guess we can defer this optional type to the future, when such a mgmt.
> > device will/may be available.
> 
> 
> I worry that if we remove device_id, it may give a hint that multiple mgmt
> devices need to be registered if it supports multiple types.
> 
No, it shouldn't, because we do expose multiple supported types in mgmtdev attributes.

> So if possible I would like to keep the device_id here.
> 
It's possible to keep it. But with the current drivers, mainly mlx5 and vdpa_sim, it is redundant.
Not sure of the ifc's plan.
We have been splitting modules to handle net and block differently in mlx5 as well as vdpa_sim.
So it looks to me that both may be separate management drivers (and management devices),
such as vdpasim_net and vdpasim_block.
mlx5 doesn't have a plan for block yet.


* Re: [PATCH linux-next v2 4/7] vdpa: Define vdpa mgmt device, ops and a netlink interface
  2021-01-05  6:33           ` Parav Pandit
@ 2021-01-05  8:36             ` Jason Wang
  0 siblings, 0 replies; 79+ messages in thread
From: Jason Wang @ 2021-01-05  8:36 UTC (permalink / raw)
  To: Parav Pandit, virtualization; +Cc: mst, Eli Cohen, netdev


On 2021/1/5 2:33 PM, Parav Pandit wrote:
>
>> From: Jason Wang <jasowang@redhat.com>
>> Sent: Tuesday, January 5, 2021 9:40 AM
>>
>> On 2021/1/4 3:24 PM, Parav Pandit wrote:
>>>> From: Jason Wang <jasowang@redhat.com>
>>>> Sent: Monday, January 4, 2021 12:33 PM
>>>>
>>>> On 2021/1/4 11:31 AM, Parav Pandit wrote:
>>>>> To add one or more VDPA devices, define a management device which
>>>>> allows adding or removing vdpa device. A management device defines
>>>>> set of callbacks to manage vdpa devices.
>>>>>
>>>>> To begin with, it defines add and remove callbacks through which a
>>>>> user defined vdpa device can be added or removed.
>>>>>
>>>>> A unique management device is identified by its unique handle
>>>>> identified by management device name and optionally the bus name.
>>>>>
>>>>> Hence, introduce routine through which driver can register a
>>>>> management device and its callback operations for adding and remove
>>>>> a vdpa device.
>>>>>
>>>>> Introduce vdpa netlink socket family so that user can query
>>>>> management device and its attributes.
>>>>>
>>>>> Example of show vdpa management device which allows creating vdpa
>>>>> device of networking class (device id = 0x1) of virtio specification
>>>>> 1.1 section 5.1.1.
>>>>>
>>>>> $ vdpa mgmtdev show
>>>>> vdpasim_net:
>>>>>      supported_classes:
>>>>>        net
>>>>>
>>>>> Example of showing vdpa management device in JSON format.
>>>>>
>>>>> $ vdpa mgmtdev show -jp
>>>>> {
>>>>>        "show": {
>>>>>            "vdpasim_net": {
>>>>>                "supported_classes": [ "net" ]
>>>>>            }
>>>>>        }
>>>>> }
>>>>>
>>>>> Signed-off-by: Parav Pandit<parav@nvidia.com>
>>>>> Reviewed-by: Eli Cohen<elic@nvidia.com>
>>>>> Reviewed-by: Jason Wang<jasowang@redhat.com>
>>>>> ---
>>>>> Changelog:
>>>>> v1->v2:
>>>>>     - rebased
>>>>>     - updated commit log example for management device name from
>>>>>       "vdpasim" to "vdpasim_net"
>>>>>     - removed device_id as net and block management devices are
>>>>> separated
>>>> So I wonder whether there could be a type of management devices that
>>>> can deal with multiple types of virtio devices. If yes, we probably
>>>> need to add device id back.
>>> At this point mlx5 plans to support only net.
>>> It is useful to see what type of vdpa device is supported by a management
>>> device.
>>> In the future, if a mgmt dev supports multiple types, the user needs to
>>> choose the desired type.
>>> I guess we can defer this optional type to the future, when such a mgmt.
>>> device will/may be available.
>>
>>
>> I worry that if we remove device_id, it may give a hint that multiple mgmt
>> devices need to be registered if it supports multiple types.
>>
> No, it shouldn't, because we do expose multiple supported types in mgmtdev attributes.


Right.


>
>> So if possible I would like to keep the device_id here.
>>
> It's possible to keep it. But with the current drivers, mainly mlx5 and vdpa_sim, it is redundant.
> Not sure of the ifc's plan.
> We have been splitting modules to handle net and block differently in mlx5 as well as vdpa_sim.
> So it looks to me that both may be separate management drivers (and management devices),
> such as vdpasim_net and vdpasim_block.
> mlx5 doesn't have a plan for block yet.


Ok. Then it's fine.

Thanks




* [PATCH linux-next v3 0/6] Introduce vdpa management tool
  2020-11-12  6:39 [PATCH 0/7] Introduce vdpa management tool Parav Pandit
                   ` (11 preceding siblings ...)
  2021-01-04  3:31 ` [PATCH linux-next v2 " Parav Pandit
@ 2021-01-05 10:31 ` Parav Pandit
  2021-01-05 10:31   ` [PATCH linux-next v3 1/6] vdpa_sim_net: Make mac address array static Parav Pandit
                     ` (5 more replies)
  12 siblings, 6 replies; 79+ messages in thread
From: Parav Pandit @ 2021-01-05 10:31 UTC (permalink / raw)
  To: virtualization; +Cc: mst, jasowang, parav, elic, netdev

This patchset covers user requirements for managing vdpa devices using a
tool, and the internal design notes for kernel drivers.

Background and user requirements:
----------------------------------
(1) Currently a VDPA device is created by the driver when the driver is loaded.
However, the user should have a choice of when to create, or not create, a
vdpa device for the underlying management device.

For example, mlx5 PCI VF and subfunction devices support multiple classes of
device such as netdev, vdpa, rdma. However, it is not required to always
create a vdpa device for such a device.

(2) In another use case, a device may support creating one or multiple vdpa
devices of the same or different class such as net and block.
Creating vdpa devices at driver load time further limits this use case.

(3) A user should be able to monitor and query vdpa queue level or device
level statistics for a given vdpa device.

(4) A user should be able to query what class of vdpa devices are supported
by its management device.

(5) A user should be able to view supported features and negotiated
features of the vdpa device.

(6) A user should be able to create a vdpa device in a vendor-agnostic
manner using a single tool.

Hence, a tool is required through which a user can create one or more vdpa
devices from a management device, addressing the above user
requirements.

Example devices:
----------------
 +-----------+ +-----------+ +---------+ +--------+ +-----------+ 
 |vdpa dev 0 | |vdpa dev 1 | |rdma dev | |netdev  | |vdpa dev 3 |
 |type=net   | |type=net   | |mlx5_0   | |ens3f0  | |type=net   |
 +----+------+ +-----+-----+ +----+----+ +-----+--+ +----+------+
      |              |            |            |         |
      |              |            |            |         |
 +----+-----+        |       +----+----+       |    +----+----+
 |  mlx5    +--------+       |mlx5     +-------+    |mlx5     |
 |pci vf 2  |                |pci vf 4 |            |pci sf 8 |
 |03:00.2   |                |03:00.4  |            |mlx5_sf.8|
 +----+-----+                +----+----+            +----+----+
      |                           |                      |
      |                      +----+-----+                |
      +----------------------+mlx5      +----------------+
                             |pci pf 0  |
                             |03:00.0   |
                             +----------+

vdpa tool:
----------
The vdpa tool creates and deletes vdpa devices on a management device.
It also enables the user to query statistics and features, and possibly
more attributes in the future.

vdpa tool command draft:
------------------------
(a) List management devices which support creating vdpa devices.
It also shows which class types are supported by each management device.
In the command example below, four management devices support vdpa device
creation.

First is the simulated vdpasim_net management device.
Second is a PCI VF whose BDF is 03:00.2.
Third is a PCI VF whose BDF is 03:00.4.
Fourth is a PCI SF whose name is mlx5_core.sf.8.

$ vdpa mgmtdev list
vdpasim_net
  supported_classes
    net
pci/0000:03:00.2
  supported_classes
    net
pci/0000:03:00.4
  supported_classes
    net
auxiliary/mlx5_core.sf.8
  supported_classes
    net

(b) Now add a vdpa device of networking class and show the device.
$ vdpa dev add mgmtdev pci/0000:03:00.2 name foo0

$ vdpa dev show foo0
foo0: mgmtdev pci/0000:03:00.2 type network vendor_id 0 max_vqs 2 max_vq_size 256

(c) Show features of a vdpa device
$ vdpa dev features show foo0
supported
  iommu platform
  version 1

(d) Dump vdpa device statistics
$ vdpa dev stats show foo0
kickdoorbells 10
wqes 100

(e) Now delete a vdpa device previously created.
$ vdpa dev del foo0
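
The management device handles used in the draft commands have the form
"name" (e.g. vdpasim_net) or "bus/name" (e.g. auxiliary/mlx5_core.sf.8).
A minimal userspace sketch of splitting such a handle follows. It assumes
the first '/' separates the bus from the device name; struct mgmtdev_handle
and parse_mgmtdev_handle() are hypothetical names for illustration and are
not taken from the vdpa tool sources.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical holder for a parsed handle; not part of the vdpa tool. */
struct mgmtdev_handle {
	char bus[32];	/* empty string when the handle has no bus part */
	char name[64];
};

/*
 * Split "bus/name" or "name" into its parts.
 * Returns 0 on success, -1 if a part is empty or does not fit.
 */
static int parse_mgmtdev_handle(const char *str, struct mgmtdev_handle *h)
{
	const char *slash;
	const char *name;
	size_t bus_len = 0;

	assert(str && h);

	slash = strchr(str, '/');
	name = str;
	if (slash) {
		bus_len = (size_t)(slash - str);
		name = slash + 1;
		if (bus_len == 0)
			return -1;	/* "/name": bus part is empty */
	}
	if (bus_len >= sizeof(h->bus))
		return -1;
	if (*name == '\0' || strlen(name) >= sizeof(h->name))
		return -1;

	memcpy(h->bus, str, bus_len);
	h->bus[bus_len] = '\0';
	strcpy(h->name, name);	/* length checked above */
	return 0;
}
```

With this sketch, "vdpasim_net" yields an empty bus and the name
"vdpasim_net", while "auxiliary/mlx5_core.sf.8" yields bus "auxiliary" and
name "mlx5_core.sf.8".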

vdpa tool support in this patchset:
-----------------------------------
The vdpa tool creates, deletes and queries vdpa devices.
Examples:
Show the vdpa management device that supports creating and deleting vdpa devices.

$ vdpa mgmtdev show
vdpasim_net:
  supported_classes
    net

$ vdpa mgmtdev show -jp
{
    "show": {
        "vdpasim_net": {
            "supported_classes": [ "net" ]
        }
    }
}

Create a vdpa device of type networking named as "foo2" from the
management device vdpasim_net:

$ vdpa dev add mgmtdev vdpasim_net name foo2

Show the newly created vdpa device by its name:
$ vdpa dev show foo2
foo2: type network mgmtdev vdpasim_net vendor_id 0 max_vqs 2 max_vq_size 256

$ vdpa dev show foo2 -jp
{
    "dev": {
        "foo2": {
            "type": "network",
            "mgmtdev": "vdpasim_net",
            "vendor_id": 0,
            "max_vqs": 2,
            "max_vq_size": 256
        }
    }
}

Delete the vdpa device after its use:
$ vdpa dev del foo2

vdpa tool support by kernel:
----------------------------
The vdpa tool user interface is supported by the existing vdpa kernel framework,
i.e. drivers/vdpa/vdpa.c. It services user commands through a netlink interface.

Each management device registers supported callback operations with vdpa
subsystem through which vdpa device(s) can be managed.
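
As a rough illustration of the registration step, the userspace sketch
below keeps a list of management devices and rejects one that does not
supply both add and delete callbacks, similar in spirit to
vdpa_mgmtdev_register() in patch 3. The struct layout, the callback
signatures and the sim_* helpers are assumptions made for this example;
the kernel code additionally serializes list updates with a mutex, which
this single-threaded sketch omits.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct mgmt_dev;

/* Callbacks a management device must supply (signatures are assumed). */
struct mgmt_dev_ops {
	int (*dev_add)(struct mgmt_dev *mdev, const char *name);
	void (*dev_del)(struct mgmt_dev *mdev, const char *name);
};

struct mgmt_dev {
	const char *name;
	const struct mgmt_dev_ops *ops;
	struct mgmt_dev *next;	/* singly linked registry, for brevity */
};

static struct mgmt_dev *registry;

/* Reject a management device that cannot both add and delete devices. */
static int mgmt_dev_register(struct mgmt_dev *mdev)
{
	if (!mdev || !mdev->name || !mdev->ops ||
	    !mdev->ops->dev_add || !mdev->ops->dev_del)
		return -EINVAL;

	mdev->next = registry;
	registry = mdev;
	return 0;
}

/* Unlink a previously registered management device, if present. */
static void mgmt_dev_unregister(struct mgmt_dev *mdev)
{
	struct mgmt_dev **pp;

	assert(mdev);
	for (pp = &registry; *pp; pp = &(*pp)->next) {
		if (*pp == mdev) {
			*pp = mdev->next;
			mdev->next = NULL;
			return;
		}
	}
}

/* Example callbacks for a hypothetical simulated management device. */
static int sim_devices;

static int sim_dev_add(struct mgmt_dev *mdev, const char *name)
{
	(void)mdev;
	(void)name;
	sim_devices++;
	return 0;
}

static void sim_dev_del(struct mgmt_dev *mdev, const char *name)
{
	(void)mdev;
	(void)name;
	sim_devices--;
}

static const struct mgmt_dev_ops sim_ops = {
	.dev_add = sim_dev_add,
	.dev_del = sim_dev_del,
};
```

A device initialized with .name = "vdpasim_net" and .ops = &sim_ops
registers successfully, while one whose ops lack dev_del is refused with
-EINVAL.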

Patch summary:
--------------
Patch-1 Makes mac address array static
Patch-2 Extends API to accept vdpa device name
Patch-3 Defines management device interface
Patch-4 Extends netlink interface to add, delete vdpa devices
Patch-5 Extends netlink interface to query vdpa device attributes
Patch-6 Extends vdpa_sim_net driver to add/delete simulated vdpa devices

Changelog:
----------
v2->v3:
 - removed default device module param patch
 - removed code branches due to removal of default device module param
   patch
 - removed two merged patches from v1
 - added patch to make mac address static
v1->v2:
 - rebased
 - moved code from vdpasim to vdpa_sim_net module as code is split
   between two modules
 - removed device_id field during device create as its not used
   currently
 - updated examples in commit log for management device name and
   device_id removal
 - changed parentdev to mgmtdev as tool reflects management
   functionality

FAQs:
-----
1. Where does userspace vdpa tool reside which users can use?
Ans: vdpa tool can possibly reside in iproute2 [1] as it enables user to
handle vdpa network devices.

2. Why not create and delete vdpa device using sysfs/configfs?
Ans:
(a) A device creation may involve passing one or more attributes.
Passing multiple attributes and returning error code and more verbose
information for invalid attributes cannot be handled by sysfs/configfs.

(b) The netlink framework is rich and enables user space and kernel drivers
to exchange nested attributes.

(c) Exposing device specific file under sysfs without net namespace
awareness exposes details to multiple containers. Instead exposing
attributes via a netlink socket secures the communication channel with kernel.

(d) The netlink socket interface makes it possible to run syzkaller kernel tests.
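
To make the "nested attributes" point in (b) concrete, here is a small
userspace sketch of a type-length-value encoder in the style of netlink
attributes: a 16-bit length, a 16-bit type, and a payload padded to a
4-byte boundary. The header has the same shape as the kernel's
struct nlattr, but every name below (attr_hdr, attr_buf, attr_put, ...)
is defined locally for illustration, and the attribute type numbers used
with it are arbitrary; this is not the kernel or libmnl API.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define ATTR_ALIGNTO ((size_t)4)
#define ATTR_ALIGN(len) (((len) + ATTR_ALIGNTO - 1) & ~(ATTR_ALIGNTO - 1))

/* Same shape as the kernel's struct nlattr: length, then type. */
struct attr_hdr {
	uint16_t len;	/* header plus payload, excluding trailing padding */
	uint16_t type;
};

struct attr_buf {
	unsigned char data[256];
	size_t used;
};

/* Append one attribute; returns its offset in the buffer, or -1 if full. */
static int attr_put(struct attr_buf *b, uint16_t type,
		    const void *payload, size_t plen)
{
	struct attr_hdr hdr;
	size_t start, total;

	assert(b);
	start = b->used;
	total = sizeof(hdr) + plen;
	if (total > UINT16_MAX || start + ATTR_ALIGN(total) > sizeof(b->data))
		return -1;

	hdr.len = (uint16_t)total;
	hdr.type = type;
	memcpy(b->data + start, &hdr, sizeof(hdr));
	if (plen)
		memcpy(b->data + start + sizeof(hdr), payload, plen);
	/* Zero the padding so the buffer content is deterministic. */
	memset(b->data + start + total, 0, ATTR_ALIGN(total) - total);
	b->used = start + ATTR_ALIGN(total);
	return (int)start;
}

/* Open a nested attribute: a bare header whose length is patched later. */
static int attr_nest_start(struct attr_buf *b, uint16_t type)
{
	return attr_put(b, type, NULL, 0);
}

/* Close a nested attribute: its length covers everything added since. */
static void attr_nest_end(struct attr_buf *b, int start)
{
	struct attr_hdr hdr;

	memcpy(&hdr, b->data + start, sizeof(hdr));
	hdr.len = (uint16_t)(b->used - (size_t)start);
	memcpy(b->data + start, &hdr, sizeof(hdr));
}
```

A caller opens a nest, adds string or integer attributes inside it, and
closes the nest; the receiver can then skip or descend into the nest using
only the length fields, which is hard to express with one-value-per-file
sysfs attributes.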

3. Why not use ioctl() interface?
Ans: ioctl() interface replicates the necessary plumbing which already
exists through netlink socket.

4. What happens when one or more user created vdpa devices exist for a
management PCI VF or SF and such management device is removed?
Ans: All user created vdpa devices that belong to that management device
are removed.

[1] git://git.kernel.org/pub/scm/network/iproute2/iproute2-next.git

Next steps:
-----------
(a) After this patchset and the iproute2/vdpa inclusion, the remaining two
drivers will be converted to support the vdpa tool instead of creating an
unmanaged default device on driver load.
(b) More net specific parameters such as mac, mtu will be added.
(c) Features bits get and set interface will be added.

Parav Pandit (6):
  vdpa_sim_net: Make mac address array static
  vdpa: Extend routine to accept vdpa device name
  vdpa: Define vdpa mgmt device, ops and a netlink interface
  vdpa: Enable a user to add and delete a vdpa device
  vdpa: Enable user to query vdpa device info
  vdpa_sim_net: Add support for user supported devices

 drivers/vdpa/Kconfig                 |   1 +
 drivers/vdpa/ifcvf/ifcvf_main.c      |   2 +-
 drivers/vdpa/mlx5/net/mlx5_vnet.c    |   2 +-
 drivers/vdpa/vdpa.c                  | 503 ++++++++++++++++++++++++++-
 drivers/vdpa/vdpa_sim/vdpa_sim.c     |   3 +-
 drivers/vdpa/vdpa_sim/vdpa_sim.h     |   2 +
 drivers/vdpa/vdpa_sim/vdpa_sim_net.c |  98 ++++--
 include/linux/vdpa.h                 |  44 ++-
 include/uapi/linux/vdpa.h            |  40 +++
 9 files changed, 657 insertions(+), 38 deletions(-)
 create mode 100644 include/uapi/linux/vdpa.h

-- 
2.26.2



* [PATCH linux-next v3 1/6] vdpa_sim_net: Make mac address array static
  2021-01-05 10:31 ` [PATCH linux-next v3 0/6] Introduce vdpa management tool Parav Pandit
@ 2021-01-05 10:31   ` Parav Pandit
  2021-01-07 13:45     ` Stefano Garzarella
  2021-01-05 10:31   ` [PATCH linux-next v3 2/6] vdpa: Extend routine to accept vdpa device name Parav Pandit
                     ` (4 subsequent siblings)
  5 siblings, 1 reply; 79+ messages in thread
From: Parav Pandit @ 2021-01-05 10:31 UTC (permalink / raw)
  To: virtualization; +Cc: mst, jasowang, parav, elic, netdev

MAC address array is used only in vdpa_sim_net.c.
Hence, keep it static.

Signed-off-by: Parav Pandit <parav@nvidia.com>
Acked-by: Jason Wang <jasowang@redhat.com>
---
Changelog:
v1->v2:
 - new patch
---
 drivers/vdpa/vdpa_sim/vdpa_sim_net.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim_net.c b/drivers/vdpa/vdpa_sim/vdpa_sim_net.c
index c10b6981fdab..f0482427186b 100644
--- a/drivers/vdpa/vdpa_sim/vdpa_sim_net.c
+++ b/drivers/vdpa/vdpa_sim/vdpa_sim_net.c
@@ -33,7 +33,7 @@ static char *macaddr;
 module_param(macaddr, charp, 0);
 MODULE_PARM_DESC(macaddr, "Ethernet MAC address");
 
-u8 macaddr_buf[ETH_ALEN];
+static u8 macaddr_buf[ETH_ALEN];
 
 static struct vdpasim *vdpasim_net_dev;
 
-- 
2.26.2



* [PATCH linux-next v3 2/6] vdpa: Extend routine to accept vdpa device name
  2021-01-05 10:31 ` [PATCH linux-next v3 0/6] Introduce vdpa management tool Parav Pandit
  2021-01-05 10:31   ` [PATCH linux-next v3 1/6] vdpa_sim_net: Make mac address array static Parav Pandit
@ 2021-01-05 10:31   ` Parav Pandit
  2021-01-05 10:32   ` [PATCH linux-next v3 3/6] vdpa: Define vdpa mgmt device, ops and a netlink interface Parav Pandit
                     ` (3 subsequent siblings)
  5 siblings, 0 replies; 79+ messages in thread
From: Parav Pandit @ 2021-01-05 10:31 UTC (permalink / raw)
  To: virtualization; +Cc: mst, jasowang, parav, elic, netdev

In a subsequent patch, when a user initiated command creates a vdpa device,
the user chooses the name of the vdpa device.
To support this, extend the device allocation API to take a name
specified by the calling driver.
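
The naming rule introduced here can be sketched in userspace as follows:
use the caller's name when one is given, fall back to "vdpa<index>"
otherwise, and refuse a name that is already registered with -EEXIST.
The fixed-size table and the helper names are hypothetical; the kernel
patch below uses dev_set_name(), an IDA-allocated index and
bus_find_device() instead, while this sketch simply uses the current
device count as the index.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

#define MAX_DEVS 8
#define NAME_LEN 32

static char dev_names[MAX_DEVS][NAME_LEN];
static unsigned int dev_count;

/* Use the caller's name when given, else fall back to "vdpa<index>". */
static int choose_name(char *out, size_t len, const char *name,
		       unsigned int index)
{
	int n;

	assert(out && len > 0);
	if (name)
		n = snprintf(out, len, "%s", name);
	else
		n = snprintf(out, len, "vdpa%u", index);
	return (n < 0 || (size_t)n >= len) ? -ENAMETOOLONG : 0;
}

/* Register a device name, refusing duplicates with -EEXIST. */
static int register_name(const char *name)
{
	char chosen[NAME_LEN];
	unsigned int i;
	int err;

	err = choose_name(chosen, sizeof(chosen), name, dev_count);
	if (err)
		return err;

	for (i = 0; i < dev_count; i++) {
		if (strcmp(dev_names[i], chosen) == 0)
			return -EEXIST;
	}
	if (dev_count == MAX_DEVS)
		return -ENOSPC;

	strcpy(dev_names[dev_count++], chosen);	/* fits: checked above */
	return 0;
}
```

Doing the duplicate check and the insertion in one function mirrors why
the patch holds a single mutex across the lookup and device_add(): the
two steps must not be interleaved with another registration.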

Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Eli Cohen <elic@nvidia.com>
Acked-by: Jason Wang <jasowang@redhat.com>
---
Changelog:
v1->v2:
 - rebased
---
 drivers/vdpa/ifcvf/ifcvf_main.c   |  2 +-
 drivers/vdpa/mlx5/net/mlx5_vnet.c |  2 +-
 drivers/vdpa/vdpa.c               | 36 +++++++++++++++++++++++++++----
 drivers/vdpa/vdpa_sim/vdpa_sim.c  |  2 +-
 include/linux/vdpa.h              |  7 +++---
 5 files changed, 38 insertions(+), 11 deletions(-)

diff --git a/drivers/vdpa/ifcvf/ifcvf_main.c b/drivers/vdpa/ifcvf/ifcvf_main.c
index fa1af301cf55..7c8bbfcf6c3e 100644
--- a/drivers/vdpa/ifcvf/ifcvf_main.c
+++ b/drivers/vdpa/ifcvf/ifcvf_main.c
@@ -432,7 +432,7 @@ static int ifcvf_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 
 	adapter = vdpa_alloc_device(struct ifcvf_adapter, vdpa,
 				    dev, &ifc_vdpa_ops,
-				    IFCVF_MAX_QUEUE_PAIRS * 2);
+				    IFCVF_MAX_QUEUE_PAIRS * 2, NULL);
 	if (adapter == NULL) {
 		IFCVF_ERR(pdev, "Failed to allocate vDPA structure");
 		return -ENOMEM;
diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c
index 81b932f72e10..5920290521cf 100644
--- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
+++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
@@ -1946,7 +1946,7 @@ void *mlx5_vdpa_add_dev(struct mlx5_core_dev *mdev)
 	max_vqs = min_t(u32, max_vqs, MLX5_MAX_SUPPORTED_VQS);
 
 	ndev = vdpa_alloc_device(struct mlx5_vdpa_net, mvdev.vdev, mdev->device, &mlx5_vdpa_ops,
-				 2 * mlx5_vdpa_max_qps(max_vqs));
+				 2 * mlx5_vdpa_max_qps(max_vqs), NULL);
 	if (IS_ERR(ndev))
 		return ndev;
 
diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c
index c0825650c055..7414bbd9057c 100644
--- a/drivers/vdpa/vdpa.c
+++ b/drivers/vdpa/vdpa.c
@@ -12,6 +12,8 @@
 #include <linux/slab.h>
 #include <linux/vdpa.h>
 
+/* A global mutex that protects vdpa management device and device level operations. */
+static DEFINE_MUTEX(vdpa_dev_mutex);
 static DEFINE_IDA(vdpa_index_ida);
 
 static int vdpa_dev_probe(struct device *d)
@@ -63,6 +65,7 @@ static void vdpa_release_dev(struct device *d)
  * @config: the bus operations that is supported by this device
  * @nvqs: number of virtqueues supported by this device
  * @size: size of the parent structure that contains private data
+ * @name: name of the vdpa device; optional.
  *
  * Driver should use vdpa_alloc_device() wrapper macro instead of
  * using this directly.
@@ -72,8 +75,7 @@ static void vdpa_release_dev(struct device *d)
  */
 struct vdpa_device *__vdpa_alloc_device(struct device *parent,
 					const struct vdpa_config_ops *config,
-					int nvqs,
-					size_t size)
+					int nvqs, size_t size, const char *name)
 {
 	struct vdpa_device *vdev;
 	int err = -EINVAL;
@@ -101,7 +103,10 @@ struct vdpa_device *__vdpa_alloc_device(struct device *parent,
 	vdev->features_valid = false;
 	vdev->nvqs = nvqs;
 
-	err = dev_set_name(&vdev->dev, "vdpa%u", vdev->index);
+	if (name)
+		err = dev_set_name(&vdev->dev, "%s", name);
+	else
+		err = dev_set_name(&vdev->dev, "vdpa%u", vdev->index);
 	if (err)
 		goto err_name;
 
@@ -118,6 +123,13 @@ struct vdpa_device *__vdpa_alloc_device(struct device *parent,
 }
 EXPORT_SYMBOL_GPL(__vdpa_alloc_device);
 
+static int vdpa_name_match(struct device *dev, const void *data)
+{
+	struct vdpa_device *vdev = container_of(dev, struct vdpa_device, dev);
+
+	return (strcmp(dev_name(&vdev->dev), data) == 0);
+}
+
 /**
  * vdpa_register_device - register a vDPA device
  * Callers must have a succeed call of vdpa_alloc_device() before.
@@ -127,7 +139,21 @@ EXPORT_SYMBOL_GPL(__vdpa_alloc_device);
  */
 int vdpa_register_device(struct vdpa_device *vdev)
 {
-	return device_add(&vdev->dev);
+	struct device *dev;
+	int err;
+
+	mutex_lock(&vdpa_dev_mutex);
+	dev = bus_find_device(&vdpa_bus, NULL, dev_name(&vdev->dev), vdpa_name_match);
+	if (dev) {
+		put_device(dev);
+		err = -EEXIST;
+		goto name_err;
+	}
+
+	err = device_add(&vdev->dev);
+name_err:
+	mutex_unlock(&vdpa_dev_mutex);
+	return err;
 }
 EXPORT_SYMBOL_GPL(vdpa_register_device);
 
@@ -137,7 +163,9 @@ EXPORT_SYMBOL_GPL(vdpa_register_device);
  */
 void vdpa_unregister_device(struct vdpa_device *vdev)
 {
+	mutex_lock(&vdpa_dev_mutex);
 	device_unregister(&vdev->dev);
+	mutex_unlock(&vdpa_dev_mutex);
 }
 EXPORT_SYMBOL_GPL(vdpa_unregister_device);
 
diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.c b/drivers/vdpa/vdpa_sim/vdpa_sim.c
index b3fcc67bfdf0..db1636a99ba4 100644
--- a/drivers/vdpa/vdpa_sim/vdpa_sim.c
+++ b/drivers/vdpa/vdpa_sim/vdpa_sim.c
@@ -235,7 +235,7 @@ struct vdpasim *vdpasim_create(struct vdpasim_dev_attr *dev_attr)
 		ops = &vdpasim_config_ops;
 
 	vdpasim = vdpa_alloc_device(struct vdpasim, vdpa, NULL, ops,
-				    dev_attr->nvqs);
+				    dev_attr->nvqs, NULL);
 	if (!vdpasim)
 		goto err_alloc;
 
diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h
index 0fefeb976877..5700baa22356 100644
--- a/include/linux/vdpa.h
+++ b/include/linux/vdpa.h
@@ -245,15 +245,14 @@ struct vdpa_config_ops {
 
 struct vdpa_device *__vdpa_alloc_device(struct device *parent,
 					const struct vdpa_config_ops *config,
-					int nvqs,
-					size_t size);
+					int nvqs, size_t size, const char *name);
 
-#define vdpa_alloc_device(dev_struct, member, parent, config, nvqs)   \
+#define vdpa_alloc_device(dev_struct, member, parent, config, nvqs, name)   \
 			  container_of(__vdpa_alloc_device( \
 				       parent, config, nvqs, \
 				       sizeof(dev_struct) + \
 				       BUILD_BUG_ON_ZERO(offsetof( \
-				       dev_struct, member))), \
+				       dev_struct, member)), name), \
 				       dev_struct, member)
 
 int vdpa_register_device(struct vdpa_device *vdev);
-- 
2.26.2



* [PATCH linux-next v3 3/6] vdpa: Define vdpa mgmt device, ops and a netlink interface
  2021-01-05 10:31 ` [PATCH linux-next v3 0/6] Introduce vdpa management tool Parav Pandit
  2021-01-05 10:31   ` [PATCH linux-next v3 1/6] vdpa_sim_net: Make mac address array static Parav Pandit
  2021-01-05 10:31   ` [PATCH linux-next v3 2/6] vdpa: Extend routine to accept vdpa device name Parav Pandit
@ 2021-01-05 10:32   ` Parav Pandit
  2021-01-05 10:32   ` [PATCH linux-next v3 4/6] vdpa: Enable a user to add and delete a vdpa device Parav Pandit
                     ` (2 subsequent siblings)
  5 siblings, 0 replies; 79+ messages in thread
From: Parav Pandit @ 2021-01-05 10:32 UTC (permalink / raw)
  To: virtualization; +Cc: mst, jasowang, parav, elic, netdev

To add one or more VDPA devices, define a management device which
allows adding or removing a vdpa device. A management device defines
a set of callbacks to manage vdpa devices.

To begin with, it defines add and remove callbacks through which a user
defined vdpa device can be added or removed.

Each management device is identified by a unique handle consisting of
the management device name and, optionally, the bus name.
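
The matching rule behind such a handle can be written compactly. The
sketch below is a userspace restatement of what mgmtdev_handle_match() in
the diff further down checks: a request matches only when both sides agree
on whether a bus name is present, the bus names are equal when present,
and the device names are equal. struct mgmt_handle and handle_match() are
illustrative names, not kernel symbols.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a registered management device. */
struct mgmt_handle {
	const char *bus;	/* NULL for a device without a bus, e.g. a simulator */
	const char *name;
};

/*
 * A request matches only if both sides agree on whether a bus is present,
 * the bus names (when present) are equal, and the device names are equal.
 */
static bool handle_match(const struct mgmt_handle *dev,
			 const char *busname, const char *devname)
{
	assert(dev && dev->name && devname);

	if ((busname != NULL) != (dev->bus != NULL))
		return false;
	if (busname && strcmp(dev->bus, busname) != 0)
		return false;
	return strcmp(dev->name, devname) == 0;
}
```

Under this rule a bus-less device named "vdpasim_net" is not matched by a
request that supplies a bus name, and a device on a bus is not matched by
a request that omits it.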

Hence, introduce a routine through which a driver can register a
management device and its callback operations for adding and removing
a vdpa device.

Introduce vdpa netlink socket family so that user can query management
device and its attributes.

Example of showing a vdpa management device which allows creating a vdpa
device of the networking class (device id = 0x1) of virtio specification 1.1
section 5.1.1.

$ vdpa mgmtdev show
vdpasim_net:
  supported_classes:
    net

Example of showing vdpa management device in JSON format.

$ vdpa mgmtdev show -jp
{
    "show": {
        "vdpasim_net": {
            "supported_classes": [ "net" ]
        }
    }
}
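
The supported_classes value is reported by the kernel side as a bitmask
with one bit per virtio device id, which the tool then prints as names.
A minimal userspace sketch of that encoding follows, assuming a
zero-terminated id list like the id_table walked in vdpa_mgmtdev_fill()
below. The two id constants are the values from the virtio specification
(1 for network, 2 for block) and are defined locally; classes_to_mask()
and class_name() are hypothetical helpers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Device ids from the virtio specification: 1 is network, 2 is block. */
#define VIRTIO_ID_NET	1
#define VIRTIO_ID_BLOCK	2

/* Fold a zero-terminated list of device ids into a bitmask, one bit per id. */
static uint64_t classes_to_mask(const unsigned int *ids)
{
	uint64_t mask = 0;
	size_t i;

	assert(ids);
	for (i = 0; ids[i]; i++) {
		if (ids[i] < 64)	/* ids beyond the mask width are skipped */
			mask |= UINT64_C(1) << ids[i];
	}
	return mask;
}

/* Map a device id to the class name printed in the examples above. */
static const char *class_name(unsigned int id)
{
	switch (id) {
	case VIRTIO_ID_NET:
		return "net";
	case VIRTIO_ID_BLOCK:
		return "block";
	default:
		return "unknown";
	}
}
```

A management device that lists only the network id therefore reports the
mask 0x2, and one that lists both network and block would report 0x6.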

Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Eli Cohen <elic@nvidia.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
---
Changelog:
v1->v2:
 - rebased
 - updated commit log example for management device name from
   "vdpasim" to "vdpasim_net"
 - removed device_id as net and block management devices are separated
 - dev_add() return type is changed from struct vdpa_device to int
---
 drivers/vdpa/Kconfig      |   1 +
 drivers/vdpa/vdpa.c       | 213 +++++++++++++++++++++++++++++++++++++-
 include/linux/vdpa.h      |  31 ++++++
 include/uapi/linux/vdpa.h |  31 ++++++
 4 files changed, 275 insertions(+), 1 deletion(-)
 create mode 100644 include/uapi/linux/vdpa.h

diff --git a/drivers/vdpa/Kconfig b/drivers/vdpa/Kconfig
index 92a6396f8a73..ffd1e098bfd2 100644
--- a/drivers/vdpa/Kconfig
+++ b/drivers/vdpa/Kconfig
@@ -1,6 +1,7 @@
 # SPDX-License-Identifier: GPL-2.0-only
 menuconfig VDPA
 	tristate "vDPA drivers"
+	depends on NET
 	help
 	  Enable this module to support vDPA device that uses a
 	  datapath which complies with virtio specifications with
diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c
index 7414bbd9057c..319d09709dfc 100644
--- a/drivers/vdpa/vdpa.c
+++ b/drivers/vdpa/vdpa.c
@@ -11,11 +11,17 @@
 #include <linux/idr.h>
 #include <linux/slab.h>
 #include <linux/vdpa.h>
+#include <uapi/linux/vdpa.h>
+#include <net/genetlink.h>
+#include <linux/mod_devicetable.h>
 
+static LIST_HEAD(mdev_head);
 /* A global mutex that protects vdpa management device and device level operations. */
 static DEFINE_MUTEX(vdpa_dev_mutex);
 static DEFINE_IDA(vdpa_index_ida);
 
+static struct genl_family vdpa_nl_family;
+
 static int vdpa_dev_probe(struct device *d)
 {
 	struct vdpa_device *vdev = dev_to_vdpa(d);
@@ -195,13 +201,218 @@ void vdpa_unregister_driver(struct vdpa_driver *drv)
 }
 EXPORT_SYMBOL_GPL(vdpa_unregister_driver);
 
+/**
+ * vdpa_mgmtdev_register - register a vdpa management device
+ *
+ * @mdev: Pointer to vdpa management device
+ * vdpa_mgmtdev_register() register a vdpa management device which supports
+ * vdpa device management.
+ */
+int vdpa_mgmtdev_register(struct vdpa_mgmt_dev *mdev)
+{
+	if (!mdev->device || !mdev->ops || !mdev->ops->dev_add || !mdev->ops->dev_del)
+		return -EINVAL;
+
+	INIT_LIST_HEAD(&mdev->list);
+	mutex_lock(&vdpa_dev_mutex);
+	list_add_tail(&mdev->list, &mdev_head);
+	mutex_unlock(&vdpa_dev_mutex);
+	return 0;
+}
+EXPORT_SYMBOL_GPL(vdpa_mgmtdev_register);
+
+void vdpa_mgmtdev_unregister(struct vdpa_mgmt_dev *mdev)
+{
+	mutex_lock(&vdpa_dev_mutex);
+	list_del(&mdev->list);
+	mutex_unlock(&vdpa_dev_mutex);
+}
+EXPORT_SYMBOL_GPL(vdpa_mgmtdev_unregister);
+
+static bool mgmtdev_handle_match(const struct vdpa_mgmt_dev *mdev,
+				 const char *busname, const char *devname)
+{
+	/* Bus name is optional for simulated management device, so ignore the
+	 * device with bus if bus attribute is provided.
+	 */
+	if ((busname && !mdev->device->bus) || (!busname && mdev->device->bus))
+		return false;
+
+	if (!busname && strcmp(dev_name(mdev->device), devname) == 0)
+		return true;
+
+	if (busname && (strcmp(mdev->device->bus->name, busname) == 0) &&
+	    (strcmp(dev_name(mdev->device), devname) == 0))
+		return true;
+
+	return false;
+}
+
+static struct vdpa_mgmt_dev *vdpa_mgmtdev_get_from_attr(struct nlattr **attrs)
+{
+	struct vdpa_mgmt_dev *mdev;
+	const char *busname = NULL;
+	const char *devname;
+
+	if (!attrs[VDPA_ATTR_MGMTDEV_DEV_NAME])
+		return ERR_PTR(-EINVAL);
+	devname = nla_data(attrs[VDPA_ATTR_MGMTDEV_DEV_NAME]);
+	if (attrs[VDPA_ATTR_MGMTDEV_BUS_NAME])
+		busname = nla_data(attrs[VDPA_ATTR_MGMTDEV_BUS_NAME]);
+
+	list_for_each_entry(mdev, &mdev_head, list) {
+		if (mgmtdev_handle_match(mdev, busname, devname))
+			return mdev;
+	}
+	return ERR_PTR(-ENODEV);
+}
+
+static int vdpa_nl_mgmtdev_handle_fill(struct sk_buff *msg, const struct vdpa_mgmt_dev *mdev)
+{
+	if (mdev->device->bus &&
+	    nla_put_string(msg, VDPA_ATTR_MGMTDEV_BUS_NAME, mdev->device->bus->name))
+		return -EMSGSIZE;
+	if (nla_put_string(msg, VDPA_ATTR_MGMTDEV_DEV_NAME, dev_name(mdev->device)))
+		return -EMSGSIZE;
+	return 0;
+}
+
+static int vdpa_mgmtdev_fill(const struct vdpa_mgmt_dev *mdev, struct sk_buff *msg,
+			     u32 portid, u32 seq, int flags)
+{
+	u64 supported_classes = 0;
+	void *hdr;
+	int i = 0;
+	int err;
+
+	hdr = genlmsg_put(msg, portid, seq, &vdpa_nl_family, flags, VDPA_CMD_MGMTDEV_NEW);
+	if (!hdr)
+		return -EMSGSIZE;
+	err = vdpa_nl_mgmtdev_handle_fill(msg, mdev);
+	if (err)
+		goto msg_err;
+
+	while (mdev->id_table[i].device) {
+		supported_classes |= BIT(mdev->id_table[i].device);
+		i++;
+	}
+
+	if (nla_put_u64_64bit(msg, VDPA_ATTR_MGMTDEV_SUPPORTED_CLASSES,
+			      supported_classes, VDPA_ATTR_UNSPEC)) {
+		err = -EMSGSIZE;
+		goto msg_err;
+	}
+
+	genlmsg_end(msg, hdr);
+	return 0;
+
+msg_err:
+	genlmsg_cancel(msg, hdr);
+	return err;
+}
+
+static int vdpa_nl_cmd_mgmtdev_get_doit(struct sk_buff *skb, struct genl_info *info)
+{
+	struct vdpa_mgmt_dev *mdev;
+	struct sk_buff *msg;
+	int err;
+
+	msg = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL);
+	if (!msg)
+		return -ENOMEM;
+
+	mutex_lock(&vdpa_dev_mutex);
+	mdev = vdpa_mgmtdev_get_from_attr(info->attrs);
+	if (IS_ERR(mdev)) {
+		mutex_unlock(&vdpa_dev_mutex);
+		NL_SET_ERR_MSG_MOD(info->extack, "Fail to find the specified mgmt device");
+		err = PTR_ERR(mdev);
+		goto out;
+	}
+
+	err = vdpa_mgmtdev_fill(mdev, msg, info->snd_portid, info->snd_seq, 0);
+	mutex_unlock(&vdpa_dev_mutex);
+	if (err)
+		goto out;
+	err = genlmsg_reply(msg, info);
+	return err;
+
+out:
+	nlmsg_free(msg);
+	return err;
+}
+
+static int
+vdpa_nl_cmd_mgmtdev_get_dumpit(struct sk_buff *msg, struct netlink_callback *cb)
+{
+	struct vdpa_mgmt_dev *mdev;
+	int start = cb->args[0];
+	int idx = 0;
+	int err;
+
+	mutex_lock(&vdpa_dev_mutex);
+	list_for_each_entry(mdev, &mdev_head, list) {
+		if (idx < start) {
+			idx++;
+			continue;
+		}
+		err = vdpa_mgmtdev_fill(mdev, msg, NETLINK_CB(cb->skb).portid,
+					cb->nlh->nlmsg_seq, NLM_F_MULTI);
+		if (err)
+			goto out;
+		idx++;
+	}
+out:
+	mutex_unlock(&vdpa_dev_mutex);
+	cb->args[0] = idx;
+	return msg->len;
+}
+
+static const struct nla_policy vdpa_nl_policy[VDPA_ATTR_MAX] = {
+	[VDPA_ATTR_MGMTDEV_BUS_NAME] = { .type = NLA_NUL_STRING },
+	[VDPA_ATTR_MGMTDEV_DEV_NAME] = { .type = NLA_STRING },
+};
+
+static const struct genl_ops vdpa_nl_ops[] = {
+	{
+		.cmd = VDPA_CMD_MGMTDEV_GET,
+		.validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP,
+		.doit = vdpa_nl_cmd_mgmtdev_get_doit,
+		.dumpit = vdpa_nl_cmd_mgmtdev_get_dumpit,
+	},
+};
+
+static struct genl_family vdpa_nl_family __ro_after_init = {
+	.name = VDPA_GENL_NAME,
+	.version = VDPA_GENL_VERSION,
+	.maxattr = VDPA_ATTR_MAX,
+	.policy = vdpa_nl_policy,
+	.netnsok = false,
+	.module = THIS_MODULE,
+	.ops = vdpa_nl_ops,
+	.n_ops = ARRAY_SIZE(vdpa_nl_ops),
+};
+
 static int vdpa_init(void)
 {
-	return bus_register(&vdpa_bus);
+	int err;
+
+	err = bus_register(&vdpa_bus);
+	if (err)
+		return err;
+	err = genl_register_family(&vdpa_nl_family);
+	if (err)
+		goto err;
+	return 0;
+
+err:
+	bus_unregister(&vdpa_bus);
+	return err;
 }
 
 static void __exit vdpa_exit(void)
 {
+	genl_unregister_family(&vdpa_nl_family);
 	bus_unregister(&vdpa_bus);
 	ida_destroy(&vdpa_index_ida);
 }
diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h
index 5700baa22356..6b8b4222bca6 100644
--- a/include/linux/vdpa.h
+++ b/include/linux/vdpa.h
@@ -35,6 +35,8 @@ struct vdpa_vq_state {
 	u16	avail_index;
 };
 
+struct vdpa_mgmt_dev;
+
 /**
  * vDPA device - representation of a vDPA device
  * @dev: underlying device
@@ -335,4 +337,33 @@ static inline void vdpa_get_config(struct vdpa_device *vdev, unsigned offset,
 	ops->get_config(vdev, offset, buf, len);
 }
 
+/**
+ * vdpa_mgmtdev_ops - vdpa device ops
+ * @dev_add:	Add a vdpa device using alloc and register
+ *		@mdev: parent device to use for device addition
+ *		@name: name of the new vdpa device
+ *		Driver needs to add a new device using _vdpa_register_device()
+ *		after fully initializing the vdpa device. Driver must return 0
+ *		on success or appropriate error code.
+ * @dev_del:	Remove a vdpa device using unregister
+ *		@mdev: parent device to use for device removal
+ *		@dev: vdpa device to remove
+ *		Driver needs to remove the specified device by calling
+ *		_vdpa_unregister_device().
+ */
+struct vdpa_mgmtdev_ops {
+	int (*dev_add)(struct vdpa_mgmt_dev *mdev, const char *name);
+	void (*dev_del)(struct vdpa_mgmt_dev *mdev, struct vdpa_device *dev);
+};
+
+struct vdpa_mgmt_dev {
+	struct device *device;
+	const struct vdpa_mgmtdev_ops *ops;
+	const struct virtio_device_id *id_table; /* supported ids */
+	struct list_head list;
+};
+
+int vdpa_mgmtdev_register(struct vdpa_mgmt_dev *mdev);
+void vdpa_mgmtdev_unregister(struct vdpa_mgmt_dev *mdev);
+
 #endif /* _LINUX_VDPA_H */
diff --git a/include/uapi/linux/vdpa.h b/include/uapi/linux/vdpa.h
new file mode 100644
index 000000000000..d44d82e567b1
--- /dev/null
+++ b/include/uapi/linux/vdpa.h
@@ -0,0 +1,31 @@
+/* SPDX-License-Identifier: GPL-2.0+ WITH Linux-syscall-note */
+/*
+ * vdpa device management interface
+ * Copyright (c) 2020 Mellanox Technologies Ltd. All rights reserved.
+ */
+
+#ifndef _UAPI_LINUX_VDPA_H_
+#define _UAPI_LINUX_VDPA_H_
+
+#define VDPA_GENL_NAME "vdpa"
+#define VDPA_GENL_VERSION 0x1
+
+enum vdpa_command {
+	VDPA_CMD_UNSPEC,
+	VDPA_CMD_MGMTDEV_NEW,
+	VDPA_CMD_MGMTDEV_GET,		/* can dump */
+};
+
+enum vdpa_attr {
+	VDPA_ATTR_UNSPEC,
+
+	/* bus name (optional) + dev name together make the parent device handle */
+	VDPA_ATTR_MGMTDEV_BUS_NAME,		/* string */
+	VDPA_ATTR_MGMTDEV_DEV_NAME,		/* string */
+	VDPA_ATTR_MGMTDEV_SUPPORTED_CLASSES,	/* u64 */
+
+	/* new attributes must be added above here */
+	VDPA_ATTR_MAX,
+};
+
+#endif
-- 
2.26.2



* [PATCH linux-next v3 4/6] vdpa: Enable a user to add and delete a vdpa device
  2021-01-05 10:31 ` [PATCH linux-next v3 0/6] Introduce vdpa management tool Parav Pandit
                     ` (2 preceding siblings ...)
  2021-01-05 10:32   ` [PATCH linux-next v3 3/6] vdpa: Define vdpa mgmt device, ops and a netlink interface Parav Pandit
@ 2021-01-05 10:32   ` Parav Pandit
  2021-01-05 10:32   ` [PATCH linux-next v3 5/6] vdpa: Enable user to query vdpa device info Parav Pandit
  2021-01-05 10:32   ` [PATCH linux-next v3 6/6] vdpa_sim_net: Add support for user supported devices Parav Pandit
  5 siblings, 0 replies; 79+ messages in thread
From: Parav Pandit @ 2021-01-05 10:32 UTC (permalink / raw)
  To: virtualization; +Cc: mst, jasowang, parav, elic, netdev

Add the ability to add and delete a vdpa device.

Examples:
Create a vdpa device of type network named "foo2" from
the management device vdpasim_net:

$ vdpa dev add mgmtdev vdpasim_net name foo2

Delete the vdpa device after its use:
$ vdpa dev del foo2
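
The add/delete rules enforced by this patch can be sketched in plain
userspace C. This is only an illustrative model, not the kernel code: the
model_* names, the fixed-size table and the -ENOSPC/-EINVAL returns for a
full table or an over-long name are assumptions made for the sketch. What
it mirrors from the patch is that adding an already-used name fails with
-EEXIST, deleting an unknown name fails with -ENODEV, and deleting a device
that was not created through a management device fails with -EINVAL.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define MAX_DEVS 8

/* Hypothetical stand-in for a vdpa device sitting on the vdpa bus. */
struct model_dev {
	char name[32];
	int in_use;
	int user_created;	/* models vdev->mdev != NULL */
};

static struct model_dev bus[MAX_DEVS];

/* Look a device up by name, like bus_find_device() + vdpa_name_match(). */
static struct model_dev *model_find(const char *name)
{
	for (size_t i = 0; i < MAX_DEVS; i++)
		if (bus[i].in_use && strcmp(bus[i].name, name) == 0)
			return &bus[i];
	return NULL;
}

/* Models "vdpa dev add": duplicate names are rejected with -EEXIST. */
static int model_dev_add(const char *name, int user_created)
{
	if (strlen(name) >= sizeof(bus[0].name))
		return -EINVAL;		/* model-only limit */
	if (model_find(name))
		return -EEXIST;
	for (size_t i = 0; i < MAX_DEVS; i++) {
		if (!bus[i].in_use) {
			strcpy(bus[i].name, name);
			bus[i].in_use = 1;
			bus[i].user_created = user_created;
			return 0;
		}
	}
	return -ENOSPC;			/* model-only: table is full */
}

/* Models "vdpa dev del": only user-created devices may be removed. */
static int model_dev_del(const char *name)
{
	struct model_dev *dev = model_find(name);

	if (!dev)
		return -ENODEV;
	if (!dev->user_created)
		return -EINVAL;
	dev->in_use = 0;
	return 0;
}
```

In this model, adding "foo2" twice yields 0 and then -EEXIST, and a device
added with user_created == 0 cannot be removed through model_dev_del().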

Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Eli Cohen <elic@nvidia.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>

---
Changelog:
v1->v2:
 - using int return type for dev_add callback
 - removed device_id (type) as current drivers only support a single type
---
 drivers/vdpa/vdpa.c       | 143 +++++++++++++++++++++++++++++++++++---
 include/linux/vdpa.h      |   6 ++
 include/uapi/linux/vdpa.h |   4 ++
 3 files changed, 143 insertions(+), 10 deletions(-)

diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c
index 319d09709dfc..dca67e4d32e5 100644
--- a/drivers/vdpa/vdpa.c
+++ b/drivers/vdpa/vdpa.c
@@ -136,6 +136,37 @@ static int vdpa_name_match(struct device *dev, const void *data)
 	return (strcmp(dev_name(&vdev->dev), data) == 0);
 }
 
+static int __vdpa_register_device(struct vdpa_device *vdev)
+{
+	struct device *dev;
+
+	lockdep_assert_held(&vdpa_dev_mutex);
+	dev = bus_find_device(&vdpa_bus, NULL, dev_name(&vdev->dev), vdpa_name_match);
+	if (dev) {
+		put_device(dev);
+		return -EEXIST;
+	}
+	return device_add(&vdev->dev);
+}
+
+/**
+ * _vdpa_register_device - register a vDPA device with vdpa lock held
+ * Caller must have made a successful call to vdpa_alloc_device() beforehand.
+ * Caller must invoke this routine in the management device dev_add()
+ * callback after setting up valid mgmtdev for this vdpa device.
+ * @vdev: the vdpa device to be registered to vDPA bus
+ *
+ * Returns an error when it fails to add the device to the vDPA bus
+ */
+int _vdpa_register_device(struct vdpa_device *vdev)
+{
+	if (!vdev->mdev)
+		return -EINVAL;
+
+	return __vdpa_register_device(vdev);
+}
+EXPORT_SYMBOL_GPL(_vdpa_register_device);
+
 /**
  * vdpa_register_device - register a vDPA device
  * Callers must have a succeed call of vdpa_alloc_device() before.
@@ -145,24 +176,29 @@ static int vdpa_name_match(struct device *dev, const void *data)
  */
 int vdpa_register_device(struct vdpa_device *vdev)
 {
-	struct device *dev;
 	int err;
 
 	mutex_lock(&vdpa_dev_mutex);
-	dev = bus_find_device(&vdpa_bus, NULL, dev_name(&vdev->dev), vdpa_name_match);
-	if (dev) {
-		put_device(dev);
-		err = -EEXIST;
-		goto name_err;
-	}
-
-	err = device_add(&vdev->dev);
-name_err:
+	err = __vdpa_register_device(vdev);
 	mutex_unlock(&vdpa_dev_mutex);
 	return err;
 }
 EXPORT_SYMBOL_GPL(vdpa_register_device);
 
+/**
+ * _vdpa_unregister_device - unregister a vDPA device
+ * Caller must invoke this routine as part of management device dev_del()
+ * callback.
+ * @vdev: the vdpa device to be unregistered from vDPA bus
+ */
+void _vdpa_unregister_device(struct vdpa_device *vdev)
+{
+	lockdep_assert_held(&vdpa_dev_mutex);
+	WARN_ON(!vdev->mdev);
+	device_unregister(&vdev->dev);
+}
+EXPORT_SYMBOL_GPL(_vdpa_unregister_device);
+
 /**
  * vdpa_unregister_device - unregister a vDPA device
  * @vdev: the vdpa device to be unregisted from vDPA bus
@@ -221,10 +257,25 @@ int vdpa_mgmtdev_register(struct vdpa_mgmt_dev *mdev)
 }
 EXPORT_SYMBOL_GPL(vdpa_mgmtdev_register);
 
+static int vdpa_match_remove(struct device *dev, void *data)
+{
+	struct vdpa_device *vdev = container_of(dev, struct vdpa_device, dev);
+	struct vdpa_mgmt_dev *mdev = vdev->mdev;
+
+	if (mdev == data)
+		mdev->ops->dev_del(mdev, vdev);
+	return 0;
+}
+
 void vdpa_mgmtdev_unregister(struct vdpa_mgmt_dev *mdev)
 {
 	mutex_lock(&vdpa_dev_mutex);
+
 	list_del(&mdev->list);
+
+	/* Filter out all the entries belonging to this management device and delete them. */
+	bus_for_each_dev(&vdpa_bus, NULL, mdev, vdpa_match_remove);
+
 	mutex_unlock(&vdpa_dev_mutex);
 }
 EXPORT_SYMBOL_GPL(vdpa_mgmtdev_unregister);
@@ -368,9 +419,69 @@ vdpa_nl_cmd_mgmtdev_get_dumpit(struct sk_buff *msg, struct netlink_callback *cb)
 	return msg->len;
 }
 
+static int vdpa_nl_cmd_dev_add_set_doit(struct sk_buff *skb, struct genl_info *info)
+{
+	struct vdpa_mgmt_dev *mdev;
+	const char *name;
+	int err = 0;
+
+	if (!info->attrs[VDPA_ATTR_DEV_NAME])
+		return -EINVAL;
+
+	name = nla_data(info->attrs[VDPA_ATTR_DEV_NAME]);
+
+	mutex_lock(&vdpa_dev_mutex);
+	mdev = vdpa_mgmtdev_get_from_attr(info->attrs);
+	if (IS_ERR(mdev)) {
+		NL_SET_ERR_MSG_MOD(info->extack, "Fail to find the specified management device");
+		err = PTR_ERR(mdev);
+		goto err;
+	}
+
+	err = mdev->ops->dev_add(mdev, name);
+err:
+	mutex_unlock(&vdpa_dev_mutex);
+	return err;
+}
+
+static int vdpa_nl_cmd_dev_del_set_doit(struct sk_buff *skb, struct genl_info *info)
+{
+	struct vdpa_mgmt_dev *mdev;
+	struct vdpa_device *vdev;
+	struct device *dev;
+	const char *name;
+	int err = 0;
+
+	if (!info->attrs[VDPA_ATTR_DEV_NAME])
+		return -EINVAL;
+	name = nla_data(info->attrs[VDPA_ATTR_DEV_NAME]);
+
+	mutex_lock(&vdpa_dev_mutex);
+	dev = bus_find_device(&vdpa_bus, NULL, name, vdpa_name_match);
+	if (!dev) {
+		NL_SET_ERR_MSG_MOD(info->extack, "device not found");
+		err = -ENODEV;
+		goto dev_err;
+	}
+	vdev = container_of(dev, struct vdpa_device, dev);
+	if (!vdev->mdev) {
+		NL_SET_ERR_MSG_MOD(info->extack, "Only user created device can be deleted by user");
+		err = -EINVAL;
+		goto mdev_err;
+	}
+	mdev = vdev->mdev;
+	mdev->ops->dev_del(mdev, vdev);
+mdev_err:
+	put_device(dev);
+dev_err:
+	mutex_unlock(&vdpa_dev_mutex);
+	return err;
+}
+
 static const struct nla_policy vdpa_nl_policy[VDPA_ATTR_MAX] = {
 	[VDPA_ATTR_MGMTDEV_BUS_NAME] = { .type = NLA_NUL_STRING },
 	[VDPA_ATTR_MGMTDEV_DEV_NAME] = { .type = NLA_STRING },
+	[VDPA_ATTR_DEV_NAME] = { .type = NLA_STRING },
 };
 
 static const struct genl_ops vdpa_nl_ops[] = {
@@ -380,6 +491,18 @@ static const struct genl_ops vdpa_nl_ops[] = {
 		.doit = vdpa_nl_cmd_mgmtdev_get_doit,
 		.dumpit = vdpa_nl_cmd_mgmtdev_get_dumpit,
 	},
+	{
+		.cmd = VDPA_CMD_DEV_NEW,
+		.validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP,
+		.doit = vdpa_nl_cmd_dev_add_set_doit,
+		.flags = GENL_ADMIN_PERM,
+	},
+	{
+		.cmd = VDPA_CMD_DEV_DEL,
+		.validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP,
+		.doit = vdpa_nl_cmd_dev_del_set_doit,
+		.flags = GENL_ADMIN_PERM,
+	},
 };
 
 static struct genl_family vdpa_nl_family __ro_after_init = {
diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h
index 6b8b4222bca6..4ab5494503a8 100644
--- a/include/linux/vdpa.h
+++ b/include/linux/vdpa.h
@@ -45,6 +45,8 @@ struct vdpa_mgmt_dev;
  * @index: device index
  * @features_valid: were features initialized? for legacy guests
  * @nvqs: maximum number of supported virtqueues
+ * @mdev: management device pointer; caller must set up when registering device as part
+ *	  of dev_add() mgmtdev ops callback before invoking _vdpa_register_device().
  */
 struct vdpa_device {
 	struct device dev;
@@ -53,6 +55,7 @@ struct vdpa_device {
 	unsigned int index;
 	bool features_valid;
 	int nvqs;
+	struct vdpa_mgmt_dev *mdev;
 };
 
 /**
@@ -260,6 +263,9 @@ struct vdpa_device *__vdpa_alloc_device(struct device *parent,
 int vdpa_register_device(struct vdpa_device *vdev);
 void vdpa_unregister_device(struct vdpa_device *vdev);
 
+int _vdpa_register_device(struct vdpa_device *vdev);
+void _vdpa_unregister_device(struct vdpa_device *vdev);
+
 /**
  * vdpa_driver - operations for a vDPA driver
  * @driver: underlying device driver
diff --git a/include/uapi/linux/vdpa.h b/include/uapi/linux/vdpa.h
index d44d82e567b1..bb4a1f00eb1c 100644
--- a/include/uapi/linux/vdpa.h
+++ b/include/uapi/linux/vdpa.h
@@ -14,6 +14,8 @@ enum vdpa_command {
 	VDPA_CMD_UNSPEC,
 	VDPA_CMD_MGMTDEV_NEW,
 	VDPA_CMD_MGMTDEV_GET,		/* can dump */
+	VDPA_CMD_DEV_NEW,
+	VDPA_CMD_DEV_DEL,
 };
 
 enum vdpa_attr {
@@ -24,6 +26,8 @@ enum vdpa_attr {
 	VDPA_ATTR_MGMTDEV_DEV_NAME,		/* string */
 	VDPA_ATTR_MGMTDEV_SUPPORTED_CLASSES,	/* u64 */
 
+	VDPA_ATTR_DEV_NAME,			/* string */
+
 	/* new attributes must be added above here */
 	VDPA_ATTR_MAX,
 };
-- 
2.26.2



* [PATCH linux-next v3 5/6] vdpa: Enable user to query vdpa device info
  2021-01-05 10:31 ` [PATCH linux-next v3 0/6] Introduce vdpa management tool Parav Pandit
                     ` (3 preceding siblings ...)
  2021-01-05 10:32   ` [PATCH linux-next v3 4/6] vdpa: Enable a user to add and delete a vdpa device Parav Pandit
@ 2021-01-05 10:32   ` Parav Pandit
  2021-01-05 10:32   ` [PATCH linux-next v3 6/6] vdpa_sim_net: Add support for user supported devices Parav Pandit
  5 siblings, 0 replies; 79+ messages in thread
From: Parav Pandit @ 2021-01-05 10:32 UTC (permalink / raw)
  To: virtualization; +Cc: mst, jasowang, parav, elic, netdev

Enable user to query vdpa device information.

$ vdpa dev add mgmtdev vdpasim_net name foo2

Show the newly created vdpa device by its name:
$ vdpa dev show foo2
foo2: type network mgmtdev vdpasim_net vendor_id 0 max_vqs 2 max_vq_size 256

$ vdpa dev show foo2 -jp
{
    "dev": {
        "foo2": {
            "type": "network",
            "mgmtdev": "vdpasim_net",
            "vendor_id": 0,
            "max_vqs": 2,
            "max_vq_size": 256
        }
    }
}
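
VDPA_CMD_DEV_GET is marked "can dump" in this patch, so a listing may be
delivered over several passes when one message buffer cannot hold every
device. A minimal userspace sketch of the resume logic follows. The
dump_state/dump_pass names and the "room" parameter are hypothetical;
"room" stands in for the point at which vdpa_dev_fill() would return
-EMSGSIZE, and the saved index plays the role of cb->args[0].

```c
/* Hypothetical model of the state kept between dump passes. */
struct dump_state {
	int start;	/* models cb->args[0]: first entry still to send */
};

/*
 * Walk `total` entries, skip those sent by earlier passes and emit at most
 * `room` new ones into `out`. Returns the number of entries emitted in
 * this pass; a return of 0 means the dump is complete.
 */
static int dump_pass(struct dump_state *st, int total, int room, int *out)
{
	int idx = 0;
	int emitted = 0;

	for (int i = 0; i < total; i++) {
		if (idx < st->start) {	/* already sent in an earlier pass */
			idx++;
			continue;
		}
		if (emitted == room)	/* message buffer is full, stop here */
			break;
		out[emitted++] = i;
		idx++;
	}
	st->start = idx;		/* resume point for the next pass */
	return emitted;
}
```

The caller keeps invoking dump_pass() with the same state until it returns
0, which is the same contract the netlink core applies to a dumpit handler.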

Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Eli Cohen <elic@nvidia.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
---
 drivers/vdpa/vdpa.c       | 131 ++++++++++++++++++++++++++++++++++++++
 include/uapi/linux/vdpa.h |   5 ++
 2 files changed, 136 insertions(+)

diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c
index dca67e4d32e5..9700a0adcca0 100644
--- a/drivers/vdpa/vdpa.c
+++ b/drivers/vdpa/vdpa.c
@@ -478,6 +478,131 @@ static int vdpa_nl_cmd_dev_del_set_doit(struct sk_buff *skb, struct genl_info *i
 	return err;
 }
 
+static int
+vdpa_dev_fill(struct vdpa_device *vdev, struct sk_buff *msg, u32 portid, u32 seq,
+	      int flags, struct netlink_ext_ack *extack)
+{
+	u16 max_vq_size;
+	u32 device_id;
+	u32 vendor_id;
+	void *hdr;
+	int err;
+
+	hdr = genlmsg_put(msg, portid, seq, &vdpa_nl_family, flags, VDPA_CMD_DEV_NEW);
+	if (!hdr)
+		return -EMSGSIZE;
+
+	err = vdpa_nl_mgmtdev_handle_fill(msg, vdev->mdev);
+	if (err)
+		goto msg_err;
+
+	device_id = vdev->config->get_device_id(vdev);
+	vendor_id = vdev->config->get_vendor_id(vdev);
+	max_vq_size = vdev->config->get_vq_num_max(vdev);
+
+	err = -EMSGSIZE;
+	if (nla_put_string(msg, VDPA_ATTR_DEV_NAME, dev_name(&vdev->dev)))
+		goto msg_err;
+	if (nla_put_u32(msg, VDPA_ATTR_DEV_ID, device_id))
+		goto msg_err;
+	if (nla_put_u32(msg, VDPA_ATTR_DEV_VENDOR_ID, vendor_id))
+		goto msg_err;
+	if (nla_put_u32(msg, VDPA_ATTR_DEV_MAX_VQS, vdev->nvqs))
+		goto msg_err;
+	if (nla_put_u16(msg, VDPA_ATTR_DEV_MAX_VQ_SIZE, max_vq_size))
+		goto msg_err;
+
+	genlmsg_end(msg, hdr);
+	return 0;
+
+msg_err:
+	genlmsg_cancel(msg, hdr);
+	return err;
+}
+
+static int vdpa_nl_cmd_dev_get_doit(struct sk_buff *skb, struct genl_info *info)
+{
+	struct vdpa_device *vdev;
+	struct sk_buff *msg;
+	const char *devname;
+	struct device *dev;
+	int err;
+
+	if (!info->attrs[VDPA_ATTR_DEV_NAME])
+		return -EINVAL;
+	devname = nla_data(info->attrs[VDPA_ATTR_DEV_NAME]);
+	msg = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL);
+	if (!msg)
+		return -ENOMEM;
+
+	mutex_lock(&vdpa_dev_mutex);
+	dev = bus_find_device(&vdpa_bus, NULL, devname, vdpa_name_match);
+	if (!dev) {
+		mutex_unlock(&vdpa_dev_mutex);
+		NL_SET_ERR_MSG_MOD(info->extack, "device not found");
+		return -ENODEV;
+	}
+	vdev = container_of(dev, struct vdpa_device, dev);
+	if (!vdev->mdev) {
+		mutex_unlock(&vdpa_dev_mutex);
+		put_device(dev);
+		return -EINVAL;
+	}
+	err = vdpa_dev_fill(vdev, msg, info->snd_portid, info->snd_seq, 0, info->extack);
+	if (!err)
+		err = genlmsg_reply(msg, info);
+	put_device(dev);
+	mutex_unlock(&vdpa_dev_mutex);
+
+	if (err)
+		nlmsg_free(msg);
+	return err;
+}
+
+struct vdpa_dev_dump_info {
+	struct sk_buff *msg;
+	struct netlink_callback *cb;
+	int start_idx;
+	int idx;
+};
+
+static int vdpa_dev_dump(struct device *dev, void *data)
+{
+	struct vdpa_device *vdev = container_of(dev, struct vdpa_device, dev);
+	struct vdpa_dev_dump_info *info = data;
+	int err;
+
+	if (!vdev->mdev)
+		return 0;
+	if (info->idx < info->start_idx) {
+		info->idx++;
+		return 0;
+	}
+	err = vdpa_dev_fill(vdev, info->msg, NETLINK_CB(info->cb->skb).portid,
+			    info->cb->nlh->nlmsg_seq, NLM_F_MULTI, info->cb->extack);
+	if (err)
+		return err;
+
+	info->idx++;
+	return 0;
+}
+
+static int vdpa_nl_cmd_dev_get_dumpit(struct sk_buff *msg, struct netlink_callback *cb)
+{
+	struct vdpa_dev_dump_info info;
+
+	info.msg = msg;
+	info.cb = cb;
+	info.start_idx = cb->args[0];
+	info.idx = 0;
+
+	mutex_lock(&vdpa_dev_mutex);
+	bus_for_each_dev(&vdpa_bus, NULL, &info, vdpa_dev_dump);
+	mutex_unlock(&vdpa_dev_mutex);
+	cb->args[0] = info.idx;
+	return msg->len;
+}
+
 static const struct nla_policy vdpa_nl_policy[VDPA_ATTR_MAX] = {
 	[VDPA_ATTR_MGMTDEV_BUS_NAME] = { .type = NLA_NUL_STRING },
 	[VDPA_ATTR_MGMTDEV_DEV_NAME] = { .type = NLA_STRING },
@@ -503,6 +628,12 @@ static const struct genl_ops vdpa_nl_ops[] = {
 		.doit = vdpa_nl_cmd_dev_del_set_doit,
 		.flags = GENL_ADMIN_PERM,
 	},
+	{
+		.cmd = VDPA_CMD_DEV_GET,
+		.validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP,
+		.doit = vdpa_nl_cmd_dev_get_doit,
+		.dumpit = vdpa_nl_cmd_dev_get_dumpit,
+	},
 };
 
 static struct genl_family vdpa_nl_family __ro_after_init = {
diff --git a/include/uapi/linux/vdpa.h b/include/uapi/linux/vdpa.h
index bb4a1f00eb1c..66a41e4ec163 100644
--- a/include/uapi/linux/vdpa.h
+++ b/include/uapi/linux/vdpa.h
@@ -16,6 +16,7 @@ enum vdpa_command {
 	VDPA_CMD_MGMTDEV_GET,		/* can dump */
 	VDPA_CMD_DEV_NEW,
 	VDPA_CMD_DEV_DEL,
+	VDPA_CMD_DEV_GET,		/* can dump */
 };
 
 enum vdpa_attr {
@@ -27,6 +28,10 @@ enum vdpa_attr {
 	VDPA_ATTR_MGMTDEV_SUPPORTED_CLASSES,	/* u64 */
 
 	VDPA_ATTR_DEV_NAME,			/* string */
+	VDPA_ATTR_DEV_ID,			/* u32 */
+	VDPA_ATTR_DEV_VENDOR_ID,		/* u32 */
+	VDPA_ATTR_DEV_MAX_VQS,			/* u32 */
+	VDPA_ATTR_DEV_MAX_VQ_SIZE,		/* u16 */
 
 	/* new attributes must be added above here */
 	VDPA_ATTR_MAX,
-- 
2.26.2



* [PATCH linux-next v3 6/6] vdpa_sim_net: Add support for user supported devices
  2021-01-05 10:31 ` [PATCH linux-next v3 0/6] Introduce vdpa management tool Parav Pandit
                     ` (4 preceding siblings ...)
  2021-01-05 10:32   ` [PATCH linux-next v3 5/6] vdpa: Enable user to query vdpa device info Parav Pandit
@ 2021-01-05 10:32   ` Parav Pandit
  2021-01-05 11:48     ` Michael S. Tsirkin
  5 siblings, 1 reply; 79+ messages in thread
From: Parav Pandit @ 2021-01-05 10:32 UTC (permalink / raw)
  To: virtualization; +Cc: mst, jasowang, parav, elic, netdev

Enable the user to create simulated vdpasim net devices.

Show the vdpa management device that supports creating and deleting vdpa devices:

$ vdpa mgmtdev show
vdpasim_net:
  supported_classes
    net

$ vdpa mgmtdev show -jp
{
    "show": {
        "vdpasim_net": {
            "supported_classes": [ "net" ]
        }
    }
}

Create a vdpa device of type network named "foo2" from
the management device vdpasim_net:

$ vdpa dev add mgmtdev vdpasim_net name foo2

Show the newly created vdpa device by its name:
$ vdpa dev show foo2
foo2: type network mgmtdev vdpasim_net vendor_id 0 max_vqs 2 max_vq_size 256

$ vdpa dev show foo2 -jp
{
    "dev": {
        "foo2": {
            "type": "network",
            "mgmtdev": "vdpasim_net",
            "vendor_id": 0,
            "max_vqs": 2,
            "max_vq_size": 256
        }
    }
}

Delete the vdpa device after its use:
$ vdpa dev del foo2
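
For reference, the supported_classes value reported by the management
device is a u64 bitmask with one bit per virtio device ID taken from the
id_table. A small userspace sketch of that encoding is below. struct
model_id and both helper names are hypothetical stand-ins (only the
.device field of struct virtio_device_id is modelled), and the sketch
assumes every ID in the table is below 64.

```c
#include <stdint.h>

#define VIRTIO_ID_NET	1	/* virtio network device ID */
#define VIRTIO_ID_BLOCK	2	/* virtio block device ID */

/* Minimal stand-in for struct virtio_device_id; only .device is modelled. */
struct model_id {
	uint32_t device;
};

/* Fold a zero-terminated id table into a bitmask, one bit per class ID. */
static uint64_t supported_classes(const struct model_id *table)
{
	uint64_t mask = 0;

	for (int i = 0; table[i].device; i++)
		mask |= UINT64_C(1) << table[i].device;
	return mask;
}

/* Userspace-side check: does the mask advertise the given class ID? */
static int class_supported(uint64_t mask, uint32_t id)
{
	return (int)((mask >> id) & 1);
}
```

With the id_table used by vdpasim_net (VIRTIO_ID_NET only), the mask has
just bit 1 set, which the tool prints as "net".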

Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Eli Cohen <elic@nvidia.com>
Acked-by: Jason Wang <jasowang@redhat.com>
---
Changelog:
v2->v3:
 - removed code branches due to default device removal patch
v1->v2:
 - rebased
---
 drivers/vdpa/vdpa_sim/vdpa_sim.c     |  3 +-
 drivers/vdpa/vdpa_sim/vdpa_sim.h     |  2 +
 drivers/vdpa/vdpa_sim/vdpa_sim_net.c | 96 ++++++++++++++++++++--------
 3 files changed, 75 insertions(+), 26 deletions(-)

diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.c b/drivers/vdpa/vdpa_sim/vdpa_sim.c
index db1636a99ba4..d5942842432d 100644
--- a/drivers/vdpa/vdpa_sim/vdpa_sim.c
+++ b/drivers/vdpa/vdpa_sim/vdpa_sim.c
@@ -235,7 +235,7 @@ struct vdpasim *vdpasim_create(struct vdpasim_dev_attr *dev_attr)
 		ops = &vdpasim_config_ops;
 
 	vdpasim = vdpa_alloc_device(struct vdpasim, vdpa, NULL, ops,
-				    dev_attr->nvqs, NULL);
+				    dev_attr->nvqs, dev_attr->name);
 	if (!vdpasim)
 		goto err_alloc;
 
@@ -249,6 +249,7 @@ struct vdpasim *vdpasim_create(struct vdpasim_dev_attr *dev_attr)
 	if (dma_set_mask_and_coherent(dev, DMA_BIT_MASK(64)))
 		goto err_iommu;
 	set_dma_ops(dev, &vdpasim_dma_ops);
+	vdpasim->vdpa.mdev = dev_attr->mgmt_dev;
 
 	vdpasim->config = kzalloc(dev_attr->config_size, GFP_KERNEL);
 	if (!vdpasim->config)
diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.h b/drivers/vdpa/vdpa_sim/vdpa_sim.h
index b02142293d5b..6d75444f9948 100644
--- a/drivers/vdpa/vdpa_sim/vdpa_sim.h
+++ b/drivers/vdpa/vdpa_sim/vdpa_sim.h
@@ -33,6 +33,8 @@ struct vdpasim_virtqueue {
 };
 
 struct vdpasim_dev_attr {
+	struct vdpa_mgmt_dev *mgmt_dev;
+	const char *name;
 	u64 supported_features;
 	size_t config_size;
 	size_t buffer_size;
diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim_net.c b/drivers/vdpa/vdpa_sim/vdpa_sim_net.c
index f0482427186b..d344c5b7c914 100644
--- a/drivers/vdpa/vdpa_sim/vdpa_sim_net.c
+++ b/drivers/vdpa/vdpa_sim/vdpa_sim_net.c
@@ -35,8 +35,6 @@ MODULE_PARM_DESC(macaddr, "Ethernet MAC address");
 
 static u8 macaddr_buf[ETH_ALEN];
 
-static struct vdpasim *vdpasim_net_dev;
-
 static void vdpasim_net_work(struct work_struct *work)
 {
 	struct vdpasim *vdpasim = container_of(work, struct vdpasim, work);
@@ -120,21 +118,23 @@ static void vdpasim_net_get_config(struct vdpasim *vdpasim, void *config)
 	memcpy(net_config->mac, macaddr_buf, ETH_ALEN);
 }
 
-static int __init vdpasim_net_init(void)
+static void vdpasim_net_mgmtdev_release(struct device *dev)
+{
+}
+
+static struct device vdpasim_net_mgmtdev = {
+	.init_name = "vdpasim_net",
+	.release = vdpasim_net_mgmtdev_release,
+};
+
+static int vdpasim_net_dev_add(struct vdpa_mgmt_dev *mdev, const char *name)
 {
 	struct vdpasim_dev_attr dev_attr = {};
+	struct vdpasim *simdev;
 	int ret;
 
-	if (macaddr) {
-		mac_pton(macaddr, macaddr_buf);
-		if (!is_valid_ether_addr(macaddr_buf)) {
-			ret = -EADDRNOTAVAIL;
-			goto out;
-		}
-	} else {
-		eth_random_addr(macaddr_buf);
-	}
-
+	dev_attr.mgmt_dev = mdev;
+	dev_attr.name = name;
 	dev_attr.id = VIRTIO_ID_NET;
 	dev_attr.supported_features = VDPASIM_NET_FEATURES;
 	dev_attr.nvqs = VDPASIM_NET_VQ_NUM;
@@ -143,29 +143,75 @@ static int __init vdpasim_net_init(void)
 	dev_attr.work_fn = vdpasim_net_work;
 	dev_attr.buffer_size = PAGE_SIZE;
 
-	vdpasim_net_dev = vdpasim_create(&dev_attr);
-	if (IS_ERR(vdpasim_net_dev)) {
-		ret = PTR_ERR(vdpasim_net_dev);
-		goto out;
+	simdev = vdpasim_create(&dev_attr);
+	if (IS_ERR(simdev))
+		return PTR_ERR(simdev);
+
+	ret = _vdpa_register_device(&simdev->vdpa);
+	if (ret)
+		goto reg_err;
+
+	return 0;
+
+reg_err:
+	put_device(&simdev->vdpa.dev);
+	return ret;
+}
+
+static void vdpasim_net_dev_del(struct vdpa_mgmt_dev *mdev,
+				struct vdpa_device *dev)
+{
+	struct vdpasim *simdev = container_of(dev, struct vdpasim, vdpa);
+
+	_vdpa_unregister_device(&simdev->vdpa);
+}
+
+static const struct vdpa_mgmtdev_ops vdpasim_net_mgmtdev_ops = {
+	.dev_add = vdpasim_net_dev_add,
+	.dev_del = vdpasim_net_dev_del
+};
+
+static struct virtio_device_id id_table[] = {
+	{ VIRTIO_ID_NET, VIRTIO_DEV_ANY_ID },
+	{ 0 },
+};
+
+static struct vdpa_mgmt_dev mgmt_dev = {
+	.device = &vdpasim_net_mgmtdev,
+	.id_table = id_table,
+	.ops = &vdpasim_net_mgmtdev_ops,
+};
+
+static int __init vdpasim_net_init(void)
+{
+	int ret;
+
+	if (macaddr) {
+		mac_pton(macaddr, macaddr_buf);
+		if (!is_valid_ether_addr(macaddr_buf))
+			return -EADDRNOTAVAIL;
+	} else {
+		eth_random_addr(macaddr_buf);
 	}
 
-	ret = vdpa_register_device(&vdpasim_net_dev->vdpa);
+	ret = device_register(&vdpasim_net_mgmtdev);
 	if (ret)
-		goto put_dev;
+		return ret;
 
+	ret = vdpa_mgmtdev_register(&mgmt_dev);
+	if (ret)
+		goto parent_err;
 	return 0;
 
-put_dev:
-	put_device(&vdpasim_net_dev->vdpa.dev);
-out:
+parent_err:
+	device_unregister(&vdpasim_net_mgmtdev);
 	return ret;
 }
 
 static void __exit vdpasim_net_exit(void)
 {
-	struct vdpa_device *vdpa = &vdpasim_net_dev->vdpa;
-
-	vdpa_unregister_device(vdpa);
+	vdpa_mgmtdev_unregister(&mgmt_dev);
+	device_unregister(&vdpasim_net_mgmtdev);
 }
 
 module_init(vdpasim_net_init);
-- 
2.26.2



* Re: [PATCH linux-next v3 6/6] vdpa_sim_net: Add support for user supported devices
  2021-01-05 10:32   ` [PATCH linux-next v3 6/6] vdpa_sim_net: Add support for user supported devices Parav Pandit
@ 2021-01-05 11:48     ` Michael S. Tsirkin
  2021-01-05 12:02       ` Parav Pandit
  0 siblings, 1 reply; 79+ messages in thread
From: Michael S. Tsirkin @ 2021-01-05 11:48 UTC (permalink / raw)
  To: Parav Pandit; +Cc: virtualization, jasowang, elic, netdev

On Tue, Jan 05, 2021 at 12:32:03PM +0200, Parav Pandit wrote:
> Enable user to create vdpasim net simulate devices.
> 
> Show vdpa management device that supports creating, deleting vdpa devices.
> 
> $ vdpa mgmtdev show
> vdpasim_net:
>   supported_classes
>     net
> 
> $ vdpa mgmtdev show -jp
> {
>     "show": {
>         "vdpasim_net": {
>             "supported_classes": {
>               "net"
>         }
>     }
> }
> 
> Create a vdpa device of type networking named as "foo2" from
> the management device vdpasim:
> 
> $ vdpa dev add mgmtdev vdpasim_net name foo2
> 
> Show the newly created vdpa device by its name:
> $ vdpa dev show foo2
> foo2: type network mgmtdev vdpasim_net vendor_id 0 max_vqs 2 max_vq_size 256
> 
> $ vdpa dev show foo2 -jp
> {
>     "dev": {
>         "foo2": {
>             "type": "network",
>             "mgmtdev": "vdpasim_net",
>             "vendor_id": 0,
>             "max_vqs": 2,
>             "max_vq_size": 256
>         }
>     }
> }


I'd like an example of how device-specific
(e.g. net-specific) interfaces tie in to this.


> Delete the vdpa device after its use:
> $ vdpa dev del foo2
> 
> Signed-off-by: Parav Pandit <parav@nvidia.com>
> Reviewed-by: Eli Cohen <elic@nvidia.com>
> Acked-by: Jason Wang <jasowang@redhat.com>
> ---
> Changelog:
> v2->v3:
>  - removed code branches due to default device removal patch
> v1->v2:
>  - rebased
> ---
>  drivers/vdpa/vdpa_sim/vdpa_sim.c     |  3 +-
>  drivers/vdpa/vdpa_sim/vdpa_sim.h     |  2 +
>  drivers/vdpa/vdpa_sim/vdpa_sim_net.c | 96 ++++++++++++++++++++--------
>  3 files changed, 75 insertions(+), 26 deletions(-)
> 
> diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.c b/drivers/vdpa/vdpa_sim/vdpa_sim.c
> index db1636a99ba4..d5942842432d 100644
> --- a/drivers/vdpa/vdpa_sim/vdpa_sim.c
> +++ b/drivers/vdpa/vdpa_sim/vdpa_sim.c
> @@ -235,7 +235,7 @@ struct vdpasim *vdpasim_create(struct vdpasim_dev_attr *dev_attr)
>  		ops = &vdpasim_config_ops;
>  
>  	vdpasim = vdpa_alloc_device(struct vdpasim, vdpa, NULL, ops,
> -				    dev_attr->nvqs, NULL);
> +				    dev_attr->nvqs, dev_attr->name);
>  	if (!vdpasim)
>  		goto err_alloc;
>  
> @@ -249,6 +249,7 @@ struct vdpasim *vdpasim_create(struct vdpasim_dev_attr *dev_attr)
>  	if (dma_set_mask_and_coherent(dev, DMA_BIT_MASK(64)))
>  		goto err_iommu;
>  	set_dma_ops(dev, &vdpasim_dma_ops);
> +	vdpasim->vdpa.mdev = dev_attr->mgmt_dev;
>  
>  	vdpasim->config = kzalloc(dev_attr->config_size, GFP_KERNEL);
>  	if (!vdpasim->config)
> diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.h b/drivers/vdpa/vdpa_sim/vdpa_sim.h
> index b02142293d5b..6d75444f9948 100644
> --- a/drivers/vdpa/vdpa_sim/vdpa_sim.h
> +++ b/drivers/vdpa/vdpa_sim/vdpa_sim.h
> @@ -33,6 +33,8 @@ struct vdpasim_virtqueue {
>  };
>  
>  struct vdpasim_dev_attr {
> +	struct vdpa_mgmt_dev *mgmt_dev;
> +	const char *name;
>  	u64 supported_features;
>  	size_t config_size;
>  	size_t buffer_size;
> diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim_net.c b/drivers/vdpa/vdpa_sim/vdpa_sim_net.c
> index f0482427186b..d344c5b7c914 100644
> --- a/drivers/vdpa/vdpa_sim/vdpa_sim_net.c
> +++ b/drivers/vdpa/vdpa_sim/vdpa_sim_net.c
> @@ -35,8 +35,6 @@ MODULE_PARM_DESC(macaddr, "Ethernet MAC address");
>  
>  static u8 macaddr_buf[ETH_ALEN];
>  
> -static struct vdpasim *vdpasim_net_dev;
> -
>  static void vdpasim_net_work(struct work_struct *work)
>  {
>  	struct vdpasim *vdpasim = container_of(work, struct vdpasim, work);
> @@ -120,21 +118,23 @@ static void vdpasim_net_get_config(struct vdpasim *vdpasim, void *config)
>  	memcpy(net_config->mac, macaddr_buf, ETH_ALEN);
>  }
>  
> -static int __init vdpasim_net_init(void)
> +static void vdpasim_net_mgmtdev_release(struct device *dev)
> +{
> +}
> +
> +static struct device vdpasim_net_mgmtdev = {
> +	.init_name = "vdpasim_net",
> +	.release = vdpasim_net_mgmtdev_release,
> +};
> +
> +static int vdpasim_net_dev_add(struct vdpa_mgmt_dev *mdev, const char *name)
>  {
>  	struct vdpasim_dev_attr dev_attr = {};
> +	struct vdpasim *simdev;
>  	int ret;
>  
> -	if (macaddr) {
> -		mac_pton(macaddr, macaddr_buf);
> -		if (!is_valid_ether_addr(macaddr_buf)) {
> -			ret = -EADDRNOTAVAIL;
> -			goto out;
> -		}
> -	} else {
> -		eth_random_addr(macaddr_buf);
> -	}
> -
> +	dev_attr.mgmt_dev = mdev;
> +	dev_attr.name = name;
>  	dev_attr.id = VIRTIO_ID_NET;
>  	dev_attr.supported_features = VDPASIM_NET_FEATURES;
>  	dev_attr.nvqs = VDPASIM_NET_VQ_NUM;
> @@ -143,29 +143,75 @@ static int __init vdpasim_net_init(void)
>  	dev_attr.work_fn = vdpasim_net_work;
>  	dev_attr.buffer_size = PAGE_SIZE;
>  
> -	vdpasim_net_dev = vdpasim_create(&dev_attr);
> -	if (IS_ERR(vdpasim_net_dev)) {
> -		ret = PTR_ERR(vdpasim_net_dev);
> -		goto out;
> +	simdev = vdpasim_create(&dev_attr);
> +	if (IS_ERR(simdev))
> +		return PTR_ERR(simdev);
> +
> +	ret = _vdpa_register_device(&simdev->vdpa);
> +	if (ret)
> +		goto reg_err;
> +
> +	return 0;
> +
> +reg_err:
> +	put_device(&simdev->vdpa.dev);
> +	return ret;
> +}
> +
> +static void vdpasim_net_dev_del(struct vdpa_mgmt_dev *mdev,
> +				struct vdpa_device *dev)
> +{
> +	struct vdpasim *simdev = container_of(dev, struct vdpasim, vdpa);
> +
> +	_vdpa_unregister_device(&simdev->vdpa);
> +}
> +
> +static const struct vdpa_mgmtdev_ops vdpasim_net_mgmtdev_ops = {
> +	.dev_add = vdpasim_net_dev_add,
> +	.dev_del = vdpasim_net_dev_del
> +};
> +
> +static struct virtio_device_id id_table[] = {
> +	{ VIRTIO_ID_NET, VIRTIO_DEV_ANY_ID },
> +	{ 0 },
> +};
> +
> +static struct vdpa_mgmt_dev mgmt_dev = {
> +	.device = &vdpasim_net_mgmtdev,
> +	.id_table = id_table,
> +	.ops = &vdpasim_net_mgmtdev_ops,
> +};
> +
> +static int __init vdpasim_net_init(void)
> +{
> +	int ret;
> +
> +	if (macaddr) {
> +		mac_pton(macaddr, macaddr_buf);
> +		if (!is_valid_ether_addr(macaddr_buf))
> +			return -EADDRNOTAVAIL;
> +	} else {
> +		eth_random_addr(macaddr_buf);
>  	}

Hmm so all devices start out with the same MAC
until changed? And how is the change effected?


> -	ret = vdpa_register_device(&vdpasim_net_dev->vdpa);
> +	ret = device_register(&vdpasim_net_mgmtdev);
>  	if (ret)
> -		goto put_dev;
> +		return ret;
>  
> +	ret = vdpa_mgmtdev_register(&mgmt_dev);
> +	if (ret)
> +		goto parent_err;
>  	return 0;
>  
> -put_dev:
> -	put_device(&vdpasim_net_dev->vdpa.dev);
> -out:
> +parent_err:
> +	device_unregister(&vdpasim_net_mgmtdev);
>  	return ret;
>  }
>  
>  static void __exit vdpasim_net_exit(void)
>  {
> -	struct vdpa_device *vdpa = &vdpasim_net_dev->vdpa;
> -
> -	vdpa_unregister_device(vdpa);
> +	vdpa_mgmtdev_unregister(&mgmt_dev);
> +	device_unregister(&vdpasim_net_mgmtdev);
>  }
>  
>  module_init(vdpasim_net_init);
> -- 
> 2.26.2


^ permalink raw reply	[flat|nested] 79+ messages in thread

* RE: [PATCH linux-next v3 6/6] vdpa_sim_net: Add support for user supported devices
  2021-01-05 11:48     ` Michael S. Tsirkin
@ 2021-01-05 12:02       ` Parav Pandit
  2021-01-05 12:14         ` Michael S. Tsirkin
  0 siblings, 1 reply; 79+ messages in thread
From: Parav Pandit @ 2021-01-05 12:02 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: virtualization, jasowang, Eli Cohen, netdev



> From: Michael S. Tsirkin <mst@redhat.com>
> Sent: Tuesday, January 5, 2021 5:19 PM
> 
> On Tue, Jan 05, 2021 at 12:32:03PM +0200, Parav Pandit wrote:
> > Enable user to create vdpasim net simulate devices.
> >
> >

> > $ vdpa dev add mgmtdev vdpasim_net name foo2
> >
> > Show the newly created vdpa device by its name:
> > $ vdpa dev show foo2
> > foo2: type network mgmtdev vdpasim_net vendor_id 0 max_vqs 2
> > max_vq_size 256
> >
> > $ vdpa dev show foo2 -jp
> > {
> >     "dev": {
> >         "foo2": {
> >             "type": "network",
> >             "mgmtdev": "vdpasim_net",
> >             "vendor_id": 0,
> >             "max_vqs": 2,
> >             "max_vq_size": 256
> >         }
> >     }
> > }
> 
> 
> I'd like an example of how do device specific (e.g. net specific) interfaces tie
> in to this.
Not sure I follow your question.
Do you mean how to set the MAC address or MTU of this vdpa device of type net?
If so, the dev add command will be extended shortly, in a subsequent series, to set these net-specific attributes.
(I mentioned this in the next steps in the cover letter.)

> > +static int __init vdpasim_net_init(void) {
> > +	int ret;
> > +
> > +	if (macaddr) {
> > +		mac_pton(macaddr, macaddr_buf);
> > +		if (!is_valid_ether_addr(macaddr_buf))
> > +			return -EADDRNOTAVAIL;
> > +	} else {
> > +		eth_random_addr(macaddr_buf);
> >  	}
> 
> Hmm so all devices start out with the same MAC until changed? And how is
> the change effected?
After this patchset, and once iproute2 vdpa is in the tree, I will add the MAC address as an input attribute of the "vdpa dev add" command,
so that each vdpa device can have its own user-specified MAC address.
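
For reference, a minimal user-space sketch of why every simulator device reports the same MAC with this patch: the module fills one buffer at init time, and vdpasim_net_get_config() copies from it for each device. The struct and function names below are simplified stand-ins (assumptions), not the kernel's, and the MAC value is made up.

```c
#include <stdint.h>
#include <string.h>

#define ETH_ALEN 6

/* Stand-in for the module-wide buffer filled once in vdpasim_net_init(). */
static uint8_t macaddr_buf[ETH_ALEN] = { 0x02, 0x00, 0x00, 0x00, 0x00, 0x01 };

/* Simplified stand-in for the MAC field of the virtio net config space. */
struct sim_net_config {
	uint8_t mac[ETH_ALEN];
};

/* Mirrors the memcpy() in vdpasim_net_get_config(): no per-device state. */
void sim_net_get_config(struct sim_net_config *cfg)
{
	memcpy(cfg->mac, macaddr_buf, ETH_ALEN);
}

/* Returns 1 when two devices report identical MAC addresses. */
int sim_net_same_mac(const struct sim_net_config *a,
		     const struct sim_net_config *b)
{
	return memcmp(a->mac, b->mac, ETH_ALEN) == 0;
}
```

Calling sim_net_get_config() for two devices yields the same address, which is the behaviour being asked about.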

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH linux-next v3 6/6] vdpa_sim_net: Add support for user supported devices
  2021-01-05 12:02       ` Parav Pandit
@ 2021-01-05 12:14         ` Michael S. Tsirkin
  2021-01-05 12:30           ` Parav Pandit
  0 siblings, 1 reply; 79+ messages in thread
From: Michael S. Tsirkin @ 2021-01-05 12:14 UTC (permalink / raw)
  To: Parav Pandit; +Cc: virtualization, jasowang, Eli Cohen, netdev

On Tue, Jan 05, 2021 at 12:02:33PM +0000, Parav Pandit wrote:
> 
> 
> > From: Michael S. Tsirkin <mst@redhat.com>
> > Sent: Tuesday, January 5, 2021 5:19 PM
> > 
> > On Tue, Jan 05, 2021 at 12:32:03PM +0200, Parav Pandit wrote:
> > > Enable user to create vdpasim net simulate devices.
> > >
> > >
> 
> > > $ vdpa dev add mgmtdev vdpasim_net name foo2
> > >
> > > Show the newly created vdpa device by its name:
> > > $ vdpa dev show foo2
> > > foo2: type network mgmtdev vdpasim_net vendor_id 0 max_vqs 2
> > > max_vq_size 256
> > >
> > > $ vdpa dev show foo2 -jp
> > > {
> > >     "dev": {
> > >         "foo2": {
> > >             "type": "network",
> > >             "mgmtdev": "vdpasim_net",
> > >             "vendor_id": 0,
> > >             "max_vqs": 2,
> > >             "max_vq_size": 256
> > >         }
> > >     }
> > > }
> > 
> > 
> > I'd like an example of how do device specific (e.g. net specific) interfaces tie
> > in to this.
> Not sure I follow your question.
> Do you mean how to set mac address or mtu of this vdpa device of type net?
> If so, dev add command will be extended shortly in subsequent series to set this net specific attributes.
> (I did mention in the next steps in cover letter).
> 
> > > +static int __init vdpasim_net_init(void) {
> > > +	int ret;
> > > +
> > > +	if (macaddr) {
> > > +		mac_pton(macaddr, macaddr_buf);
> > > +		if (!is_valid_ether_addr(macaddr_buf))
> > > +			return -EADDRNOTAVAIL;
> > > +	} else {
> > > +		eth_random_addr(macaddr_buf);
> > >  	}
> > 
> > Hmm so all devices start out with the same MAC until changed? And how is
> > the change effected?
> Post this patchset and post we have iproute2 vdpa in the tree, will add the mac address as the input attribute during "vdpa dev add" command.
> So that each different vdpa device can have user specified (different) mac address.

For now, maybe just avoid VIRTIO_NET_F_MAC for new devices, then?

-- 
MST


^ permalink raw reply	[flat|nested] 79+ messages in thread

* RE: [PATCH linux-next v3 6/6] vdpa_sim_net: Add support for user supported devices
  2021-01-05 12:14         ` Michael S. Tsirkin
@ 2021-01-05 12:30           ` Parav Pandit
  2021-01-05 13:23             ` Michael S. Tsirkin
  0 siblings, 1 reply; 79+ messages in thread
From: Parav Pandit @ 2021-01-05 12:30 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: virtualization, jasowang, Eli Cohen, netdev



> From: Michael S. Tsirkin <mst@redhat.com>
> Sent: Tuesday, January 5, 2021 5:45 PM
> 
> On Tue, Jan 05, 2021 at 12:02:33PM +0000, Parav Pandit wrote:
> >
> >
> > > From: Michael S. Tsirkin <mst@redhat.com>
> > > Sent: Tuesday, January 5, 2021 5:19 PM
> > >
> > > On Tue, Jan 05, 2021 at 12:32:03PM +0200, Parav Pandit wrote:
> > > > Enable user to create vdpasim net simulate devices.
> > > >
> > > >
> >
> > > > $ vdpa dev add mgmtdev vdpasim_net name foo2
> > > >
> > > > Show the newly created vdpa device by its name:
> > > > $ vdpa dev show foo2
> > > > foo2: type network mgmtdev vdpasim_net vendor_id 0 max_vqs 2
> > > > max_vq_size 256
> > > >
> > > > $ vdpa dev show foo2 -jp
> > > > {
> > > >     "dev": {
> > > >         "foo2": {
> > > >             "type": "network",
> > > >             "mgmtdev": "vdpasim_net",
> > > >             "vendor_id": 0,
> > > >             "max_vqs": 2,
> > > >             "max_vq_size": 256
> > > >         }
> > > >     }
> > > > }
> > >
> > >
> > > I'd like an example of how do device specific (e.g. net specific)
> > > interfaces tie in to this.
> > Not sure I follow your question.
> > Do you mean how to set mac address or mtu of this vdpa device of type
> net?
> > If so, dev add command will be extended shortly in subsequent series to
> set this net specific attributes.
> > (I did mention in the next steps in cover letter).
> >
> > > > +static int __init vdpasim_net_init(void) {
> > > > +	int ret;
> > > > +
> > > > +	if (macaddr) {
> > > > +		mac_pton(macaddr, macaddr_buf);
> > > > +		if (!is_valid_ether_addr(macaddr_buf))
> > > > +			return -EADDRNOTAVAIL;
> > > > +	} else {
> > > > +		eth_random_addr(macaddr_buf);
> > > >  	}
> > >
> > > Hmm so all devices start out with the same MAC until changed? And
> > > how is the change effected?
> > Post this patchset and post we have iproute2 vdpa in the tree, will add the
> mac address as the input attribute during "vdpa dev add" command.
> > So that each different vdpa device can have user specified (different) mac
> address.
> 
> For now maybe just avoid VIRTIO_NET_F_MAC then for new devices then?

That would require bookkeeping of the existing net vdpa_sim devices in order to avoid setting VIRTIO_NET_F_MAC on new ones.
Such bookkeeping code would be short-lived anyway.
Not sure it's worth it.
Until now only one device was created, so I'm not sure two vdpa devices with the same MAC address will be a real issue.

When we add the MAC address attribute to the add command, we can also remove the macaddr module parameter at that point.
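
To make the bookkeeping concrete, here is a rough user-space sketch of what it might look like. It assumes a simple live-device counter and lets only the first device keep VIRTIO_NET_F_MAC; all names (sim_net_live_devs, sim_net_features_on_add, sim_net_on_del) are hypothetical and not part of this series.

```c
#include <stdint.h>

/* Feature bit number from the virtio-net specification. */
#define VIRTIO_NET_F_MAC 5

/* Hypothetical count of live simulator net devices. */
static unsigned int sim_net_live_devs;

/* Feature set for a newly added device: only the first keeps the MAC bit. */
uint64_t sim_net_features_on_add(uint64_t supported)
{
	uint64_t features = supported;

	if (sim_net_live_devs > 0)
		features &= ~(1ULL << VIRTIO_NET_F_MAC);
	sim_net_live_devs++;
	return features;
}

/* Must be called on every delete, otherwise the count drifts. */
void sim_net_on_del(void)
{
	if (sim_net_live_devs > 0)
		sim_net_live_devs--;
}
```

Even this small version needs a matching hook in the delete path, which is the kind of short-lived code I would rather avoid.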

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH linux-next v3 6/6] vdpa_sim_net: Add support for user supported devices
  2021-01-05 12:30           ` Parav Pandit
@ 2021-01-05 13:23             ` Michael S. Tsirkin
  2021-01-07  3:48               ` Parav Pandit
  0 siblings, 1 reply; 79+ messages in thread
From: Michael S. Tsirkin @ 2021-01-05 13:23 UTC (permalink / raw)
  To: Parav Pandit; +Cc: virtualization, jasowang, Eli Cohen, netdev

On Tue, Jan 05, 2021 at 12:30:15PM +0000, Parav Pandit wrote:
> 
> 
> > From: Michael S. Tsirkin <mst@redhat.com>
> > Sent: Tuesday, January 5, 2021 5:45 PM
> > 
> > On Tue, Jan 05, 2021 at 12:02:33PM +0000, Parav Pandit wrote:
> > >
> > >
> > > > From: Michael S. Tsirkin <mst@redhat.com>
> > > > Sent: Tuesday, January 5, 2021 5:19 PM
> > > >
> > > > On Tue, Jan 05, 2021 at 12:32:03PM +0200, Parav Pandit wrote:
> > > > > Enable user to create vdpasim net simulate devices.
> > > > >
> > > > >
> > >
> > > > > $ vdpa dev add mgmtdev vdpasim_net name foo2
> > > > >
> > > > > Show the newly created vdpa device by its name:
> > > > > $ vdpa dev show foo2
> > > > > foo2: type network mgmtdev vdpasim_net vendor_id 0 max_vqs 2
> > > > > max_vq_size 256
> > > > >
> > > > > $ vdpa dev show foo2 -jp
> > > > > {
> > > > >     "dev": {
> > > > >         "foo2": {
> > > > >             "type": "network",
> > > > >             "mgmtdev": "vdpasim_net",
> > > > >             "vendor_id": 0,
> > > > >             "max_vqs": 2,
> > > > >             "max_vq_size": 256
> > > > >         }
> > > > >     }
> > > > > }
> > > >
> > > >
> > > > I'd like an example of how do device specific (e.g. net specific)
> > > > interfaces tie in to this.
> > > Not sure I follow your question.
> > > Do you mean how to set mac address or mtu of this vdpa device of type
> > net?
> > > If so, dev add command will be extended shortly in subsequent series to
> > set this net specific attributes.
> > > (I did mention in the next steps in cover letter).
> > >
> > > > > +static int __init vdpasim_net_init(void) {
> > > > > +	int ret;
> > > > > +
> > > > > +	if (macaddr) {
> > > > > +		mac_pton(macaddr, macaddr_buf);
> > > > > +		if (!is_valid_ether_addr(macaddr_buf))
> > > > > +			return -EADDRNOTAVAIL;
> > > > > +	} else {
> > > > > +		eth_random_addr(macaddr_buf);
> > > > >  	}
> > > >
> > > > Hmm so all devices start out with the same MAC until changed? And
> > > > how is the change effected?
> > > Post this patchset and post we have iproute2 vdpa in the tree, will add the
> > mac address as the input attribute during "vdpa dev add" command.
> > > So that each different vdpa device can have user specified (different) mac
> > address.
> > 
> > For now maybe just avoid VIRTIO_NET_F_MAC then for new devices then?
> 
> That would require book keeping existing net vdpa_sim devices created to avoid setting VIRTIO_NET_F_MAC.
> Such book keeping code will be short lived anyway.
> Not sure if its worth it.
> Until now only one device was created. So not sure two vdpa devices with same mac address will be a real issue.
> 
> When we add mac address attribute in add command, at that point also remove the module parameter macaddr.

Will that be mandatory? I'm not too happy with a UAPI we intend to break
straight away ...

-- 
MST


^ permalink raw reply	[flat|nested] 79+ messages in thread

* RE: [PATCH linux-next v3 6/6] vdpa_sim_net: Add support for user supported devices
  2021-01-05 13:23             ` Michael S. Tsirkin
@ 2021-01-07  3:48               ` Parav Pandit
  2021-01-12  4:14                 ` Parav Pandit
  2021-01-14  4:17                 ` Jason Wang
  0 siblings, 2 replies; 79+ messages in thread
From: Parav Pandit @ 2021-01-07  3:48 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: virtualization, jasowang, Eli Cohen, netdev



> From: Michael S. Tsirkin <mst@redhat.com>
> Sent: Tuesday, January 5, 2021 6:53 PM
> 
> On Tue, Jan 05, 2021 at 12:30:15PM +0000, Parav Pandit wrote:
> >
> >
> > > From: Michael S. Tsirkin <mst@redhat.com>
> > > Sent: Tuesday, January 5, 2021 5:45 PM
> > >
> > > On Tue, Jan 05, 2021 at 12:02:33PM +0000, Parav Pandit wrote:
> > > >
> > > >
> > > > > From: Michael S. Tsirkin <mst@redhat.com>
> > > > > Sent: Tuesday, January 5, 2021 5:19 PM
> > > > >
> > > > > On Tue, Jan 05, 2021 at 12:32:03PM +0200, Parav Pandit wrote:
> > > > > > Enable user to create vdpasim net simulate devices.
> > > > > >
> > > > > >
> > > >
> > > > > > $ vdpa dev add mgmtdev vdpasim_net name foo2
> > > > > >
> > > > > > Show the newly created vdpa device by its name:
> > > > > > $ vdpa dev show foo2
> > > > > > foo2: type network mgmtdev vdpasim_net vendor_id 0 max_vqs 2
> > > > > > max_vq_size 256
> > > > > >
> > > > > > $ vdpa dev show foo2 -jp
> > > > > > {
> > > > > >     "dev": {
> > > > > >         "foo2": {
> > > > > >             "type": "network",
> > > > > >             "mgmtdev": "vdpasim_net",
> > > > > >             "vendor_id": 0,
> > > > > >             "max_vqs": 2,
> > > > > >             "max_vq_size": 256
> > > > > >         }
> > > > > >     }
> > > > > > }
> > > > >
> > > > >
> > > > > I'd like an example of how do device specific (e.g. net
> > > > > specific) interfaces tie in to this.
> > > > Not sure I follow your question.
> > > > Do you mean how to set mac address or mtu of this vdpa device of
> > > > type
> > > net?
> > > > If so, dev add command will be extended shortly in subsequent
> > > > series to
> > > set this net specific attributes.
> > > > (I did mention in the next steps in cover letter).
> > > >
> > > > > > +static int __init vdpasim_net_init(void) {
> > > > > > +	int ret;
> > > > > > +
> > > > > > +	if (macaddr) {
> > > > > > +		mac_pton(macaddr, macaddr_buf);
> > > > > > +		if (!is_valid_ether_addr(macaddr_buf))
> > > > > > +			return -EADDRNOTAVAIL;
> > > > > > +	} else {
> > > > > > +		eth_random_addr(macaddr_buf);
> > > > > >  	}
> > > > >
> > > > > Hmm so all devices start out with the same MAC until changed?
> > > > > And how is the change effected?
> > > > Post this patchset and post we have iproute2 vdpa in the tree,
> > > > will add the
> > > mac address as the input attribute during "vdpa dev add" command.
> > > > So that each different vdpa device can have user specified
> > > > (different) mac
> > > address.
> > >
> > > For now maybe just avoid VIRTIO_NET_F_MAC then for new devices
> then?
> >
> > That would require book keeping existing net vdpa_sim devices created to
> avoid setting VIRTIO_NET_F_MAC.
> > Such book keeping code will be short lived anyway.
> > Not sure if its worth it.
> > Until now only one device was created. So not sure two vdpa devices with
> same mac address will be a real issue.
> >
> > When we add mac address attribute in add command, at that point also
> remove the module parameter macaddr.
> 
> Will that be mandatory? I'm not to happy with a UAPI we intend to break
> straight away ...
No. Specifying the MAC address shouldn't be mandatory. The UAPI won't be broken.

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH linux-next v3 1/6] vdpa_sim_net: Make mac address array static
  2021-01-05 10:31   ` [PATCH linux-next v3 1/6] vdpa_sim_net: Make mac address array static Parav Pandit
@ 2021-01-07 13:45     ` Stefano Garzarella
  0 siblings, 0 replies; 79+ messages in thread
From: Stefano Garzarella @ 2021-01-07 13:45 UTC (permalink / raw)
  To: Parav Pandit; +Cc: virtualization, netdev, elic, mst

On Tue, Jan 05, 2021 at 12:31:58PM +0200, Parav Pandit wrote:
>MAC address array is used only in vdpa_sim_net.c.
>Hence, keep it static.
>
>Signed-off-by: Parav Pandit <parav@nvidia.com>
>Acked-by: Jason Wang <jasowang@redhat.com>
>---
>Changelog:
>v1->v2:
> - new patch
>---
> drivers/vdpa/vdpa_sim/vdpa_sim_net.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)

Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>

>
>diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim_net.c b/drivers/vdpa/vdpa_sim/vdpa_sim_net.c
>index c10b6981fdab..f0482427186b 100644
>--- a/drivers/vdpa/vdpa_sim/vdpa_sim_net.c
>+++ b/drivers/vdpa/vdpa_sim/vdpa_sim_net.c
>@@ -33,7 +33,7 @@ static char *macaddr;
> module_param(macaddr, charp, 0);
> MODULE_PARM_DESC(macaddr, "Ethernet MAC address");
>
>-u8 macaddr_buf[ETH_ALEN];
>+static u8 macaddr_buf[ETH_ALEN];
>
> static struct vdpasim *vdpasim_net_dev;
>
>-- 
>2.26.2
>


^ permalink raw reply	[flat|nested] 79+ messages in thread

* RE: [PATCH linux-next v3 6/6] vdpa_sim_net: Add support for user supported devices
  2021-01-07  3:48               ` Parav Pandit
@ 2021-01-12  4:14                 ` Parav Pandit
  2021-01-14  4:17                 ` Jason Wang
  1 sibling, 0 replies; 79+ messages in thread
From: Parav Pandit @ 2021-01-12  4:14 UTC (permalink / raw)
  To: Parav Pandit, Michael S. Tsirkin; +Cc: netdev, Eli Cohen, virtualization

Hi Michael,

> From: Virtualization <virtualization-bounces@lists.linux-foundation.org> On
> Behalf Of Parav Pandit
> > >
> > > When we add mac address attribute in add command, at that point also
> > remove the module parameter macaddr.
> >
> > Will that be mandatory? I'm not to happy with a UAPI we intend to
> > break straight away ...
> No. Specifying mac address shouldn't be mandatory. UAPI wont' be broken.

Shall we please proceed with this patchset?
I would like to complete the iproute2 part and convert the remaining two drivers to the mgmt tool after this series.


^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH linux-next v3 6/6] vdpa_sim_net: Add support for user supported devices
  2021-01-07  3:48               ` Parav Pandit
  2021-01-12  4:14                 ` Parav Pandit
@ 2021-01-14  4:17                 ` Jason Wang
  2021-01-14  7:58                   ` Parav Pandit
  1 sibling, 1 reply; 79+ messages in thread
From: Jason Wang @ 2021-01-14  4:17 UTC (permalink / raw)
  To: Parav Pandit, Michael S. Tsirkin; +Cc: virtualization, Eli Cohen, netdev


On 2021/1/7 上午11:48, Parav Pandit wrote:
>
>> From: Michael S. Tsirkin <mst@redhat.com>
>> Sent: Tuesday, January 5, 2021 6:53 PM
>>
>> On Tue, Jan 05, 2021 at 12:30:15PM +0000, Parav Pandit wrote:
>>>
>>>> From: Michael S. Tsirkin <mst@redhat.com>
>>>> Sent: Tuesday, January 5, 2021 5:45 PM
>>>>
>>>> On Tue, Jan 05, 2021 at 12:02:33PM +0000, Parav Pandit wrote:
>>>>>
>>>>>> From: Michael S. Tsirkin <mst@redhat.com>
>>>>>> Sent: Tuesday, January 5, 2021 5:19 PM
>>>>>>
>>>>>> On Tue, Jan 05, 2021 at 12:32:03PM +0200, Parav Pandit wrote:
>>>>>>> Enable user to create vdpasim net simulate devices.
>>>>>>>
>>>>>>>
>>>>>>> $ vdpa dev add mgmtdev vdpasim_net name foo2
>>>>>>>
>>>>>>> Show the newly created vdpa device by its name:
>>>>>>> $ vdpa dev show foo2
>>>>>>> foo2: type network mgmtdev vdpasim_net vendor_id 0 max_vqs 2
>>>>>>> max_vq_size 256
>>>>>>>
>>>>>>> $ vdpa dev show foo2 -jp
>>>>>>> {
>>>>>>>      "dev": {
>>>>>>>          "foo2": {
>>>>>>>              "type": "network",
>>>>>>>              "mgmtdev": "vdpasim_net",
>>>>>>>              "vendor_id": 0,
>>>>>>>              "max_vqs": 2,
>>>>>>>              "max_vq_size": 256
>>>>>>>          }
>>>>>>>      }
>>>>>>> }
>>>>>>
>>>>>> I'd like an example of how do device specific (e.g. net
>>>>>> specific) interfaces tie in to this.
>>>>> Not sure I follow your question.
>>>>> Do you mean how to set mac address or mtu of this vdpa device of
>>>>> type
>>>> net?
>>>>> If so, dev add command will be extended shortly in subsequent
>>>>> series to
>>>> set this net specific attributes.
>>>>> (I did mention in the next steps in cover letter).
>>>>>
>>>>>>> +static int __init vdpasim_net_init(void) {
>>>>>>> +	int ret;
>>>>>>> +
>>>>>>> +	if (macaddr) {
>>>>>>> +		mac_pton(macaddr, macaddr_buf);
>>>>>>> +		if (!is_valid_ether_addr(macaddr_buf))
>>>>>>> +			return -EADDRNOTAVAIL;
>>>>>>> +	} else {
>>>>>>> +		eth_random_addr(macaddr_buf);
>>>>>>>   	}
>>>>>> Hmm so all devices start out with the same MAC until changed?
>>>>>> And how is the change effected?
>>>>> Post this patchset and post we have iproute2 vdpa in the tree,
>>>>> will add the
>>>> mac address as the input attribute during "vdpa dev add" command.
>>>>> So that each different vdpa device can have user specified
>>>>> (different) mac
>>>> address.
>>>>
>>>> For now maybe just avoid VIRTIO_NET_F_MAC then for new devices
>> then?
>>> That would require book keeping existing net vdpa_sim devices created to
>> avoid setting VIRTIO_NET_F_MAC.
>>> Such book keeping code will be short lived anyway.
>>> Not sure if its worth it.
>>> Until now only one device was created. So not sure two vdpa devices with
>> same mac address will be a real issue.
>>> When we add mac address attribute in add command, at that point also
>> remove the module parameter macaddr.
>>
>> Will that be mandatory? I'm not to happy with a UAPI we intend to break
>> straight away ...
> No. Specifying mac address shouldn't be mandatory. UAPI wont' be broken.


If it's not mandatory, does it mean the vDPA parent needs to use its own
logic to generate a valid MAC? I'm not sure this is what management
(libvirt) wants.

Thanks


>


^ permalink raw reply	[flat|nested] 79+ messages in thread

* RE: [PATCH linux-next v3 6/6] vdpa_sim_net: Add support for user supported devices
  2021-01-14  4:17                 ` Jason Wang
@ 2021-01-14  7:58                   ` Parav Pandit
  2021-01-15  5:38                     ` Jason Wang
  0 siblings, 1 reply; 79+ messages in thread
From: Parav Pandit @ 2021-01-14  7:58 UTC (permalink / raw)
  To: Jason Wang, Michael S. Tsirkin; +Cc: virtualization, Eli Cohen, netdev



> From: Jason Wang <jasowang@redhat.com>
> Sent: Thursday, January 14, 2021 9:48 AM
> 
> On 2021/1/7 上午11:48, Parav Pandit wrote:
> >
> >> From: Michael S. Tsirkin <mst@redhat.com>
> >> Sent: Tuesday, January 5, 2021 6:53 PM
> >>
> >> On Tue, Jan 05, 2021 at 12:30:15PM +0000, Parav Pandit wrote:
> >>>
> >>>> From: Michael S. Tsirkin <mst@redhat.com>
> >>>> Sent: Tuesday, January 5, 2021 5:45 PM
> >>>>
> >>>> On Tue, Jan 05, 2021 at 12:02:33PM +0000, Parav Pandit wrote:
> >>>>>
> >>>>>> From: Michael S. Tsirkin <mst@redhat.com>
> >>>>>> Sent: Tuesday, January 5, 2021 5:19 PM
> >>>>>>
> >>>>>> On Tue, Jan 05, 2021 at 12:32:03PM +0200, Parav Pandit wrote:
> >>>>>>> Enable user to create vdpasim net simulate devices.
> >>>>>>>
> >>>>>>>
> >>>>>>> $ vdpa dev add mgmtdev vdpasim_net name foo2
> >>>>>>>
> >>>>>>> Show the newly created vdpa device by its name:
> >>>>>>> $ vdpa dev show foo2
> >>>>>>> foo2: type network mgmtdev vdpasim_net vendor_id 0 max_vqs 2
> >>>>>>> max_vq_size 256
> >>>>>>>
> >>>>>>> $ vdpa dev show foo2 -jp
> >>>>>>> {
> >>>>>>>      "dev": {
> >>>>>>>          "foo2": {
> >>>>>>>              "type": "network",
> >>>>>>>              "mgmtdev": "vdpasim_net",
> >>>>>>>              "vendor_id": 0,
> >>>>>>>              "max_vqs": 2,
> >>>>>>>              "max_vq_size": 256
> >>>>>>>          }
> >>>>>>>      }
> >>>>>>> }
> >>>>>>
> >>>>>> I'd like an example of how do device specific (e.g. net
> >>>>>> specific) interfaces tie in to this.
> >>>>> Not sure I follow your question.
> >>>>> Do you mean how to set mac address or mtu of this vdpa device of
> >>>>> type
> >>>> net?
> >>>>> If so, dev add command will be extended shortly in subsequent
> >>>>> series to
> >>>> set this net specific attributes.
> >>>>> (I did mention in the next steps in cover letter).
> >>>>>
> >>>>>>> +static int __init vdpasim_net_init(void) {
> >>>>>>> +	int ret;
> >>>>>>> +
> >>>>>>> +	if (macaddr) {
> >>>>>>> +		mac_pton(macaddr, macaddr_buf);
> >>>>>>> +		if (!is_valid_ether_addr(macaddr_buf))
> >>>>>>> +			return -EADDRNOTAVAIL;
> >>>>>>> +	} else {
> >>>>>>> +		eth_random_addr(macaddr_buf);
> >>>>>>>   	}
> >>>>>> Hmm so all devices start out with the same MAC until changed?
> >>>>>> And how is the change effected?
> >>>>> Post this patchset and post we have iproute2 vdpa in the tree,
> >>>>> will add the
> >>>> mac address as the input attribute during "vdpa dev add" command.
> >>>>> So that each different vdpa device can have user specified
> >>>>> (different) mac
> >>>> address.
> >>>>
> >>>> For now maybe just avoid VIRTIO_NET_F_MAC then for new devices
> >> then?
> >>> That would require book keeping existing net vdpa_sim devices
> >>> created to
> >> avoid setting VIRTIO_NET_F_MAC.
> >>> Such book keeping code will be short lived anyway.
> >>> Not sure if its worth it.
> >>> Until now only one device was created. So not sure two vdpa devices
> >>> with
> >> same mac address will be a real issue.
> >>> When we add mac address attribute in add command, at that point also
> >> remove the module parameter macaddr.
> >>
> >> Will that be mandatory? I'm not to happy with a UAPI we intend to
> >> break straight away ...
> > No. Specifying mac address shouldn't be mandatory. UAPI wont' be
> broken.
> 
> 
> If it's not mandatory. Does it mean the vDPA parent need to use its own logic
> to generate a validate mac? I'm not sure this is what management (libvirt
> want).
> 
There are a few use cases that I see with PFs, VFs and SFs supporting vdpa devices.

1. The user wants to use the VF only for vdpa. Here the user got a VF that was pre-set-up by the sysadmin.
In this case whatever MAC is assigned to the VF can be used by its vdpa device.
Here, the user doesn't need to pass the MAC address at vdpa device creation time.
This works because the same MAC has already been set up in the ACL rules on the switch side.
Non-vdpa users of a VF typically use the VF this way for netdev and rdma functionality.
They might continue the same way for vdpa applications as well.
Here the VF MAC is set using either
(a) the devlink port function set hw_addr command, or
(b) ip link set vf mac.
So the vdpa tool doesn't pass the MAC (it is optional),
though VIRTIO_NET_F_MAC is still valid.

2. The user may want to create one or more vdpa devices out of the mgmt device.
Here the user wants more/full control of all features, overriding what the sysadmin has set up as the MAC of the VF/SF.
In this case the user will specify the MAC via the mgmt tool.
(a) This is also used by those vdpa devices which don't have eswitch offloads.
(b) This will also work with eswitch offloads that do source learning.
(c) The user chooses to use the vdpa device of a VF while the VF netdev and rdma devices are used by the hypervisor for something else as well.
VIRTIO_NET_F_MAC remains valid in all of 2.{a,b,c}.

3. A vendor mgmt device always expects its user to provide a MAC for its vdpa devices.
So when it is not provided, it can either fail with an error message string in extack, or clear VIRTIO_NET_F_MAC and proceed as per the virtio spec's 5.1.5 point 5.

As the common denominator of all the above cases, if QEMU or the user passes the MAC during creation, it will almost always work.
An advanced user, or QEMU with switchdev mode support, who has done 1.a/1.b will omit it.
I do not know how deep the QEMU integration with switchdev mode support is.

With that, mac and mtu as optional input fields provide the necessary flexibility for different stacks to take the shape they desire.
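
The three cases above can be sketched as one selection routine. This is only an illustration in user-space C under my own assumptions: struct mgmt_mac_policy and sketch_pick_mac() are hypothetical names, not existing vdpa APIs, and a real driver would report the failure through extack rather than a bare error code.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ETH_ALEN 6
#define VIRTIO_NET_F_MAC 5 /* bit number from the virtio-net specification */

/* Hypothetical description of what a mgmt device can fall back on. */
struct mgmt_mac_policy {
	const uint8_t *parent_mac;   /* MAC pre-set on the VF/SF, or NULL */
	bool clear_feature_if_unset; /* case 3: drop F_MAC instead of failing */
};

/*
 * Pick the MAC for a new vdpa net device.
 * Returns 0 on success, or -EINVAL when no MAC is available and the
 * policy does not allow clearing VIRTIO_NET_F_MAC.
 */
int sketch_pick_mac(const uint8_t *user_mac, const struct mgmt_mac_policy *pol,
		    uint8_t out[ETH_ALEN], uint64_t *features)
{
	if (user_mac) {			/* case 2: user-supplied MAC wins */
		memcpy(out, user_mac, ETH_ALEN);
		return 0;
	}
	if (pol->parent_mac) {		/* case 1: reuse the VF/SF MAC */
		memcpy(out, pol->parent_mac, ETH_ALEN);
		return 0;
	}
	if (pol->clear_feature_if_unset) {	/* case 3, lenient variant */
		memset(out, 0, ETH_ALEN);
		*features &= ~(1ULL << VIRTIO_NET_F_MAC);
		return 0;
	}
	return -EINVAL;			/* case 3, strict variant */
}
```

The ordering reflects the point that a MAC passed at creation time almost always works, while omitting it relies on whatever the parent device already has.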


^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH linux-next v3 6/6] vdpa_sim_net: Add support for user supported devices
  2021-01-14  7:58                   ` Parav Pandit
@ 2021-01-15  5:38                     ` Jason Wang
  2021-01-15  6:27                       ` Parav Pandit
  2021-01-18 18:03                       ` Parav Pandit
  0 siblings, 2 replies; 79+ messages in thread
From: Jason Wang @ 2021-01-15  5:38 UTC (permalink / raw)
  To: Parav Pandit, Michael S. Tsirkin; +Cc: virtualization, Eli Cohen, netdev


On 2021/1/14 下午3:58, Parav Pandit wrote:
>
>> From: Jason Wang <jasowang@redhat.com>
>> Sent: Thursday, January 14, 2021 9:48 AM
>>
>> On 2021/1/7 上午11:48, Parav Pandit wrote:
>>>> From: Michael S. Tsirkin <mst@redhat.com>
>>>> Sent: Tuesday, January 5, 2021 6:53 PM
>>>>
>>>> On Tue, Jan 05, 2021 at 12:30:15PM +0000, Parav Pandit wrote:
>>>>>> From: Michael S. Tsirkin <mst@redhat.com>
>>>>>> Sent: Tuesday, January 5, 2021 5:45 PM
>>>>>>
>>>>>> On Tue, Jan 05, 2021 at 12:02:33PM +0000, Parav Pandit wrote:
>>>>>>>> From: Michael S. Tsirkin <mst@redhat.com>
>>>>>>>> Sent: Tuesday, January 5, 2021 5:19 PM
>>>>>>>>
>>>>>>>> On Tue, Jan 05, 2021 at 12:32:03PM +0200, Parav Pandit wrote:
>>>>>>>>> Enable user to create vdpasim net simulate devices.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> $ vdpa dev add mgmtdev vdpasim_net name foo2
>>>>>>>>>
>>>>>>>>> Show the newly created vdpa device by its name:
>>>>>>>>> $ vdpa dev show foo2
>>>>>>>>> foo2: type network mgmtdev vdpasim_net vendor_id 0 max_vqs 2
>>>>>>>>> max_vq_size 256
>>>>>>>>>
>>>>>>>>> $ vdpa dev show foo2 -jp
>>>>>>>>> {
>>>>>>>>>       "dev": {
>>>>>>>>>           "foo2": {
>>>>>>>>>               "type": "network",
>>>>>>>>>               "mgmtdev": "vdpasim_net",
>>>>>>>>>               "vendor_id": 0,
>>>>>>>>>               "max_vqs": 2,
>>>>>>>>>               "max_vq_size": 256
>>>>>>>>>           }
>>>>>>>>>       }
>>>>>>>>> }
>>>>>>>> I'd like an example of how do device specific (e.g. net
>>>>>>>> specific) interfaces tie in to this.
>>>>>>> Not sure I follow your question.
>>>>>>> Do you mean how to set mac address or mtu of this vdpa device of
>>>>>>> type
>>>>>> net?
>>>>>>> If so, dev add command will be extended shortly in subsequent
>>>>>>> series to
>>>>>> set this net specific attributes.
>>>>>>> (I did mention in the next steps in cover letter).
>>>>>>>
>>>>>>>>> +static int __init vdpasim_net_init(void) {
>>>>>>>>> +	int ret;
>>>>>>>>> +
>>>>>>>>> +	if (macaddr) {
>>>>>>>>> +		mac_pton(macaddr, macaddr_buf);
>>>>>>>>> +		if (!is_valid_ether_addr(macaddr_buf))
>>>>>>>>> +			return -EADDRNOTAVAIL;
>>>>>>>>> +	} else {
>>>>>>>>> +		eth_random_addr(macaddr_buf);
>>>>>>>>>    	}
>>>>>>>> Hmm so all devices start out with the same MAC until changed?
>>>>>>>> And how is the change effected?
>>>>>>> Post this patchset and post we have iproute2 vdpa in the tree,
>>>>>>> will add the
>>>>>> mac address as the input attribute during "vdpa dev add" command.
>>>>>>> So that each different vdpa device can have user specified
>>>>>>> (different) mac
>>>>>> address.
>>>>>>
>>>>>> For now maybe just avoid VIRTIO_NET_F_MAC then for new devices
>>>> then?
>>>>> That would require book keeping existing net vdpa_sim devices
>>>>> created to
>>>> avoid setting VIRTIO_NET_F_MAC.
>>>>> Such book keeping code will be short lived anyway.
>>>>> Not sure if its worth it.
>>>>> Until now only one device was created. So not sure two vdpa devices
>>>>> with
>>>> same mac address will be a real issue.
>>>>> When we add mac address attribute in add command, at that point also
>>>> remove the module parameter macaddr.
>>>>
>>>> Will that be mandatory? I'm not to happy with a UAPI we intend to
>>>> break straight away ...
>>> No. Specifying mac address shouldn't be mandatory. UAPI wont' be
>> broken.
>>
>>
>> If it's not mandatory. Does it mean the vDPA parent need to use its own logic
>> to generate a validate mac? I'm not sure this is what management (libvirt
>> want).
>>
> There are few use cases that I see with PFs, VFs and SFs supporting vdpa devices.
>
> 1. User wants to use the VF only for vdpa purpose. Here user got the VF which was pre-setup by the sysadmin.
> In this case whatever MAC assigned to the VF can be used by its vdpa device.
> Here, user doesn't need to pass the mac address during vdpa device creation time.
> This is done as the same MAC has been setup in the ACL rules on the switch side.
> Non VDPA users of a VF typically use the VF this way for Netdev and rdma functionality.
> They might continue same way for vdpa application as well.
> Here VF mac is either set using
> (a) devlink port function set hw_addr command or using
> (b) ip link set vf mac
> So vdpa tool didn't pass the mac. (optional).
> Though VIRTIO_NET_F_MAC is still valid.
>
> 2. User may want to create one or more vdpa device out of the mgmt. device.
> Here user wants to more/full control of all features, overriding what sysadmin has setup as MAC of the VF/SF.
> In this case user will specify the MAC via mgmt tool.
> (a) This is also used by those vdpa devices which doesn't have eswitch offloads.
> (b) This will work with eswitch offloads as well who does source learning.
> (c) User chose to use the vdpa device of a VF while VF Netdev and rdma device are used by hypervisor for something else as well.
> VIRTIO_NET_F_MAC remains valid in all 2.{a,b,c}.
>
> 3. A  vendor mgmt. device always expects it user to provide mac for its vdpa devices.
> So when it is not provided, it can fail with error message string in extack or clear the VIRTIO_NET_F_MAC and let it work using virtio spec's 5.1.5 point 5 to proceed.
>
> As common denominator of all above cases, if QEMU or user pass the MAC during creation, it will almost always work.
> Advance user and QEMU with switchdev mode support who has done 1.a/1.b, will omit it.
> I do not know how deep integration of QEMU exist with the switchdev mode support.
>
> With that mac, mtu as optional input fields provide the necessary flexibility for different stacks to take appropriate shape as they desire.


Thanks for the clarification. I think we'd better document the above in 
the patch that introduces the mac setting from management API.


>



* RE: [PATCH linux-next v3 6/6] vdpa_sim_net: Add support for user supported devices
  2021-01-15  5:38                     ` Jason Wang
@ 2021-01-15  6:27                       ` Parav Pandit
  2021-01-19 11:09                         ` Jason Wang
  2021-01-18 18:03                       ` Parav Pandit
  1 sibling, 1 reply; 79+ messages in thread
From: Parav Pandit @ 2021-01-15  6:27 UTC (permalink / raw)
  To: Jason Wang, Michael S. Tsirkin; +Cc: virtualization, Eli Cohen, netdev



> From: Jason Wang <jasowang@redhat.com>
> Sent: Friday, January 15, 2021 11:09 AM
> 
> 
> On 2021/1/14 3:58 PM, Parav Pandit wrote:
> >
> >> From: Jason Wang <jasowang@redhat.com>
> >> Sent: Thursday, January 14, 2021 9:48 AM
> >>
> >> On 2021/1/7 11:48 AM, Parav Pandit wrote:
> >>>> From: Michael S. Tsirkin <mst@redhat.com>
> >>>> Sent: Tuesday, January 5, 2021 6:53 PM
> >>>>
> >>>> On Tue, Jan 05, 2021 at 12:30:15PM +0000, Parav Pandit wrote:
> >>>>>> From: Michael S. Tsirkin <mst@redhat.com>
> >>>>>> Sent: Tuesday, January 5, 2021 5:45 PM
> >>>>>>
> >>>>>> On Tue, Jan 05, 2021 at 12:02:33PM +0000, Parav Pandit wrote:
> >>>>>>>> From: Michael S. Tsirkin <mst@redhat.com>
> >>>>>>>> Sent: Tuesday, January 5, 2021 5:19 PM
> >>>>>>>>
> >>>>>>>> On Tue, Jan 05, 2021 at 12:32:03PM +0200, Parav Pandit wrote:
> >>>>>>>>> Enable user to create vdpasim net simulate devices.
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> $ vdpa dev add mgmtdev vdpasim_net name foo2
> >>>>>>>>>
> >>>>>>>>> Show the newly created vdpa device by its name:
> >>>>>>>>> $ vdpa dev show foo2
> >>>>>>>>> foo2: type network mgmtdev vdpasim_net vendor_id 0 max_vqs 2
> >>>>>>>>> max_vq_size 256
> >>>>>>>>>
> >>>>>>>>> $ vdpa dev show foo2 -jp
> >>>>>>>>> {
> >>>>>>>>>       "dev": {
> >>>>>>>>>           "foo2": {
> >>>>>>>>>               "type": "network",
> >>>>>>>>>               "mgmtdev": "vdpasim_net",
> >>>>>>>>>               "vendor_id": 0,
> >>>>>>>>>               "max_vqs": 2,
> >>>>>>>>>               "max_vq_size": 256
> >>>>>>>>>           }
> >>>>>>>>>       }
> >>>>>>>>> }
> >>>>>>>> I'd like an example of how do device specific (e.g. net
> >>>>>>>> specific) interfaces tie in to this.
> >>>>>>> Not sure I follow your question.
> >>>>>>> Do you mean how to set mac address or mtu of this vdpa device of
> >>>>>>> type
> >>>>>> net?
> >>>>>>> If so, dev add command will be extended shortly in subsequent
> >>>>>>> series to
> >>>>>> set this net specific attributes.
> >>>>>>> (I did mention in the next steps in cover letter).
> >>>>>>>
> >>>>>>>>> +static int __init vdpasim_net_init(void) {
> >>>>>>>>> +	int ret;
> >>>>>>>>> +
> >>>>>>>>> +	if (macaddr) {
> >>>>>>>>> +		mac_pton(macaddr, macaddr_buf);
> >>>>>>>>> +		if (!is_valid_ether_addr(macaddr_buf))
> >>>>>>>>> +			return -EADDRNOTAVAIL;
> >>>>>>>>> +	} else {
> >>>>>>>>> +		eth_random_addr(macaddr_buf);
> >>>>>>>>>    	}
> >>>>>>>> Hmm so all devices start out with the same MAC until changed?
> >>>>>>>> And how is the change effected?
> >>>>>>> Post this patchset and post we have iproute2 vdpa in the tree,
> >>>>>>> will add the
> >>>>>> mac address as the input attribute during "vdpa dev add" command.
> >>>>>>> So that each different vdpa device can have user specified
> >>>>>>> (different) mac
> >>>>>> address.
> >>>>>>
> >>>>>> For now maybe just avoid VIRTIO_NET_F_MAC then for new devices
> >>>> then?
> >>>>> That would require book keeping existing net vdpa_sim devices
> >>>>> created to
> >>>> avoid setting VIRTIO_NET_F_MAC.
> >>>>> Such book keeping code will be short lived anyway.
> >>>>> Not sure if its worth it.
> >>>>> Until now only one device was created. So not sure two vdpa
> >>>>> devices with
> >>>> same mac address will be a real issue.
> >>>>> When we add mac address attribute in add command, at that point
> >>>>> also
> >>>> remove the module parameter macaddr.
> >>>>
> >>>> Will that be mandatory? I'm not to happy with a UAPI we intend to
> >>>> break straight away ...
> >>> No. Specifying mac address shouldn't be mandatory. UAPI wont' be
> >> broken.
> >>
> >>
> >> If it's not mandatory. Does it mean the vDPA parent need to use its
> >> own logic to generate a validate mac? I'm not sure this is what
> >> management (libvirt want).
> >>
> > There are few use cases that I see with PFs, VFs and SFs supporting vdpa devices.
> >
> > 1. User wants to use the VF only for vdpa purpose. Here user got the VF which was pre-setup by the sysadmin.
> > In this case whatever MAC assigned to the VF can be used by its vdpa device.
> > Here, user doesn't need to pass the mac address during vdpa device creation time.
> > This is done as the same MAC has been setup in the ACL rules on the switch side.
> > Non VDPA users of a VF typically use the VF this way for Netdev and rdma functionality.
> > They might continue same way for vdpa application as well.
> > Here VF mac is either set using
> > (a) devlink port function set hw_addr command or using
> > (b) ip link set vf mac
> > So vdpa tool didn't pass the mac. (optional).
> > Though VIRTIO_NET_F_MAC is still valid.
> >
> > 2. User may want to create one or more vdpa device out of the mgmt. device.
> > Here user wants to more/full control of all features, overriding what sysadmin has setup as MAC of the VF/SF.
> > In this case user will specify the MAC via mgmt tool.
> > (a) This is also used by those vdpa devices which doesn't have eswitch offloads.
> > (b) This will work with eswitch offloads as well who does source learning.
> > (c) User chose to use the vdpa device of a VF while VF Netdev and rdma device are used by hypervisor for something else as well.
> > VIRTIO_NET_F_MAC remains valid in all 2.{a,b,c}.
> >
> > 3. A  vendor mgmt. device always expects it user to provide mac for its vdpa devices.
> > So when it is not provided, it can fail with error message string in extack or clear the VIRTIO_NET_F_MAC and let it work using virtio spec's 5.1.5 point 5 to proceed.
> >
> > As common denominator of all above cases, if QEMU or user pass the MAC during creation, it will almost always work.
> > Advance user and QEMU with switchdev mode support who has done 1.a/1.b, will omit it.
> > I do not know how deep integration of QEMU exist with the switchdev mode support.
> >
> > With that mac, mtu as optional input fields provide the necessary flexibility for different stacks to take appropriate shape as they desire.
> 
> 
> Thanks for the clarification. I think we'd better document the above in the
> patch that introduces the mac setting from management API.

Yes. Will do.
Thanks.


* RE: [PATCH linux-next v3 6/6] vdpa_sim_net: Add support for user supported devices
  2021-01-15  5:38                     ` Jason Wang
  2021-01-15  6:27                       ` Parav Pandit
@ 2021-01-18 18:03                       ` Parav Pandit
  2021-01-20  7:53                         ` Michael S. Tsirkin
  1 sibling, 1 reply; 79+ messages in thread
From: Parav Pandit @ 2021-01-18 18:03 UTC (permalink / raw)
  To: Jason Wang, Michael S. Tsirkin; +Cc: virtualization, Eli Cohen, netdev

Hi Michael, Jason,

> From: Jason Wang <jasowang@redhat.com>
> Sent: Friday, January 15, 2021 11:09 AM
> 
> 
> Thanks for the clarification. I think we'd better document the above in the
> patch that introduces the mac setting from management API.

Can we proceed with this patchset?
We would like to proceed next with iproute2/vdpa, mac address support, and other drivers after this series, in this kernel version.


* RE: [PATCH 0/7] Introduce vdpa management tool
  2020-12-08 22:47   ` David Ahern
@ 2021-01-19  4:21     ` Parav Pandit
  0 siblings, 0 replies; 79+ messages in thread
From: Parav Pandit @ 2021-01-19  4:21 UTC (permalink / raw)
  To: David Ahern, Jason Wang, virtualization, Stephen Hemminger
  Cc: mst, Eli Cohen, netdev, 谢永吉

Hi David,

> From: David Ahern <dsahern@gmail.com>
> Sent: Wednesday, December 9, 2020 4:17 AM
> 
> On 11/26/20 8:53 PM, Jason Wang wrote:
> > 1. Where does userspace vdpa tool reside which users can use?
> > Ans: vdpa tool can possibly reside in iproute2 [1] as it enables user
> > to create vdpa net devices.
> 
> iproute2 package is fine with us, but there are some expectations:
> syntax, command options and documentation need to be consistent with
> other iproute2 commands (this thread suggests it will be but just being clear),
> and it needs to re-use code as much as possible (e.g., json functions). If there
> is overlap with other tools (devlink, dcb, etc), you should refactor into
> common code used by all. Petr Machata has done this quite a bit for dcb and
> is a good example to follow.

Sorry for my late reply. I missed your message until yesterday.
Thanks for the ack and inputs.
Yes, I have migrated iproute2/vdpa to use the common code introduced by the dcb tool.
I am waiting for the kernel side to finish.



* Re: [PATCH linux-next v3 6/6] vdpa_sim_net: Add support for user supported devices
  2021-01-15  6:27                       ` Parav Pandit
@ 2021-01-19 11:09                         ` Jason Wang
  2021-01-20  3:21                           ` Parav Pandit
  0 siblings, 1 reply; 79+ messages in thread
From: Jason Wang @ 2021-01-19 11:09 UTC (permalink / raw)
  To: Parav Pandit, Michael S. Tsirkin
  Cc: virtualization, Eli Cohen, netdev, Sean Mooney


On 2021/1/15 2:27 PM, Parav Pandit wrote:
>>> With that mac, mtu as optional input fields provide the necessary flexibility for different stacks to take appropriate shape as they desire.
>>
>>
>> Thanks for the clarification. I think we'd better document the above in the
>> patch that introduces the mac setting from management API.
> Yes. Will do.
> Thanks.


Adding Sean.

Regarding the mac address setting: do we plan to allow modifying the mac
address after creation? It looks like OpenStack wants this.

Sean may share more information on this.

Thanks



* RE: [PATCH linux-next v3 6/6] vdpa_sim_net: Add support for user supported devices
  2021-01-19 11:09                         ` Jason Wang
@ 2021-01-20  3:21                           ` Parav Pandit
  2021-01-20  3:46                             ` Parav Pandit
  0 siblings, 1 reply; 79+ messages in thread
From: Parav Pandit @ 2021-01-20  3:21 UTC (permalink / raw)
  To: Jason Wang, Michael S. Tsirkin
  Cc: virtualization, Eli Cohen, netdev, Sean Mooney

> From: Jason Wang <jasowang@redhat.com>
> Sent: Tuesday, January 19, 2021 4:39 PM
> To: Parav Pandit <parav@nvidia.com>; Michael S. Tsirkin <mst@redhat.com>
> Cc: virtualization@lists.linux-foundation.org; Eli Cohen <elic@nvidia.com>;
> netdev@vger.kernel.org; Sean Mooney <smooney@redhat.com>
> Subject: Re: [PATCH linux-next v3 6/6] vdpa_sim_net: Add support for user
> supported devices
> 
> 
> On 2021/1/15 2:27 PM, Parav Pandit wrote:
> >>> With that mac, mtu as optional input fields provide the necessary flexibility for different stacks to take appropriate shape as they desire.
> >>
> >>
> >> Thanks for the clarification. I think we'd better document the above
> >> in the patch that introduces the mac setting from management API.
> > Yes. Will do.
> > Thanks.
> 
> 
> Adding Sean.
> 
> Regarding to mac address setting. Do we plan to allow to modify mac
> address after the creation? It looks like Openstack wants this.
>
The mac address is exposed through the features, so yes, it should be possible to modify it in the future as part of a features modify command.
The user needs to make sure that the device is not attached to vhost or a higher layer stack when the device configuration layout is modified.
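
A minimal user-space sketch of that constraint follows. No such modify command exists yet, so the struct, the `attached` flag, and the function name are all hypothetical. It only shows the intended ordering: refuse the change while an upper layer is using the device.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define ETH_ALEN 6

/* Hypothetical, simplified view of a vdpa net device. */
struct sketch_vdpa_dev {
	bool attached;         /* true while vhost or another upper layer uses it */
	uint8_t mac[ETH_ALEN]; /* mac field of the device config space */
};

/* Change the MAC only while nothing is attached; otherwise report -EBUSY. */
static int sketch_vdpa_set_mac(struct sketch_vdpa_dev *dev,
			       const uint8_t *new_mac)
{
	assert(dev != NULL && new_mac != NULL);

	if (dev->attached)
		return -EBUSY;

	memcpy(dev->mac, new_mac, ETH_ALEN);
	return 0;
}
```

Whether the kernel enforces this or leaves it to the user, as stated above, is a design choice for the future series; the sketch enforces it only for illustration.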
 
> Sean may share more information on this.
> 
> Thanks



* RE: [PATCH linux-next v3 6/6] vdpa_sim_net: Add support for user supported devices
  2021-01-20  3:21                           ` Parav Pandit
@ 2021-01-20  3:46                             ` Parav Pandit
  0 siblings, 0 replies; 79+ messages in thread
From: Parav Pandit @ 2021-01-20  3:46 UTC (permalink / raw)
  To: Parav Pandit, Jason Wang, Michael S. Tsirkin
  Cc: netdev, Eli Cohen, Sean Mooney, virtualization



> From: Virtualization <virtualization-bounces@lists.linux-foundation.org> On
> Behalf Of Parav Pandit
> 
> > From: Jason Wang <jasowang@redhat.com>
> > Sent: Tuesday, January 19, 2021 4:39 PM
> > To: Parav Pandit <parav@nvidia.com>; Michael S. Tsirkin
> > <mst@redhat.com>
> > Cc: virtualization@lists.linux-foundation.org; Eli Cohen
> > <elic@nvidia.com>; netdev@vger.kernel.org; Sean Mooney
> > <smooney@redhat.com>
> > Subject: Re: [PATCH linux-next v3 6/6] vdpa_sim_net: Add support for
> > user supported devices
> >
> >
> > On 2021/1/15 2:27 PM, Parav Pandit wrote:
> > >>> With that mac, mtu as optional input fields provide the necessary flexibility for different stacks to take appropriate shape as they desire.
> > >>
> > >>
> > >> Thanks for the clarification. I think we'd better document the above in the patch that introduces the mac setting from management API.
> > > Yes. Will do.
> > > Thanks.
> >
> >
> > Adding Sean.
> >
> > Regarding to mac address setting. Do we plan to allow to modify mac
> > address after the creation? It looks like Openstack wants this.
> >
> Mac address is exposed in the features so yes, it should be possible to
> modify it as part of features modify command. (in future).
> User needs to make sure that device is not attached to vhost or higher layer
> stack when device configuration layout is modified.
> 
Just curious, why can't OpenStack set the mac address at device creation time?
Is the vdpa device created by non-OpenStack software?
Does it always want to set the mac address of the vdpa device?

> > Sean may share more information on this.


* Re: [PATCH linux-next v3 6/6] vdpa_sim_net: Add support for user supported devices
  2021-01-18 18:03                       ` Parav Pandit
@ 2021-01-20  7:53                         ` Michael S. Tsirkin
  0 siblings, 0 replies; 79+ messages in thread
From: Michael S. Tsirkin @ 2021-01-20  7:53 UTC (permalink / raw)
  To: Parav Pandit; +Cc: Jason Wang, virtualization, Eli Cohen, netdev

On Mon, Jan 18, 2021 at 06:03:57PM +0000, Parav Pandit wrote:
> Hi Michael, Jason,
> 
> > From: Jason Wang <jasowang@redhat.com>
> > Sent: Friday, January 15, 2021 11:09 AM
> > 
> > 
> > Thanks for the clarification. I think we'd better document the above in the
> > patch that introduces the mac setting from management API.
> 
> Can we proceed with this patchset?
> We like to progress next to iproute2/vdpa, mac and other drivers post this series in this kernel version.

Let me put this in next so it can get some testing there for a week or
so.




Thread overview: 79+ messages
2020-11-12  6:39 [PATCH 0/7] Introduce vdpa management tool Parav Pandit
2020-11-12  6:39 ` [PATCH 1/7] vdpa: Add missing comment for virtqueue count Parav Pandit
2020-11-12  6:40 ` [PATCH 2/7] vdpa: Use simpler version of ida allocation Parav Pandit
2020-11-12  6:40 ` [PATCH 3/7] vdpa: Extend routine to accept vdpa device name Parav Pandit
2020-11-12  6:40 ` [PATCH 4/7] vdpa: Define vdpa parent device, ops and a netlink interface Parav Pandit
2020-11-12  6:40 ` [PATCH 5/7] vdpa: Enable a user to add and delete a vdpa device Parav Pandit
2020-11-12  6:40 ` [PATCH 6/7] vdpa: Enable user to query vdpa device info Parav Pandit
2020-11-12  6:40 ` [PATCH 7/7] vdpa/vdpa_sim: Enable user to create vdpasim net devices Parav Pandit
2020-11-16  9:41 ` [PATCH 0/7] Introduce vdpa management tool Stefan Hajnoczi
2020-11-17 19:41   ` Parav Pandit
2020-11-16 22:23 ` Jakub Kicinski
2020-11-17 19:51   ` Parav Pandit
2020-12-16  9:13     ` Michael S. Tsirkin
     [not found]       ` <20201216080610.08541f44@kicinski-fedora-pc1c0hjn.dhcp.thefacebook.com>
2020-12-16 16:54         ` Parav Pandit
2020-12-16 19:57           ` Michael S. Tsirkin
2020-12-17 12:13             ` Parav Pandit
2020-11-27  3:53 ` Jason Wang
     [not found]   ` <CACycT3sYScObb9nN3g7L3cesjE7sCZWxZ5_5R1usGU9ePZEeqA@mail.gmail.com>
2020-11-30  3:36     ` [External] " Jason Wang
2020-11-30  7:07       ` Yongji Xie
2020-12-01  6:25         ` Jason Wang
2020-12-01  9:55           ` Yongji Xie
2020-12-01 11:32             ` Parav Pandit
2020-12-01 14:18               ` Yongji Xie
2020-12-01 15:58                 ` Parav Pandit
2020-12-02  3:29                   ` Yongji Xie
2020-12-02  4:53                     ` Parav Pandit
2020-12-02  5:51                       ` Jason Wang
2020-12-02  6:24                         ` Parav Pandit
2020-12-02  7:55                           ` Jason Wang
2020-12-02  9:27                         ` Yongji Xie
2020-12-02  9:21                       ` Yongji Xie
2020-12-02 11:13                         ` Parav Pandit
2020-12-02 13:18                           ` Yongji Xie
2020-12-02  5:48             ` Jason Wang
2020-12-08 22:47   ` David Ahern
2021-01-19  4:21     ` Parav Pandit
2020-12-16  9:16 ` Michael S. Tsirkin
2021-01-04  3:31 ` [PATCH linux-next v2 " Parav Pandit
2021-01-04  3:31   ` [PATCH linux-next v2 1/7] vdpa_sim_net: Make mac address array static Parav Pandit
2021-01-04  7:00     ` Jason Wang
2021-01-04  3:31   ` [PATCH linux-next v2 2/7] vdpa_sim_net: Add module param to disable default vdpa net device Parav Pandit
2021-01-04  3:31   ` [PATCH linux-next v2 3/7] vdpa: Extend routine to accept vdpa device name Parav Pandit
2021-01-04  3:31   ` [PATCH linux-next v2 4/7] vdpa: Define vdpa mgmt device, ops and a netlink interface Parav Pandit
2021-01-04  7:03     ` Jason Wang
2021-01-04  7:24       ` Parav Pandit
2021-01-05  4:10         ` Jason Wang
2021-01-05  6:33           ` Parav Pandit
2021-01-05  8:36             ` Jason Wang
2021-01-04  3:31   ` [PATCH linux-next v2 5/7] vdpa: Enable a user to add and delete a vdpa device Parav Pandit
2021-01-04  3:31   ` [PATCH linux-next v2 6/7] vdpa: Enable user to query vdpa device info Parav Pandit
2021-01-04  3:31   ` [PATCH linux-next v2 7/7] vdpa_sim_net: Add support for user supported devices Parav Pandit
2021-01-04  7:05     ` Jason Wang
2021-01-04  7:21       ` Parav Pandit
2021-01-05  4:06         ` Jason Wang
2021-01-05  6:22           ` Parav Pandit
2021-01-05 10:31 ` [PATCH linux-next v3 0/6] Introduce vdpa management tool Parav Pandit
2021-01-05 10:31   ` [PATCH linux-next v3 1/6] vdpa_sim_net: Make mac address array static Parav Pandit
2021-01-07 13:45     ` Stefano Garzarella
2021-01-05 10:31   ` [PATCH linux-next v3 2/6] vdpa: Extend routine to accept vdpa device name Parav Pandit
2021-01-05 10:32   ` [PATCH linux-next v3 3/6] vdpa: Define vdpa mgmt device, ops and a netlink interface Parav Pandit
2021-01-05 10:32   ` [PATCH linux-next v3 4/6] vdpa: Enable a user to add and delete a vdpa device Parav Pandit
2021-01-05 10:32   ` [PATCH linux-next v3 5/6] vdpa: Enable user to query vdpa device info Parav Pandit
2021-01-05 10:32   ` [PATCH linux-next v3 6/6] vdpa_sim_net: Add support for user supported devices Parav Pandit
2021-01-05 11:48     ` Michael S. Tsirkin
2021-01-05 12:02       ` Parav Pandit
2021-01-05 12:14         ` Michael S. Tsirkin
2021-01-05 12:30           ` Parav Pandit
2021-01-05 13:23             ` Michael S. Tsirkin
2021-01-07  3:48               ` Parav Pandit
2021-01-12  4:14                 ` Parav Pandit
2021-01-14  4:17                 ` Jason Wang
2021-01-14  7:58                   ` Parav Pandit
2021-01-15  5:38                     ` Jason Wang
2021-01-15  6:27                       ` Parav Pandit
2021-01-19 11:09                         ` Jason Wang
2021-01-20  3:21                           ` Parav Pandit
2021-01-20  3:46                             ` Parav Pandit
2021-01-18 18:03                       ` Parav Pandit
2021-01-20  7:53                         ` Michael S. Tsirkin
