* [PATCH v2 0/5] allow procinfo and pdump on eth vdev
@ 2018-04-05 17:44 Jianfeng Tan
  2018-04-05 17:44 ` [PATCH v2 1/5] eal: bring forward multi-process channel init Jianfeng Tan
                   ` (6 more replies)
  0 siblings, 7 replies; 9+ messages in thread
From: Jianfeng Tan @ 2018-04-05 17:44 UTC
  To: dev; +Cc: thomas, Jianfeng Tan

v2:
  - Add spinlock for vdev device list as suggested by Anatoly.
  - Add ring, cxgbe and remove the free in each PMD as suggested by Matan.
  - Rebase on master.

As we know, vdev currently has the following limitations:
  - dpdk-procinfo cannot get the stats of (most) vdevs in the primary process;
  - dpdk-pdump cannot dump the packets of (most) vdevs in the primary process;
  - a secondary process cannot use (most) vdevs in the primary process.

The root cause is that the secondary process does not know that those
vdevs exist: vdevs are chained on a linked list in process-local memory,
which is not shared with the secondary.

In this patch series, we would like to propose a vdev sharing model like this:
  - As a secondary process boots, all devices (including vdevs) in the
    primary will be automatically shared.
  After both the primary and secondary processes have booted:
  - Device add/remove in the primary will be translated to a device hot
    plug/unplug event in the secondary processes. (TODO)
  - Device add in a secondary:
    * If that kind of device supports multi-process, the secondary will
      request the primary to probe the device, and the primary will share
      it with the secondary. A secondary-private device is not necessary
      in this case. (TODO)
    * If that kind of device does not support multi-process, the secondary
      will probe the device by itself, and the port id is shared among
      all primary/secondary processes.

This patch series does not:
  - provide secondary data path (Rx/Tx) support for each specific vdev.

How to test:

Step 0: start testpmd with a vhost port, and connect a VM to the vhost port.

Step 1: try using dpdk-procinfo to get the stats.
 $(dpdk-procinfo) --log-level=8 --no-pci -- --stats

Step 2: try using dpdk-pdump to dump the packets.
 $(dpdk-pdump) -- --pdump 'port=0,queue=*,rx-dev=/tmp/rx.pcap'

Jianfeng Tan (5):
  eal: bring forward multi-process channel init
  bus/vdev: add lock on vdev device list
  bus/vdev: bus scan by multi-process channel
  drivers/net: not use private eth dev data
  drivers/net: share vdev data to secondary process

 drivers/bus/vdev/Makefile                 |   1 +
 drivers/bus/vdev/vdev.c                   | 187 ++++++++++++++++++++++++++----
 drivers/net/af_packet/rte_eth_af_packet.c |  43 +++----
 drivers/net/bonding/rte_eth_bond_pmd.c    |  13 +++
 drivers/net/cxgbe/cxgbe_main.c            |   1 -
 drivers/net/failsafe/failsafe.c           |  14 +++
 drivers/net/kni/rte_eth_kni.c             |  26 +++--
 drivers/net/null/rte_eth_null.c           |  32 ++---
 drivers/net/octeontx/octeontx_ethdev.c    |  29 ++---
 drivers/net/pcap/rte_eth_pcap.c           |  32 ++---
 drivers/net/ring/rte_eth_ring.c           |  17 +--
 drivers/net/softnic/rte_eth_softnic.c     |  19 ++-
 drivers/net/tap/rte_eth_tap.c             |  24 ++--
 drivers/net/vhost/rte_eth_vhost.c         |  36 +++---
 lib/librte_eal/bsdapp/eal/eal.c           |  23 ++--
 lib/librte_eal/linuxapp/eal/eal.c         |  23 ++--
 16 files changed, 354 insertions(+), 166 deletions(-)

-- 
2.7.4


* [PATCH v2 1/5] eal: bring forward multi-process channel init
  2018-04-05 17:44 [PATCH v2 0/5] allow procinfo and pdump on eth vdev Jianfeng Tan
@ 2018-04-05 17:44 ` Jianfeng Tan
  2018-04-05 17:45 ` [PATCH v2 2/5] bus/vdev: add lock on vdev device list Jianfeng Tan
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 9+ messages in thread
From: Jianfeng Tan @ 2018-04-05 17:44 UTC
  To: dev; +Cc: thomas, Jianfeng Tan

Adjust the init sequence: put mp channel init before bus scan
so that we can init the vdev bus through mp channel in the
secondary process before the bus scan.

Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
---
 lib/librte_eal/bsdapp/eal/eal.c   | 23 +++++++++++++----------
 lib/librte_eal/linuxapp/eal/eal.c | 23 +++++++++++++----------
 2 files changed, 26 insertions(+), 20 deletions(-)

diff --git a/lib/librte_eal/bsdapp/eal/eal.c b/lib/librte_eal/bsdapp/eal/eal.c
index 4eafcb5..b469382 100644
--- a/lib/librte_eal/bsdapp/eal/eal.c
+++ b/lib/librte_eal/bsdapp/eal/eal.c
@@ -544,6 +544,19 @@ rte_eal_init(int argc, char **argv)
 		return -1;
 	}
 
+	rte_config_init();
+
+	/* Put mp channel init before bus scan so that we can init the vdev
+	 * bus through mp channel in the secondary process before the bus scan.
+	 */
+	if (rte_mp_channel_init() < 0) {
+		rte_eal_init_alert("failed to init mp channel\n");
+		if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
+			rte_errno = EFAULT;
+			return -1;
+		}
+	}
+
 	if (rte_bus_scan()) {
 		rte_eal_init_alert("Cannot scan the buses for devices\n");
 		rte_errno = ENODEV;
@@ -583,16 +596,6 @@ rte_eal_init(int argc, char **argv)
 
 	rte_srand(rte_rdtsc());
 
-	rte_config_init();
-
-	if (rte_mp_channel_init() < 0) {
-		rte_eal_init_alert("failed to init mp channel\n");
-		if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
-			rte_errno = EFAULT;
-			return -1;
-		}
-	}
-
 	if (rte_eal_memory_init() < 0) {
 		rte_eal_init_alert("Cannot init memory\n");
 		rte_errno = ENOMEM;
diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index da005dd..d0edf77 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -765,6 +765,19 @@ rte_eal_init(int argc, char **argv)
 		return -1;
 	}
 
+	rte_config_init();
+
+	/* Put mp channel init before bus scan so that we can init the vdev
+	 * bus through mp channel in the secondary process before the bus scan.
+	 */
+	if (rte_mp_channel_init() < 0) {
+		rte_eal_init_alert("failed to init mp channel\n");
+		if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
+			rte_errno = EFAULT;
+			return -1;
+		}
+	}
+
 	if (rte_bus_scan()) {
 		rte_eal_init_alert("Cannot scan the buses for devices\n");
 		rte_errno = ENODEV;
@@ -811,8 +824,6 @@ rte_eal_init(int argc, char **argv)
 
 	rte_srand(rte_rdtsc());
 
-	rte_config_init();
-
 	if (rte_eal_log_init(logid, internal_config.syslog_facility) < 0) {
 		rte_eal_init_alert("Cannot init logging.");
 		rte_errno = ENOMEM;
@@ -820,14 +831,6 @@ rte_eal_init(int argc, char **argv)
 		return -1;
 	}
 
-	if (rte_mp_channel_init() < 0) {
-		rte_eal_init_alert("failed to init mp channel\n");
-		if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
-			rte_errno = EFAULT;
-			return -1;
-		}
-	}
-
 #ifdef VFIO_PRESENT
 	if (rte_eal_vfio_setup() < 0) {
 		rte_eal_init_alert("Cannot init VFIO\n");
-- 
2.7.4


* [PATCH v2 2/5] bus/vdev: add lock on vdev device list
  2018-04-05 17:44 [PATCH v2 0/5] allow procinfo and pdump on eth vdev Jianfeng Tan
  2018-04-05 17:44 ` [PATCH v2 1/5] eal: bring forward multi-process channel init Jianfeng Tan
@ 2018-04-05 17:45 ` Jianfeng Tan
  2018-04-05 17:45 ` [PATCH v2 3/5] bus/vdev: bus scan by multi-process channel Jianfeng Tan
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 9+ messages in thread
From: Jianfeng Tan @ 2018-04-05 17:45 UTC
  To: dev; +Cc: thomas, Jianfeng Tan

As virtual devices can now be added from different threads, add a
spinlock to protect the vdev device list.

Suggested-by: Anatoly Burakov <anatoly.burakov@intel.com>
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
---
 drivers/bus/vdev/vdev.c | 61 +++++++++++++++++++++++++++++++++++++------------
 1 file changed, 47 insertions(+), 14 deletions(-)
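
For illustration only (not part of this patch): a minimal standalone
sketch of the check-then-insert pattern described above, where the name
lookup and the list insertion happen under one lock so that two threads
cannot both add the same name. It assumes a C11 atomic_flag as a stand-in
for rte_spinlock_t, and every demo_* name is hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdlib.h>
#include <string.h>
#include <sys/queue.h>

#define DEMO_NAME_LEN 64

struct demo_dev {
	TAILQ_ENTRY(demo_dev) next;
	char name[DEMO_NAME_LEN];
};

TAILQ_HEAD(demo_dev_list, demo_dev);

static struct demo_dev_list demo_devs = TAILQ_HEAD_INITIALIZER(demo_devs);
/* Assumption: a C11 atomic_flag plays the role of the spinlock. */
static atomic_flag demo_lock = ATOMIC_FLAG_INIT;

static void
demo_list_lock(void)
{
	while (atomic_flag_test_and_set_explicit(&demo_lock,
						 memory_order_acquire))
		; /* spin until the flag is released */
}

static void
demo_list_unlock(void)
{
	atomic_flag_clear_explicit(&demo_lock, memory_order_release);
}

/* The caller must hold the list lock. */
static struct demo_dev *
demo_find(const char *name)
{
	struct demo_dev *dev;

	assert(name != NULL);
	TAILQ_FOREACH(dev, &demo_devs, next) {
		if (strcmp(dev->name, name) == 0)
			return dev;
	}
	return NULL;
}

/* Allocate outside the lock; check for duplicates and insert inside it. */
static int
demo_insert(const char *name)
{
	struct demo_dev *dev;

	if (name == NULL || strlen(name) >= DEMO_NAME_LEN)
		return -EINVAL;

	dev = calloc(1, sizeof(*dev));
	if (dev == NULL)
		return -ENOMEM;
	strcpy(dev->name, name);

	demo_list_lock();
	if (demo_find(name) != NULL) {
		demo_list_unlock();
		free(dev);
		return -EEXIST;
	}
	TAILQ_INSERT_TAIL(&demo_devs, dev, next);
	demo_list_unlock();
	return 0;
}

/* Look up and unlink under the lock; free after the lock is dropped. */
static int
demo_remove(const char *name)
{
	struct demo_dev *dev;

	if (name == NULL)
		return -EINVAL;

	demo_list_lock();
	dev = demo_find(name);
	if (dev == NULL) {
		demo_list_unlock();
		return -ENOENT;
	}
	TAILQ_REMOVE(&demo_devs, dev, next);
	demo_list_unlock();
	free(dev);
	return 0;
}
```

Inserting the same name twice returns -EEXIST the second time, and
removing a name that is not on the list returns -ENOENT. The lock is held
only around the list operations, never around allocation or free.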

diff --git a/drivers/bus/vdev/vdev.c b/drivers/bus/vdev/vdev.c
index 7eae319..1d1c642 100644
--- a/drivers/bus/vdev/vdev.c
+++ b/drivers/bus/vdev/vdev.c
@@ -61,6 +61,8 @@ TAILQ_HEAD(vdev_device_list, rte_vdev_device);
 
 static struct vdev_device_list vdev_device_list =
 	TAILQ_HEAD_INITIALIZER(vdev_device_list);
+static rte_spinlock_t vdev_device_list_lock = RTE_SPINLOCK_INITIALIZER;
+
 struct vdev_driver_list vdev_driver_list =
 	TAILQ_HEAD_INITIALIZER(vdev_driver_list);
 
@@ -177,6 +179,7 @@ vdev_probe_all_drivers(struct rte_vdev_device *dev)
 	return ret;
 }
 
+/* The caller shall be responsible for thread-safe */
 static struct rte_vdev_device *
 find_vdev(const char *name)
 {
@@ -231,10 +234,6 @@ rte_vdev_init(const char *name, const char *args)
 	if (name == NULL)
 		return -EINVAL;
 
-	dev = find_vdev(name);
-	if (dev)
-		return -EEXIST;
-
 	devargs = alloc_devargs(name, args);
 	if (!devargs)
 		return -ENOMEM;
@@ -249,16 +248,28 @@ rte_vdev_init(const char *name, const char *args)
 	dev->device.numa_node = SOCKET_ID_ANY;
 	dev->device.name = devargs->name;
 
+	rte_spinlock_lock(&vdev_device_list_lock);
+	if (find_vdev(name)) {
+		rte_spinlock_unlock(&vdev_device_list_lock);
+		ret = -EEXIST;
+		goto fail;
+	}
+	TAILQ_INSERT_TAIL(&vdev_device_list, dev, next);
+	rte_spinlock_unlock(&vdev_device_list_lock);
+
 	ret = vdev_probe_all_drivers(dev);
 	if (ret) {
 		if (ret > 0)
 			VDEV_LOG(ERR, "no driver found for %s\n", name);
+		/* If fails, remove it from vdev list */
+		rte_spinlock_lock(&vdev_device_list_lock);
+		TAILQ_REMOVE(&vdev_device_list, dev, next);
+		rte_spinlock_unlock(&vdev_device_list_lock);
 		goto fail;
 	}
 
 	TAILQ_INSERT_TAIL(&devargs_list, devargs, next);
 
-	TAILQ_INSERT_TAIL(&vdev_device_list, dev, next);
 	return 0;
 
 fail:
@@ -294,17 +305,25 @@ rte_vdev_uninit(const char *name)
 	if (name == NULL)
 		return -EINVAL;
 
+	rte_spinlock_lock(&vdev_device_list_lock);
 	dev = find_vdev(name);
-	if (!dev)
+	if (!dev) {
+		rte_spinlock_unlock(&vdev_device_list_lock);
 		return -ENOENT;
+	}
+	TAILQ_REMOVE(&vdev_device_list, dev, next);
+	rte_spinlock_unlock(&vdev_device_list_lock);
 
 	devargs = dev->device.devargs;
 
 	ret = vdev_remove_driver(dev);
-	if (ret)
+	if (ret) {
+		/* If fails, add back to vdev list */
+		rte_spinlock_lock(&vdev_device_list_lock);
+		TAILQ_INSERT_TAIL(&vdev_device_list, dev, next);
+		rte_spinlock_unlock(&vdev_device_list_lock);
 		return ret;
-
-	TAILQ_REMOVE(&vdev_device_list, dev, next);
+	}
 
 	TAILQ_REMOVE(&devargs_list, devargs, next);
 
@@ -342,19 +361,25 @@ vdev_scan(void)
 		if (devargs->bus != &rte_vdev_bus)
 			continue;
 
-		dev = find_vdev(devargs->name);
-		if (dev)
-			continue;
-
 		dev = calloc(1, sizeof(*dev));
 		if (!dev)
 			return -1;
 
+		rte_spinlock_lock(&vdev_device_list_lock);
+
+		if (find_vdev(devargs->name)) {
+			rte_spinlock_unlock(&vdev_device_list_lock);
+			free(dev);
+			continue;
+		}
+
 		dev->device.devargs = devargs;
 		dev->device.numa_node = SOCKET_ID_ANY;
 		dev->device.name = devargs->name;
 
 		TAILQ_INSERT_TAIL(&vdev_device_list, dev, next);
+
+		rte_spinlock_unlock(&vdev_device_list_lock);
 	}
 
 	return 0;
@@ -368,6 +393,10 @@ vdev_probe(void)
 
 	/* call the init function for each virtual device */
 	TAILQ_FOREACH(dev, &vdev_device_list, next) {
+		/* we don't use the vdev lock here, as it's only used in DPDK
+		 * initialization; and we don't want to hold such a lock when
+		 * we call each driver probe.
+		 */
 
 		if (dev->device.driver)
 			continue;
@@ -388,14 +417,18 @@ vdev_find_device(const struct rte_device *start, rte_dev_cmp_t cmp,
 {
 	struct rte_vdev_device *dev;
 
+	rte_spinlock_lock(&vdev_device_list_lock);
 	TAILQ_FOREACH(dev, &vdev_device_list, next) {
 		if (start && &dev->device == start) {
 			start = NULL;
 			continue;
 		}
-		if (cmp(&dev->device, data) == 0)
+		if (cmp(&dev->device, data) == 0) {
+			rte_spinlock_unlock(&vdev_device_list_lock);
 			return &dev->device;
+		}
 	}
+	rte_spinlock_unlock(&vdev_device_list_lock);
 	return NULL;
 }
 
-- 
2.7.4


* [PATCH v2 3/5] bus/vdev: bus scan by multi-process channel
  2018-04-05 17:44 [PATCH v2 0/5] allow procinfo and pdump on eth vdev Jianfeng Tan
  2018-04-05 17:44 ` [PATCH v2 1/5] eal: bring forward multi-process channel init Jianfeng Tan
  2018-04-05 17:45 ` [PATCH v2 2/5] bus/vdev: add lock on vdev device list Jianfeng Tan
@ 2018-04-05 17:45 ` Jianfeng Tan
  2018-04-05 17:45 ` [PATCH v2 4/5] drivers/net: not use private eth dev data Jianfeng Tan
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 9+ messages in thread
From: Jianfeng Tan @ 2018-04-05 17:45 UTC
  To: dev; +Cc: thomas, Jianfeng Tan

To scan the vdevs of the primary process, a secondary process sends a
request to the primary to obtain the names of its vdevs.

Only the names are shared by the primary. In probe(), the device
driver is expected to locate (or request) the remaining details from
the primary.

Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
---
 drivers/bus/vdev/Makefile |   1 +
 drivers/bus/vdev/vdev.c   | 134 ++++++++++++++++++++++++++++++++++++++++++----
 2 files changed, 125 insertions(+), 10 deletions(-)
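
For illustration only (not part of this patch): a standalone sketch of
the message flow described above, with no DPDK dependency. The primary
answers a scan request with one "scan one" message per device name,
followed by a final reply that carries the count. A plain callback stands
in for the multi-process channel, and every demo_* name is hypothetical.
The sketch skips devices with an empty name, on the assumption that this
is what the "not sent" log message in the patch intends.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define DEMO_NAME_MAX 64
#define DEMO_MAX_DEVS 8

enum demo_msg_type {
	DEMO_SCAN_REQ = 1,	/* secondary -> primary: list your vdevs */
	DEMO_SCAN_ONE = 2,	/* primary -> secondary: one device name */
	DEMO_SCAN_REP = 3	/* primary -> secondary: final count */
};

struct demo_msg {
	int type;
	int num;
	char name[DEMO_NAME_MAX];
};

/* Assumption: a callback replaces the real message channel. */
typedef void (*demo_send_fn)(const struct demo_msg *msg, void *ctx);

/* Primary side: emit one message per named device, then the reply. */
static int
demo_primary_handle_scan(const char *const names[], int n_names,
			 demo_send_fn send, void *ctx)
{
	struct demo_msg out;
	int i, num = 0;

	assert(send != NULL);

	for (i = 0; i < n_names; i++) {
		if (names[i] == NULL || names[i][0] == '\0')
			continue; /* unnamed devices are not sent */
		memset(&out, 0, sizeof(out));
		out.type = DEMO_SCAN_ONE;
		out.num = 1;
		snprintf(out.name, sizeof(out.name), "%s", names[i]);
		send(&out, ctx);
		num++;
	}

	memset(&out, 0, sizeof(out));
	out.type = DEMO_SCAN_REP;
	out.num = num;
	send(&out, ctx);
	return num;
}

/* Secondary side: remember each received name and the reported count. */
struct demo_secondary {
	char names[DEMO_MAX_DEVS][DEMO_NAME_MAX];
	int n_names;
	int reported;
};

static void
demo_secondary_receive(const struct demo_msg *msg, void *ctx)
{
	struct demo_secondary *sec = ctx;

	switch (msg->type) {
	case DEMO_SCAN_ONE:
		if (sec->n_names < DEMO_MAX_DEVS) {
			snprintf(sec->names[sec->n_names], DEMO_NAME_MAX,
				 "%s", msg->name);
			sec->n_names++;
		}
		break;
	case DEMO_SCAN_REP:
		sec->reported = msg->num;
		break;
	default:
		break; /* unknown message types are ignored */
	}
}
```

Passing demo_secondary_receive as the send callback wires the two sides
together in one process: after the call, the secondary holds the device
names only, matching the "only the name is shared" rule above.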

diff --git a/drivers/bus/vdev/Makefile b/drivers/bus/vdev/Makefile
index 24d424a..bd0bb89 100644
--- a/drivers/bus/vdev/Makefile
+++ b/drivers/bus/vdev/Makefile
@@ -10,6 +10,7 @@ LIB = librte_bus_vdev.a
 
 CFLAGS += -O3
 CFLAGS += $(WERROR_FLAGS)
+CFLAGS += -DALLOW_EXPERIMENTAL_API
 
 # versioning export map
 EXPORT_MAP := rte_bus_vdev_version.map
diff --git a/drivers/bus/vdev/vdev.c b/drivers/bus/vdev/vdev.c
index 1d1c642..dd40218 100644
--- a/drivers/bus/vdev/vdev.c
+++ b/drivers/bus/vdev/vdev.c
@@ -224,8 +224,8 @@ alloc_devargs(const char *name, const char *args)
 	return devargs;
 }
 
-int
-rte_vdev_init(const char *name, const char *args)
+static int
+insert_vdev(const char *name, const char *args, struct rte_vdev_device **p_dev)
 {
 	struct rte_vdev_device *dev;
 	struct rte_devargs *devargs;
@@ -257,6 +257,33 @@ rte_vdev_init(const char *name, const char *args)
 	TAILQ_INSERT_TAIL(&vdev_device_list, dev, next);
 	rte_spinlock_unlock(&vdev_device_list_lock);
 
+	TAILQ_INSERT_TAIL(&devargs_list, devargs, next);
+
+	if (p_dev)
+		*p_dev = dev;
+
+	return 0;
+
+fail:
+	free(devargs->args);
+	free(devargs);
+	free(dev);
+	return ret;
+}
+
+int
+rte_vdev_init(const char *name, const char *args)
+{
+	struct rte_vdev_device *dev;
+	struct rte_devargs *devargs;
+	int ret;
+
+	ret = insert_vdev(name, args, &dev);
+	if (ret < 0)
+		return ret;
+
+	devargs = dev->device.devargs;
+
 	ret = vdev_probe_all_drivers(dev);
 	if (ret) {
 		if (ret > 0)
@@ -265,17 +292,14 @@ rte_vdev_init(const char *name, const char *args)
 		rte_spinlock_lock(&vdev_device_list_lock);
 		TAILQ_REMOVE(&vdev_device_list, dev, next);
 		rte_spinlock_unlock(&vdev_device_list_lock);
-		goto fail;
-	}
 
-	TAILQ_INSERT_TAIL(&devargs_list, devargs, next);
+		TAILQ_REMOVE(&devargs_list, devargs, next);
 
-	return 0;
+		free(devargs->args);
+		free(devargs);
+		free(dev);
+	}
 
-fail:
-	free(devargs->args);
-	free(devargs);
-	free(dev);
 	return ret;
 }
 
@@ -333,6 +357,68 @@ rte_vdev_uninit(const char *name)
 	return 0;
 }
 
+struct vdev_param {
+#define VDEV_SCAN_REQ	1
+#define VDEV_SCAN_ONE	2
+#define VDEV_SCAN_REP	3
+	int type;
+	int num;
+	char name[RTE_DEV_NAME_MAX_LEN];
+};
+
+static int vdev_plug(struct rte_device *dev);
+
+static int
+vdev_action(const struct rte_mp_msg *mp_msg, const void *peer)
+{
+	struct rte_vdev_device *dev;
+	struct rte_mp_msg mp_resp;
+	struct vdev_param *ou = (struct vdev_param *)&mp_resp.param;
+	const struct vdev_param *in = (const struct vdev_param *)mp_msg->param;
+	const char *devname;
+	int num;
+
+	strcpy(mp_resp.name, "vdev");
+	mp_resp.len_param = sizeof(*ou);
+	mp_resp.num_fds = 0;
+
+	switch (in->type) {
+	case VDEV_SCAN_REQ:
+		ou->type = VDEV_SCAN_ONE;
+		ou->num = 1;
+		num = 0;
+
+		rte_spinlock_lock(&vdev_device_list_lock);
+		TAILQ_FOREACH(dev, &vdev_device_list, next) {
+			devname = rte_vdev_device_name(dev);
+			if (strlen(devname) == 0)
+				VDEV_LOG(INFO, "vdev with no name is not sent");
+			VDEV_LOG(INFO, "send vdev, %s", devname);
+			strncpy(ou->name, devname, RTE_DEV_NAME_MAX_LEN);
+			if (rte_mp_sendmsg(&mp_resp) < 0)
+				VDEV_LOG(ERR, "send vdev, %s, failed, %s",
+					 devname, strerror(rte_errno));
+			num++;
+		}
+		rte_spinlock_unlock(&vdev_device_list_lock);
+
+		ou->type = VDEV_SCAN_REP;
+		ou->num = num;
+		if (rte_mp_reply(&mp_resp, peer) < 0)
+			VDEV_LOG(ERR, "Failed to reply a scan request");
+		break;
+	case VDEV_SCAN_ONE:
+		VDEV_LOG(INFO, "receive vdev, %s", in->name);
+		if (insert_vdev(in->name, NULL, NULL) < 0)
+			VDEV_LOG(ERR, "failed to add vdev, %s", in->name);
+		break;
+	default:
+		VDEV_LOG(ERR, "vdev cannot recognize this message");
+	}
+
+	return 0;
+}
+
 static int
 vdev_scan(void)
 {
@@ -340,6 +426,34 @@ vdev_scan(void)
 	struct rte_devargs *devargs;
 	struct vdev_custom_scan *custom_scan;
 
+	if (rte_mp_action_register("vdev", vdev_action) < 0 &&
+	    rte_errno != EEXIST) {
+		VDEV_LOG(ERR, "vdev fails to add action");
+		return -1;
+	}
+
+	if (rte_eal_process_type() == RTE_PROC_SECONDARY) {
+		struct rte_mp_msg mp_req, *mp_rep;
+		struct rte_mp_reply mp_reply;
+		struct timespec ts = {.tv_sec = 5, .tv_nsec = 0};
+		struct vdev_param *req = (struct vdev_param *)mp_req.param;
+		struct vdev_param *resp;
+
+		strcpy(mp_req.name, "vdev");
+		mp_req.len_param = sizeof(*req);
+		mp_req.num_fds = 0;
+		req->type = VDEV_SCAN_REQ;
+		if (rte_mp_request_sync(&mp_req, &mp_reply, &ts) == 0 &&
+		    mp_reply.nb_received == 1) {
+			mp_rep = &mp_reply.msgs[0];
+			resp = (struct vdev_param *)mp_rep->param;
+			VDEV_LOG(INFO, "Received %d vdevs", resp->num);
+		} else
+			VDEV_LOG(ERR, "Failed to request vdev from primary");
+
+		/* Fall through to allow private vdevs in secondary process */
+	}
+
 	/* call custom scan callbacks if any */
 	rte_spinlock_lock(&vdev_custom_scan_lock);
 	TAILQ_FOREACH(custom_scan, &vdev_custom_scans, next) {
-- 
2.7.4


* [PATCH v2 4/5] drivers/net: not use private eth dev data
  2018-04-05 17:44 [PATCH v2 0/5] allow procinfo and pdump on eth vdev Jianfeng Tan
                   ` (2 preceding siblings ...)
  2018-04-05 17:45 ` [PATCH v2 3/5] bus/vdev: bus scan by multi-process channel Jianfeng Tan
@ 2018-04-05 17:45 ` Jianfeng Tan
  2018-04-05 17:45 ` [PATCH v2 5/5] drivers/net: share vdev data to secondary process Jianfeng Tan
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 9+ messages in thread
From: Jianfeng Tan @ 2018-04-05 17:45 UTC
  To: dev
  Cc: thomas, Jianfeng Tan, John W . Linville, Ferruh Yigit,
	Tetsuya Mukawa, Santosh Shukla, Jerin Jacob, Pascal Mazon,
	Maxime Coquelin, Bruce Richardson, Rahul Lakkireddy

We introduced a private rte_eth_dev_data to allow a vdev to be created
in both the primary process and secondary process(es). This does not
fit the multi-process model; for example, it leads to port id
contention if two processes both find that the same data entry is free.

Also, to get the stats of a primary vdev in a secondary, the data must be
allocated from the pre-defined array so that the secondary can find it.

Cc: John W. Linville <linville@tuxdriver.com>
Cc: Ferruh Yigit <ferruh.yigit@intel.com>
Cc: Tetsuya Mukawa <mtetsuyah@gmail.com>
Cc: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Cc: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Cc: Pascal Mazon <pascal.mazon@6wind.com>
Cc: Maxime Coquelin <maxime.coquelin@redhat.com>
Cc: Bruce Richardson <bruce.richardson@intel.com>
Cc: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>

Suggested-by: Bruce Richardson <bruce.richardson@intel.com>
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
---
 drivers/net/af_packet/rte_eth_af_packet.c | 26 +++++++-------------------
 drivers/net/cxgbe/cxgbe_main.c            |  1 -
 drivers/net/kni/rte_eth_kni.c             | 14 ++------------
 drivers/net/null/rte_eth_null.c           | 19 ++++---------------
 drivers/net/octeontx/octeontx_ethdev.c    | 15 ++-------------
 drivers/net/pcap/rte_eth_pcap.c           | 19 +++----------------
 drivers/net/ring/rte_eth_ring.c           | 17 +----------------
 drivers/net/tap/rte_eth_tap.c             | 11 +----------
 drivers/net/vhost/rte_eth_vhost.c         | 19 ++-----------------
 9 files changed, 22 insertions(+), 119 deletions(-)
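
For illustration only (not part of this patch): a standalone sketch of
why allocating per-port data from a single pre-defined table helps. A
static array stands in for the table that would live in memory shared by
all processes; a slot index doubles as the port id, and any process that
can see the table can find a port by name. Every demo_* name is
hypothetical, and the locking that real cross-process allocation would
need is left out on purpose.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define DEMO_MAX_PORTS 4
#define DEMO_NAME_LEN 64

struct demo_port_data {
	char name[DEMO_NAME_LEN];	/* empty string: slot is free */
	uint16_t port_id;
	uint64_t rx_packets;
};

/* Assumption: stands in for a table placed in shared memory. */
static struct demo_port_data demo_shared_ports[DEMO_MAX_PORTS];

/* Find a port by name; works for any process that maps the table. */
static struct demo_port_data *
demo_port_lookup(const char *name)
{
	uint16_t i;

	assert(name != NULL);
	if (name[0] == '\0')
		return NULL;
	for (i = 0; i < DEMO_MAX_PORTS; i++) {
		if (strcmp(demo_shared_ports[i].name, name) == 0)
			return &demo_shared_ports[i];
	}
	return NULL;
}

/*
 * Claim the first free slot; the slot index becomes the port id.
 * NOTE: callers in different processes would have to serialize this
 * with a lock in shared memory, which this sketch omits.
 */
static struct demo_port_data *
demo_port_allocate(const char *name)
{
	uint16_t i;

	assert(name != NULL);
	if (name[0] == '\0' || strlen(name) >= DEMO_NAME_LEN)
		return NULL;
	if (demo_port_lookup(name) != NULL)
		return NULL; /* name already in use */

	for (i = 0; i < DEMO_MAX_PORTS; i++) {
		struct demo_port_data *data = &demo_shared_ports[i];

		if (data->name[0] != '\0')
			continue;
		memset(data, 0, sizeof(*data));
		snprintf(data->name, sizeof(data->name), "%s", name);
		data->port_id = i;
		return data;
	}
	return NULL; /* table is full */
}
```

Because every port comes from the same table, two allocations can never
return the same port id, and counters written through the allocated entry
are visible to a later lookup by name.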

diff --git a/drivers/net/af_packet/rte_eth_af_packet.c b/drivers/net/af_packet/rte_eth_af_packet.c
index 57eccfd..110e8a5 100644
--- a/drivers/net/af_packet/rte_eth_af_packet.c
+++ b/drivers/net/af_packet/rte_eth_af_packet.c
@@ -564,25 +564,17 @@ rte_pmd_init_internals(struct rte_vdev_device *dev,
 		RTE_LOG(ERR, PMD,
 			"%s: no interface specified for AF_PACKET ethdev\n",
 		        name);
-		goto error_early;
+		return -1;
 	}
 
 	RTE_LOG(INFO, PMD,
 		"%s: creating AF_PACKET-backed ethdev on numa socket %u\n",
 		name, numa_node);
 
-	/*
-	 * now do all data allocation - for eth_dev structure, dummy pci driver
-	 * and internal (private) data
-	 */
-	data = rte_zmalloc_socket(name, sizeof(*data), 0, numa_node);
-	if (data == NULL)
-		goto error_early;
-
 	*internals = rte_zmalloc_socket(name, sizeof(**internals),
 	                                0, numa_node);
 	if (*internals == NULL)
-		goto error_early;
+		return -1;
 
 	for (q = 0; q < nb_queues; q++) {
 		(*internals)->rx_queue[q].map = MAP_FAILED;
@@ -604,24 +596,24 @@ rte_pmd_init_internals(struct rte_vdev_device *dev,
 		RTE_LOG(ERR, PMD,
 			"%s: I/F name too long (%s)\n",
 			name, pair->value);
-		goto error_early;
+		return -1;
 	}
 	if (ioctl(sockfd, SIOCGIFINDEX, &ifr) == -1) {
 		RTE_LOG(ERR, PMD,
 			"%s: ioctl failed (SIOCGIFINDEX)\n",
 		        name);
-		goto error_early;
+		return -1;
 	}
 	(*internals)->if_name = strdup(pair->value);
 	if ((*internals)->if_name == NULL)
-		goto error_early;
+		return -1;
 	(*internals)->if_index = ifr.ifr_ifindex;
 
 	if (ioctl(sockfd, SIOCGIFHWADDR, &ifr) == -1) {
 		RTE_LOG(ERR, PMD,
 			"%s: ioctl failed (SIOCGIFHWADDR)\n",
 		        name);
-		goto error_early;
+		return -1;
 	}
 	memcpy(&(*internals)->eth_addr, ifr.ifr_hwaddr.sa_data, ETH_ALEN);
 
@@ -775,14 +767,13 @@ rte_pmd_init_internals(struct rte_vdev_device *dev,
 
 	(*internals)->nb_queues = nb_queues;
 
-	rte_memcpy(data, (*eth_dev)->data, sizeof(*data));
+	data = (*eth_dev)->data;
 	data->dev_private = *internals;
 	data->nb_rx_queues = (uint16_t)nb_queues;
 	data->nb_tx_queues = (uint16_t)nb_queues;
 	data->dev_link = pmd_link;
 	data->mac_addrs = &(*internals)->eth_addr;
 
-	(*eth_dev)->data = data;
 	(*eth_dev)->dev_ops = &ops;
 
 	return 0;
@@ -802,8 +793,6 @@ rte_pmd_init_internals(struct rte_vdev_device *dev,
 	}
 	free((*internals)->if_name);
 	rte_free(*internals);
-error_early:
-	rte_free(data);
 	return -1;
 }
 
@@ -985,7 +974,6 @@ rte_pmd_af_packet_remove(struct rte_vdev_device *dev)
 	free(internals->if_name);
 
 	rte_free(eth_dev->data->dev_private);
-	rte_free(eth_dev->data);
 
 	rte_eth_dev_release_port(eth_dev);
 
diff --git a/drivers/net/cxgbe/cxgbe_main.c b/drivers/net/cxgbe/cxgbe_main.c
index 01a80ac..0a8167c 100644
--- a/drivers/net/cxgbe/cxgbe_main.c
+++ b/drivers/net/cxgbe/cxgbe_main.c
@@ -29,7 +29,6 @@
 #include <rte_ether.h>
 #include <rte_ethdev_driver.h>
 #include <rte_ethdev_pci.h>
-#include <rte_malloc.h>
 #include <rte_random.h>
 #include <rte_dev.h>
 
diff --git a/drivers/net/kni/rte_eth_kni.c b/drivers/net/kni/rte_eth_kni.c
index dc4e65f..6d1a29b 100644
--- a/drivers/net/kni/rte_eth_kni.c
+++ b/drivers/net/kni/rte_eth_kni.c
@@ -337,25 +337,17 @@ eth_kni_create(struct rte_vdev_device *vdev,
 	struct pmd_internals *internals;
 	struct rte_eth_dev_data *data;
 	struct rte_eth_dev *eth_dev;
-	const char *name;
 
 	RTE_LOG(INFO, PMD, "Creating kni ethdev on numa socket %u\n",
 			numa_node);
 
-	name = rte_vdev_device_name(vdev);
-	data = rte_zmalloc_socket(name, sizeof(*data), 0, numa_node);
-	if (data == NULL)
-		return NULL;
-
 	/* reserve an ethdev entry */
 	eth_dev = rte_eth_vdev_allocate(vdev, sizeof(*internals));
-	if (eth_dev == NULL) {
-		rte_free(data);
+	if (!eth_dev)
 		return NULL;
-	}
 
 	internals = eth_dev->data->dev_private;
-	rte_memcpy(data, eth_dev->data, sizeof(*data));
+	data = eth_dev->data;
 	data->nb_rx_queues = 1;
 	data->nb_tx_queues = 1;
 	data->dev_link = pmd_link;
@@ -363,7 +355,6 @@ eth_kni_create(struct rte_vdev_device *vdev,
 
 	eth_random_addr(internals->eth_addr.addr_bytes);
 
-	eth_dev->data = data;
 	eth_dev->dev_ops = &eth_kni_ops;
 
 	internals->no_request_thread = args->no_request_thread;
@@ -459,7 +450,6 @@ eth_kni_remove(struct rte_vdev_device *vdev)
 	rte_kni_release(internals->kni);
 
 	rte_free(internals);
-	rte_free(eth_dev->data);
 
 	rte_eth_dev_release_port(eth_dev);
 
diff --git a/drivers/net/null/rte_eth_null.c b/drivers/net/null/rte_eth_null.c
index 73fe8b0..0c7beb8 100644
--- a/drivers/net/null/rte_eth_null.c
+++ b/drivers/net/null/rte_eth_null.c
@@ -494,7 +494,7 @@ eth_dev_null_create(struct rte_vdev_device *dev,
 {
 	const unsigned nb_rx_queues = 1;
 	const unsigned nb_tx_queues = 1;
-	struct rte_eth_dev_data *data = NULL;
+	struct rte_eth_dev_data *data;
 	struct pmd_internals *internals = NULL;
 	struct rte_eth_dev *eth_dev = NULL;
 
@@ -511,19 +511,10 @@ eth_dev_null_create(struct rte_vdev_device *dev,
 	RTE_LOG(INFO, PMD, "Creating null ethdev on numa socket %u\n",
 		dev->device.numa_node);
 
-	/* now do all data allocation - for eth_dev structure, dummy pci driver
-	 * and internal (private) data
-	 */
-	data = rte_zmalloc_socket(rte_vdev_device_name(dev), sizeof(*data), 0,
-		dev->device.numa_node);
-	if (!data)
-		return -ENOMEM;
-
 	eth_dev = rte_eth_vdev_allocate(dev, sizeof(*internals));
-	if (!eth_dev) {
-		rte_free(data);
+	if (!eth_dev)
 		return -ENOMEM;
-	}
+
 	/* now put it all together
 	 * - store queue data in internals,
 	 * - store numa_node info in ethdev data
@@ -544,13 +535,12 @@ eth_dev_null_create(struct rte_vdev_device *dev,
 
 	rte_memcpy(internals->rss_key, default_rss_key, 40);
 
-	rte_memcpy(data, eth_dev->data, sizeof(*data));
+	data = eth_dev->data;
 	data->nb_rx_queues = (uint16_t)nb_rx_queues;
 	data->nb_tx_queues = (uint16_t)nb_tx_queues;
 	data->dev_link = pmd_link;
 	data->mac_addrs = &internals->eth_addr;
 
-	eth_dev->data = data;
 	eth_dev->dev_ops = &ops;
 
 	/* finally assign rx and tx ops */
@@ -668,7 +658,6 @@ rte_pmd_null_remove(struct rte_vdev_device *dev)
 		return -1;
 
 	rte_free(eth_dev->data->dev_private);
-	rte_free(eth_dev->data);
 
 	rte_eth_dev_release_port(eth_dev);
 
diff --git a/drivers/net/octeontx/octeontx_ethdev.c b/drivers/net/octeontx/octeontx_ethdev.c
index 90dd249..0df1735 100644
--- a/drivers/net/octeontx/octeontx_ethdev.c
+++ b/drivers/net/octeontx/octeontx_ethdev.c
@@ -1025,7 +1025,7 @@ octeontx_create(struct rte_vdev_device *dev, int port, uint8_t evdev,
 	char octtx_name[OCTEONTX_MAX_NAME_LEN];
 	struct octeontx_nic *nic = NULL;
 	struct rte_eth_dev *eth_dev = NULL;
-	struct rte_eth_dev_data *data = NULL;
+	struct rte_eth_dev_data *data;
 	const char *name = rte_vdev_device_name(dev);
 
 	PMD_INIT_FUNC_TRACE();
@@ -1041,13 +1041,6 @@ octeontx_create(struct rte_vdev_device *dev, int port, uint8_t evdev,
 		return 0;
 	}
 
-	data = rte_zmalloc_socket(octtx_name, sizeof(*data), 0, socket_id);
-	if (data == NULL) {
-		octeontx_log_err("failed to allocate devdata");
-		res = -ENOMEM;
-		goto err;
-	}
-
 	nic = rte_zmalloc_socket(octtx_name, sizeof(*nic), 0, socket_id);
 	if (nic == NULL) {
 		octeontx_log_err("failed to allocate nic structure");
@@ -1083,11 +1076,9 @@ octeontx_create(struct rte_vdev_device *dev, int port, uint8_t evdev,
 	eth_dev->data->kdrv = RTE_KDRV_NONE;
 	eth_dev->data->numa_node = dev->device.numa_node;
 
-	rte_memcpy(data, (eth_dev)->data, sizeof(*data));
+	data = eth_dev->data;
 	data->dev_private = nic;
-
 	data->port_id = eth_dev->data->port_id;
-	snprintf(data->name, sizeof(data->name), "%s", eth_dev->data->name);
 
 	nic->ev_queues = 1;
 	nic->ev_ports = 1;
@@ -1106,7 +1097,6 @@ octeontx_create(struct rte_vdev_device *dev, int port, uint8_t evdev,
 		goto err;
 	}
 
-	eth_dev->data = data;
 	eth_dev->dev_ops = &octeontx_dev_ops;
 
 	/* Finally save ethdev pointer to the NIC structure */
@@ -1174,7 +1164,6 @@ octeontx_remove(struct rte_vdev_device *dev)
 
 		rte_free(eth_dev->data->mac_addrs);
 		rte_free(eth_dev->data->dev_private);
-		rte_free(eth_dev->data);
 		rte_eth_dev_release_port(eth_dev);
 		rte_event_dev_close(nic->evdev);
 	}
diff --git a/drivers/net/pcap/rte_eth_pcap.c b/drivers/net/pcap/rte_eth_pcap.c
index c1571e1..8740d52 100644
--- a/drivers/net/pcap/rte_eth_pcap.c
+++ b/drivers/net/pcap/rte_eth_pcap.c
@@ -773,27 +773,16 @@ pmd_init_internals(struct rte_vdev_device *vdev,
 		struct pmd_internals **internals,
 		struct rte_eth_dev **eth_dev)
 {
-	struct rte_eth_dev_data *data = NULL;
+	struct rte_eth_dev_data *data;
 	unsigned int numa_node = vdev->device.numa_node;
-	const char *name;
 
-	name = rte_vdev_device_name(vdev);
 	RTE_LOG(INFO, PMD, "Creating pcap-backed ethdev on numa socket %d\n",
 		numa_node);
 
-	/* now do all data allocation - for eth_dev structure
-	 * and internal (private) data
-	 */
-	data = rte_zmalloc_socket(name, sizeof(*data), 0, numa_node);
-	if (data == NULL)
-		return -1;
-
 	/* reserve an ethdev entry */
 	*eth_dev = rte_eth_vdev_allocate(vdev, sizeof(**internals));
-	if (*eth_dev == NULL) {
-		rte_free(data);
+	if (!(*eth_dev))
 		return -1;
-	}
 
 	/* now put it all together
 	 * - store queue data in internals,
@@ -802,7 +791,7 @@ pmd_init_internals(struct rte_vdev_device *vdev,
 	 * - and point eth_dev structure to new eth_dev_data structure
 	 */
 	*internals = (*eth_dev)->data->dev_private;
-	rte_memcpy(data, (*eth_dev)->data, sizeof(*data));
+	data = (*eth_dev)->data;
 	data->nb_rx_queues = (uint16_t)nb_rx_queues;
 	data->nb_tx_queues = (uint16_t)nb_tx_queues;
 	data->dev_link = pmd_link;
@@ -812,7 +801,6 @@ pmd_init_internals(struct rte_vdev_device *vdev,
 	 * NOTE: we'll replace the data element, of originally allocated
 	 * eth_dev so the rings are local per-process
 	 */
-	(*eth_dev)->data = data;
 	(*eth_dev)->dev_ops = &ops;
 
 	return 0;
@@ -1020,7 +1008,6 @@ pmd_pcap_remove(struct rte_vdev_device *dev)
 		return -1;
 
 	rte_free(eth_dev->data->dev_private);
-	rte_free(eth_dev->data);
 
 	rte_eth_dev_release_port(eth_dev);
 
diff --git a/drivers/net/ring/rte_eth_ring.c b/drivers/net/ring/rte_eth_ring.c
index df13c44..e53823a 100644
--- a/drivers/net/ring/rte_eth_ring.c
+++ b/drivers/net/ring/rte_eth_ring.c
@@ -259,15 +259,6 @@ do_eth_dev_ring_create(const char *name,
 	RTE_LOG(INFO, PMD, "Creating rings-backed ethdev on numa socket %u\n",
 			numa_node);
 
-	/* now do all data allocation - for eth_dev structure, dummy pci driver
-	 * and internal (private) data
-	 */
-	data = rte_zmalloc_socket(name, sizeof(*data), 0, numa_node);
-	if (data == NULL) {
-		rte_errno = ENOMEM;
-		goto error;
-	}
-
 	rx_queues_local = rte_zmalloc_socket(name,
 			sizeof(void *) * nb_rx_queues, 0, numa_node);
 	if (rx_queues_local == NULL) {
@@ -301,10 +292,8 @@ do_eth_dev_ring_create(const char *name,
 	 * - point eth_dev_data to internals
 	 * - and point eth_dev structure to new eth_dev_data structure
 	 */
-	/* NOTE: we'll replace the data element, of originally allocated eth_dev
-	 * so the rings are local per-process */
 
-	rte_memcpy(data, eth_dev->data, sizeof(*data));
+	data = eth_dev->data;
 	data->rx_queues = rx_queues_local;
 	data->tx_queues = tx_queues_local;
 
@@ -326,7 +315,6 @@ do_eth_dev_ring_create(const char *name,
 	data->dev_link = pmd_link;
 	data->mac_addrs = &internals->address;
 
-	eth_dev->data = data;
 	eth_dev->dev_ops = &ops;
 	data->kdrv = RTE_KDRV_NONE;
 	data->numa_node = numa_node;
@@ -342,7 +330,6 @@ do_eth_dev_ring_create(const char *name,
 error:
 	rte_free(rx_queues_local);
 	rte_free(tx_queues_local);
-	rte_free(data);
 	rte_free(internals);
 
 	return -1;
@@ -675,8 +662,6 @@ rte_pmd_ring_remove(struct rte_vdev_device *dev)
 	rte_free(eth_dev->data->tx_queues);
 	rte_free(eth_dev->data->dev_private);
 
-	rte_free(eth_dev->data);
-
 	rte_eth_dev_release_port(eth_dev);
 	return 0;
 }
diff --git a/drivers/net/tap/rte_eth_tap.c b/drivers/net/tap/rte_eth_tap.c
index ed6d738..7ad117d 100644
--- a/drivers/net/tap/rte_eth_tap.c
+++ b/drivers/net/tap/rte_eth_tap.c
@@ -1353,12 +1353,6 @@ eth_dev_tap_create(struct rte_vdev_device *vdev, char *tap_name,
 
 	RTE_LOG(DEBUG, PMD, "  TAP device on numa %u\n", rte_socket_id());
 
-	data = rte_zmalloc_socket(tap_name, sizeof(*data), 0, numa_node);
-	if (!data) {
-		RTE_LOG(ERR, PMD, "TAP Failed to allocate data\n");
-		goto error_exit_nodev;
-	}
-
 	dev = rte_eth_vdev_allocate(vdev, sizeof(*pmd));
 	if (!dev) {
 		RTE_LOG(ERR, PMD, "TAP Unable to allocate device struct\n");
@@ -1378,7 +1372,7 @@ eth_dev_tap_create(struct rte_vdev_device *vdev, char *tap_name,
 	}
 
 	/* Setup some default values */
-	rte_memcpy(data, dev->data, sizeof(*data));
+	data = dev->data;
 	data->dev_private = pmd;
 	data->dev_flags = RTE_ETH_DEV_INTR_LSC;
 	data->numa_node = numa_node;
@@ -1389,7 +1383,6 @@ eth_dev_tap_create(struct rte_vdev_device *vdev, char *tap_name,
 	data->nb_rx_queues = 0;
 	data->nb_tx_queues = 0;
 
-	dev->data = data;
 	dev->dev_ops = &ops;
 	dev->rx_pkt_burst = pmd_rx_burst;
 	dev->tx_pkt_burst = pmd_tx_burst;
@@ -1535,7 +1528,6 @@ eth_dev_tap_create(struct rte_vdev_device *vdev, char *tap_name,
 	RTE_LOG(ERR, PMD, "TAP Unable to initialize %s\n",
 		rte_vdev_device_name(vdev));
 
-	rte_free(data);
 	return -EINVAL;
 }
 
@@ -1730,7 +1722,6 @@ rte_pmd_tap_remove(struct rte_vdev_device *dev)
 
 	close(internals->ioctl_sock);
 	rte_free(eth_dev->data->dev_private);
-	rte_free(eth_dev->data);
 
 	rte_eth_dev_release_port(eth_dev);
 
diff --git a/drivers/net/vhost/rte_eth_vhost.c b/drivers/net/vhost/rte_eth_vhost.c
index 11b6076..2aebe36 100644
--- a/drivers/net/vhost/rte_eth_vhost.c
+++ b/drivers/net/vhost/rte_eth_vhost.c
@@ -1039,7 +1039,7 @@ eth_dev_vhost_create(struct rte_vdev_device *dev, char *iface_name,
 	int16_t queues, const unsigned int numa_node, uint64_t flags)
 {
 	const char *name = rte_vdev_device_name(dev);
-	struct rte_eth_dev_data *data = NULL;
+	struct rte_eth_dev_data *data;
 	struct pmd_internal *internal = NULL;
 	struct rte_eth_dev *eth_dev = NULL;
 	struct ether_addr *eth_addr = NULL;
@@ -1049,13 +1049,6 @@ eth_dev_vhost_create(struct rte_vdev_device *dev, char *iface_name,
 	RTE_LOG(INFO, PMD, "Creating VHOST-USER backend on numa socket %u\n",
 		numa_node);
 
-	/* now do all data allocation - for eth_dev structure and internal
-	 * (private) data
-	 */
-	data = rte_zmalloc_socket(name, sizeof(*data), 0, numa_node);
-	if (data == NULL)
-		goto error;
-
 	list = rte_zmalloc_socket(name, sizeof(*list), 0, numa_node);
 	if (list == NULL)
 		goto error;
@@ -1097,12 +1090,7 @@ eth_dev_vhost_create(struct rte_vdev_device *dev, char *iface_name,
 	rte_spinlock_init(&vring_state->lock);
 	vring_states[eth_dev->data->port_id] = vring_state;
 
-	/* We'll replace the 'data' originally allocated by eth_dev. So the
-	 * vhost PMD resources won't be shared between multi processes.
-	 */
-	rte_memcpy(data, eth_dev->data, sizeof(*data));
-	eth_dev->data = data;
-
+	data = eth_dev->data;
 	data->nb_rx_queues = queues;
 	data->nb_tx_queues = queues;
 	internal->max_queues = queues;
@@ -1143,7 +1131,6 @@ eth_dev_vhost_create(struct rte_vdev_device *dev, char *iface_name,
 		rte_eth_dev_release_port(eth_dev);
 	rte_free(internal);
 	rte_free(list);
-	rte_free(data);
 
 	return -1;
 }
@@ -1274,8 +1261,6 @@ rte_pmd_vhost_remove(struct rte_vdev_device *dev)
 	rte_free(vring_states[eth_dev->data->port_id]);
 	vring_states[eth_dev->data->port_id] = NULL;
 
-	rte_free(eth_dev->data);
-
 	rte_eth_dev_release_port(eth_dev);
 
 	return 0;
-- 
2.7.4

* [PATCH v2 5/5] drivers/net: share vdev data to secondary process
  2018-04-05 17:44 [PATCH v2 0/5] allow procinfo and pdump on eth vdev Jianfeng Tan
                   ` (3 preceding siblings ...)
  2018-04-05 17:45 ` [PATCH v2 4/5] drivers/net: not use private eth dev data Jianfeng Tan
@ 2018-04-05 17:45 ` Jianfeng Tan
  2018-04-12 23:30 ` [PATCH v2 0/5] allow procinfo and pdump on eth vdev Thomas Monjalon
  2018-04-17  2:24 ` Zhang, Qi Z
  6 siblings, 0 replies; 9+ messages in thread
From: Jianfeng Tan @ 2018-04-05 17:45 UTC (permalink / raw)
  To: dev; +Cc: thomas, Jianfeng Tan

dpdk-procinfo, running as a secondary process, cannot fetch stats for a vdev.

This patch enables that by attaching the port from the shared data.
We also fill in the eth dev ops, although only some ops work in a
secondary process, for example stats_get().

Note that we still cannot Rx/Tx packets on ports which do not
support multi-process.

Reported-by: Vipin Varghese <vipin.varghese@intel.com>
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
---
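A minimal, self-contained sketch of the secondary-process probe path that
this patch repeats in each PMD. It does not use DPDK itself: every type and
function below (eth_dev, eth_dev_data, attach_secondary(), demo_probe(),
demo_ops) is a hypothetical stand-in, with attach_secondary() only modelling
the role that rte_eth_dev_attach_secondary() plays in the diff. The return
value 1 for the "full probe" path is also an assumption made for
illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the DPDK types; not the real definitions. */
enum proc_type { PROC_PRIMARY, PROC_SECONDARY };

struct eth_dev;

struct eth_dev_ops {
	unsigned long (*stats_get)(const struct eth_dev *dev);
};

/* Models the per-port data that lives in memory shared by all processes. */
struct eth_dev_data {
	char name[32];
	unsigned long rx_packets;
};

/* Models the per-process device structure; function pointers stay local. */
struct eth_dev {
	struct eth_dev_data *data;
	const struct eth_dev_ops *dev_ops;
};

#define MAX_PORTS 4
static struct eth_dev_data shared_data[MAX_PORTS]; /* "shared memory" */
static struct eth_dev local_devs[MAX_PORTS];       /* per-process array */

/* An op that only reads shared data, so it is usable from a secondary. */
static unsigned long demo_stats_get(const struct eth_dev *dev)
{
	return dev->data->rx_packets;
}

static const struct eth_dev_ops demo_ops = { .stats_get = demo_stats_get };

/* Find the port by name in the shared data and bind the local device to it. */
static struct eth_dev *attach_secondary(const char *name)
{
	size_t i;

	for (i = 0; i < MAX_PORTS; i++) {
		if (shared_data[i].name[0] != '\0' &&
		    strcmp(shared_data[i].name, name) == 0) {
			local_devs[i].data = &shared_data[i];
			return &local_devs[i];
		}
	}
	return NULL;
}

/*
 * Returns 0 when a secondary attached to an existing port, -1 when the
 * attach failed, and 1 when the caller should run the full probe (elided).
 */
static int demo_probe(enum proc_type type, const char *name, const char *args)
{
	struct eth_dev *dev;

	if (type == PROC_SECONDARY && strlen(args) == 0) {
		dev = attach_secondary(name);
		if (dev == NULL)
			return -1;
		/* Rx/Tx burst functions are deliberately left unset here. */
		dev->dev_ops = &demo_ops;
		return 0;
	}
	return 1;
}
```

In this model, a secondary that probes an existing name with empty args
gets a local device whose data pointer aliases the primary's entry, so
stats_get() reads the primary's counters. Non-empty args fall through to
the normal probe path, as in the diff below.
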
 drivers/net/af_packet/rte_eth_af_packet.c | 17 +++++++++++++++--
 drivers/net/bonding/rte_eth_bond_pmd.c    | 13 +++++++++++++
 drivers/net/failsafe/failsafe.c           | 14 ++++++++++++++
 drivers/net/kni/rte_eth_kni.c             | 12 ++++++++++++
 drivers/net/null/rte_eth_null.c           | 13 +++++++++++++
 drivers/net/octeontx/octeontx_ethdev.c    | 14 ++++++++++++++
 drivers/net/pcap/rte_eth_pcap.c           | 13 +++++++++++++
 drivers/net/softnic/rte_eth_softnic.c     | 19 ++++++++++++++++---
 drivers/net/tap/rte_eth_tap.c             | 13 +++++++++++++
 drivers/net/vhost/rte_eth_vhost.c         | 17 +++++++++++++++--
 10 files changed, 138 insertions(+), 7 deletions(-)

diff --git a/drivers/net/af_packet/rte_eth_af_packet.c b/drivers/net/af_packet/rte_eth_af_packet.c
index 110e8a5..b394d3c 100644
--- a/drivers/net/af_packet/rte_eth_af_packet.c
+++ b/drivers/net/af_packet/rte_eth_af_packet.c
@@ -915,9 +915,22 @@ rte_pmd_af_packet_probe(struct rte_vdev_device *dev)
 	int ret = 0;
 	struct rte_kvargs *kvlist;
 	int sockfd = -1;
+	struct rte_eth_dev *eth_dev;
+	const char *name = rte_vdev_device_name(dev);
+
+	RTE_LOG(INFO, PMD, "Initializing pmd_af_packet for %s\n", name);
 
-	RTE_LOG(INFO, PMD, "Initializing pmd_af_packet for %s\n",
-		rte_vdev_device_name(dev));
+	if (rte_eal_process_type() == RTE_PROC_SECONDARY &&
+	    strlen(rte_vdev_device_args(dev)) == 0) {
+		eth_dev = rte_eth_dev_attach_secondary(name);
+		if (!eth_dev) {
+			RTE_LOG(ERR, PMD, "Failed to probe %s\n", name);
+			return -1;
+		}
+		/* TODO: request info from primary to set up Rx and Tx */
+		eth_dev->dev_ops = &ops;
+		return 0;
+	}
 
 	kvlist = rte_kvargs_parse(rte_vdev_device_args(dev), valid_arguments);
 	if (kvlist == NULL) {
diff --git a/drivers/net/bonding/rte_eth_bond_pmd.c b/drivers/net/bonding/rte_eth_bond_pmd.c
index 9b02850..d820ff6 100644
--- a/drivers/net/bonding/rte_eth_bond_pmd.c
+++ b/drivers/net/bonding/rte_eth_bond_pmd.c
@@ -3003,6 +3003,7 @@ bond_probe(struct rte_vdev_device *dev)
 	uint8_t bonding_mode, socket_id/*, agg_mode*/;
 	int  arg_count, port_id;
 	uint8_t agg_mode;
+	struct rte_eth_dev *eth_dev;
 
 	if (!dev)
 		return -EINVAL;
@@ -3010,6 +3011,18 @@ bond_probe(struct rte_vdev_device *dev)
 	name = rte_vdev_device_name(dev);
 	RTE_LOG(INFO, EAL, "Initializing pmd_bond for %s\n", name);
 
+	if (rte_eal_process_type() == RTE_PROC_SECONDARY &&
+	    strlen(rte_vdev_device_args(dev)) == 0) {
+		eth_dev = rte_eth_dev_attach_secondary(name);
+		if (!eth_dev) {
+			RTE_LOG(ERR, PMD, "Failed to probe %s\n", name);
+			return -1;
+		}
+		/* TODO: request info from primary to set up Rx and Tx */
+		eth_dev->dev_ops = &default_dev_ops;
+		return 0;
+	}
+
 	kvlist = rte_kvargs_parse(rte_vdev_device_args(dev),
 		pmd_bond_init_valid_arguments);
 	if (kvlist == NULL)
diff --git a/drivers/net/failsafe/failsafe.c b/drivers/net/failsafe/failsafe.c
index c499bfb..ea9fdc6 100644
--- a/drivers/net/failsafe/failsafe.c
+++ b/drivers/net/failsafe/failsafe.c
@@ -294,10 +294,24 @@ static int
 rte_pmd_failsafe_probe(struct rte_vdev_device *vdev)
 {
 	const char *name;
+	struct rte_eth_dev *eth_dev;
 
 	name = rte_vdev_device_name(vdev);
 	INFO("Initializing " FAILSAFE_DRIVER_NAME " for %s",
 			name);
+
+	if (rte_eal_process_type() == RTE_PROC_SECONDARY &&
+	    strlen(rte_vdev_device_args(vdev)) == 0) {
+		eth_dev = rte_eth_dev_attach_secondary(name);
+		if (!eth_dev) {
+			RTE_LOG(ERR, PMD, "Failed to probe %s\n", name);
+			return -1;
+		}
+		/* TODO: request info from primary to set up Rx and Tx */
+		eth_dev->dev_ops = &failsafe_ops;
+		return 0;
+	}
+
 	return fs_eth_dev_create(vdev);
 }
 
diff --git a/drivers/net/kni/rte_eth_kni.c b/drivers/net/kni/rte_eth_kni.c
index 6d1a29b..69e4920 100644
--- a/drivers/net/kni/rte_eth_kni.c
+++ b/drivers/net/kni/rte_eth_kni.c
@@ -405,6 +405,18 @@ eth_kni_probe(struct rte_vdev_device *vdev)
 	params = rte_vdev_device_args(vdev);
 	RTE_LOG(INFO, PMD, "Initializing eth_kni for %s\n", name);
 
+	if (rte_eal_process_type() == RTE_PROC_SECONDARY &&
+	    strlen(params) == 0) {
+		eth_dev = rte_eth_dev_attach_secondary(name);
+		if (!eth_dev) {
+			RTE_LOG(ERR, PMD, "Failed to probe %s\n", name);
+			return -1;
+		}
+		/* TODO: request info from primary to set up Rx and Tx */
+		eth_dev->dev_ops = &eth_kni_ops;
+		return 0;
+	}
+
 	ret = eth_kni_kvargs_process(&args, params);
 	if (ret < 0)
 		return ret;
diff --git a/drivers/net/null/rte_eth_null.c b/drivers/net/null/rte_eth_null.c
index 0c7beb8..a2278d6 100644
--- a/drivers/net/null/rte_eth_null.c
+++ b/drivers/net/null/rte_eth_null.c
@@ -596,6 +596,7 @@ rte_pmd_null_probe(struct rte_vdev_device *dev)
 	unsigned packet_size = default_packet_size;
 	unsigned packet_copy = default_packet_copy;
 	struct rte_kvargs *kvlist = NULL;
+	struct rte_eth_dev *eth_dev;
 	int ret;
 
 	if (!dev)
@@ -605,6 +606,18 @@ rte_pmd_null_probe(struct rte_vdev_device *dev)
 	params = rte_vdev_device_args(dev);
 	RTE_LOG(INFO, PMD, "Initializing pmd_null for %s\n", name);
 
+	if (rte_eal_process_type() == RTE_PROC_SECONDARY &&
+	    strlen(params) == 0) {
+		eth_dev = rte_eth_dev_attach_secondary(name);
+		if (!eth_dev) {
+			RTE_LOG(ERR, PMD, "Failed to probe %s\n", name);
+			return -1;
+		}
+		/* TODO: request info from primary to set up Rx and Tx */
+		eth_dev->dev_ops = &ops;
+		return 0;
+	}
+
 	if (params != NULL) {
 		kvlist = rte_kvargs_parse(params, valid_arguments);
 		if (kvlist == NULL)
diff --git a/drivers/net/octeontx/octeontx_ethdev.c b/drivers/net/octeontx/octeontx_ethdev.c
index 0df1735..0a32026 100644
--- a/drivers/net/octeontx/octeontx_ethdev.c
+++ b/drivers/net/octeontx/octeontx_ethdev.c
@@ -1185,12 +1185,26 @@ octeontx_probe(struct rte_vdev_device *dev)
 	struct rte_event_dev_config dev_conf;
 	const char *eventdev_name = "event_octeontx";
 	struct rte_event_dev_info info;
+	struct rte_eth_dev *eth_dev;
 
 	struct octeontx_vdev_init_params init_params = {
 		OCTEONTX_VDEV_DEFAULT_MAX_NR_PORT
 	};
 
 	dev_name = rte_vdev_device_name(dev);
+
+	if (rte_eal_process_type() == RTE_PROC_SECONDARY &&
+	    strlen(rte_vdev_device_args(dev)) == 0) {
+		eth_dev = rte_eth_dev_attach_secondary(dev_name);
+		if (!eth_dev) {
+			RTE_LOG(ERR, PMD, "Failed to probe %s\n", dev_name);
+			return -1;
+		}
+		/* TODO: request info from primary to set up Rx and Tx */
+		eth_dev->dev_ops = &octeontx_dev_ops;
+		return 0;
+	}
+
 	res = octeontx_parse_vdev_init_params(&init_params, dev);
 	if (res < 0)
 		return -EINVAL;
diff --git a/drivers/net/pcap/rte_eth_pcap.c b/drivers/net/pcap/rte_eth_pcap.c
index 8740d52..570c9e9 100644
--- a/drivers/net/pcap/rte_eth_pcap.c
+++ b/drivers/net/pcap/rte_eth_pcap.c
@@ -898,6 +898,7 @@ pmd_pcap_probe(struct rte_vdev_device *dev)
 	struct rte_kvargs *kvlist;
 	struct pmd_devargs pcaps = {0};
 	struct pmd_devargs dumpers = {0};
+	struct rte_eth_dev *eth_dev;
 	int single_iface = 0;
 	int ret;
 
@@ -908,6 +909,18 @@ pmd_pcap_probe(struct rte_vdev_device *dev)
 	start_cycles = rte_get_timer_cycles();
 	hz = rte_get_timer_hz();
 
+	if (rte_eal_process_type() == RTE_PROC_SECONDARY &&
+	    strlen(rte_vdev_device_args(dev)) == 0) {
+		eth_dev = rte_eth_dev_attach_secondary(name);
+		if (!eth_dev) {
+			RTE_LOG(ERR, PMD, "Failed to probe %s\n", name);
+			return -1;
+		}
+		/* TODO: request info from primary to set up Rx and Tx */
+		eth_dev->dev_ops = &ops;
+		return 0;
+	}
+
 	kvlist = rte_kvargs_parse(rte_vdev_device_args(dev), valid_arguments);
 	if (kvlist == NULL)
 		return -1;
diff --git a/drivers/net/softnic/rte_eth_softnic.c b/drivers/net/softnic/rte_eth_softnic.c
index b0c1341..e324394 100644
--- a/drivers/net/softnic/rte_eth_softnic.c
+++ b/drivers/net/softnic/rte_eth_softnic.c
@@ -725,13 +725,26 @@ pmd_probe(struct rte_vdev_device *vdev)
 	uint16_t hard_port_id;
 	int numa_node;
 	void *dev_private;
+	struct rte_eth_dev *eth_dev;
+	const char *name = rte_vdev_device_name(vdev);
 
-	RTE_LOG(INFO, PMD,
-		"Probing device \"%s\"\n",
-		rte_vdev_device_name(vdev));
+	RTE_LOG(INFO, PMD, "Probing device \"%s\"\n", name);
 
 	/* Parse input arguments */
 	params = rte_vdev_device_args(vdev);
+
+	if (rte_eal_process_type() == RTE_PROC_SECONDARY &&
+	    strlen(params) == 0) {
+		eth_dev = rte_eth_dev_attach_secondary(name);
+		if (!eth_dev) {
+			RTE_LOG(ERR, PMD, "Failed to probe %s\n", name);
+			return -1;
+		}
+		/* TODO: request info from primary to set up Rx and Tx */
+		eth_dev->dev_ops = &pmd_ops;
+		return 0;
+	}
+
 	if (!params)
 		return -EINVAL;
 
diff --git a/drivers/net/tap/rte_eth_tap.c b/drivers/net/tap/rte_eth_tap.c
index 7ad117d..e2da324 100644
--- a/drivers/net/tap/rte_eth_tap.c
+++ b/drivers/net/tap/rte_eth_tap.c
@@ -1626,10 +1626,23 @@ rte_pmd_tap_probe(struct rte_vdev_device *dev)
 	char tap_name[RTE_ETH_NAME_MAX_LEN];
 	char remote_iface[RTE_ETH_NAME_MAX_LEN];
 	struct ether_addr user_mac = { .addr_bytes = {0} };
+	struct rte_eth_dev *eth_dev;
 
 	name = rte_vdev_device_name(dev);
 	params = rte_vdev_device_args(dev);
 
+	if (rte_eal_process_type() == RTE_PROC_SECONDARY &&
+	    strlen(params) == 0) {
+		eth_dev = rte_eth_dev_attach_secondary(name);
+		if (!eth_dev) {
+			RTE_LOG(ERR, PMD, "Failed to probe %s\n", name);
+			return -1;
+		}
+		/* TODO: request info from primary to set up Rx and Tx */
+		eth_dev->dev_ops = &ops;
+		return 0;
+	}
+
 	speed = ETH_SPEED_NUM_10G;
 	snprintf(tap_name, sizeof(tap_name), "%s%d",
 		 DEFAULT_TAP_NAME, tap_unit++);
diff --git a/drivers/net/vhost/rte_eth_vhost.c b/drivers/net/vhost/rte_eth_vhost.c
index 2aebe36..1267047 100644
--- a/drivers/net/vhost/rte_eth_vhost.c
+++ b/drivers/net/vhost/rte_eth_vhost.c
@@ -1174,9 +1174,22 @@ rte_pmd_vhost_probe(struct rte_vdev_device *dev)
 	int client_mode = 0;
 	int dequeue_zero_copy = 0;
 	int iommu_support = 0;
+	struct rte_eth_dev *eth_dev;
+	const char *name = rte_vdev_device_name(dev);
+
+	RTE_LOG(INFO, PMD, "Initializing pmd_vhost for %s\n", name);
 
-	RTE_LOG(INFO, PMD, "Initializing pmd_vhost for %s\n",
-		rte_vdev_device_name(dev));
+	if (rte_eal_process_type() == RTE_PROC_SECONDARY &&
+	    strlen(rte_vdev_device_args(dev)) == 0) {
+		eth_dev = rte_eth_dev_attach_secondary(name);
+		if (!eth_dev) {
+			RTE_LOG(ERR, PMD, "Failed to probe %s\n", name);
+			return -1;
+		}
+		/* TODO: request info from primary to set up Rx and Tx */
+		eth_dev->dev_ops = &ops;
+		return 0;
+	}
 
 	kvlist = rte_kvargs_parse(rte_vdev_device_args(dev), valid_arguments);
 	if (kvlist == NULL)
-- 
2.7.4

* Re: [PATCH v2 0/5] allow procinfo and pdump on eth vdev
  2018-04-05 17:44 [PATCH v2 0/5] allow procinfo and pdump on eth vdev Jianfeng Tan
                   ` (4 preceding siblings ...)
  2018-04-05 17:45 ` [PATCH v2 5/5] drivers/net: share vdev data to secondary process Jianfeng Tan
@ 2018-04-12 23:30 ` Thomas Monjalon
  2018-04-13 14:39   ` Tan, Jianfeng
  2018-04-17  2:24 ` Zhang, Qi Z
  6 siblings, 1 reply; 9+ messages in thread
From: Thomas Monjalon @ 2018-04-12 23:30 UTC (permalink / raw)
  To: Jianfeng Tan; +Cc: dev

Hi Jianfeng,

05/04/2018 19:44, Jianfeng Tan:
> As we know, we have below limitations in vdev:
>   - dpdk-procinfo cannot get the stats of (most) vdev in primary process;
>   - dpdk-pdump cannot dump the packets for (most) vdev in primary process;
>   - secondary process cannot use (most) vdev in primary process.
> 
> The very first reason is that the secondary process actually does not know
> the existence of those vdevs as vdevs are chained on a linked list, and
> not shareable to secondary.
> 
> In this patch series, we would like to propose a vdev sharing model like this:
>   - As a secondary process boots, all devices (including vdev) in primary
>     will be automatically shared. After both primary and secondary process
>     booted,
>   - Device add/remove in primary will be translated to device hot plug/unplug
>     event in secondary processes. (TODO)
>   - Device add in secondary
>     * If that kind of device support multi-process, the secondary will
>       request the primary to probe the device and the primary to share
>       it to the secondary. It's not necessary to have secondary-private
>       device in this case. (TODO)
>     * If that kind of device does not support multi-process, the secondary
>       will probe the device by itself, and the port id is shared among
>       all primary/secondary processes.

Are you OK to consider this series for DPDK 18.08?

* Re: [PATCH v2 0/5] allow procinfo and pdump on eth vdev
  2018-04-12 23:30 ` [PATCH v2 0/5] allow procinfo and pdump on eth vdev Thomas Monjalon
@ 2018-04-13 14:39   ` Tan, Jianfeng
  0 siblings, 0 replies; 9+ messages in thread
From: Tan, Jianfeng @ 2018-04-13 14:39 UTC (permalink / raw)
  To: Thomas Monjalon; +Cc: dev

Hi Thomas,


On 4/13/2018 7:30 AM, Thomas Monjalon wrote:
> Hi Jianfeng,
>
> 05/04/2018 19:44, Jianfeng Tan:
>> As we know, we have below limitations in vdev:
>>    - dpdk-procinfo cannot get the stats of (most) vdev in primary process;
>>    - dpdk-pdump cannot dump the packets for (most) vdev in primary process;
>>    - secondary process cannot use (most) vdev in primary process.
>>
>> The very first reason is that the secondary process actually does not know
>> the existence of those vdevs as vdevs are chained on a linked list, and
>> not shareable to secondary.
>>
>> In this patch series, we would like to propose a vdev sharing model like this:
>>    - As a secondary process boots, all devices (including vdev) in primary
>>      will be automatically shared. After both primary and secondary process
>>      booted,
>>    - Device add/remove in primary will be translated to device hot plug/unplug
>>      event in secondary processes. (TODO)
>>    - Device add in secondary
>>      * If that kind of device support multi-process, the secondary will
>>        request the primary to probe the device and the primary to share
>>        it to the secondary. It's not necessary to have secondary-private
>>        device in this case. (TODO)
>>      * If that kind of device does not support multi-process, the secondary
>>        will probe the device by itself, and the port id is shared among
>>        all primary/secondary processes.
> Are you OK to consider this series for DPDK 18.08?
>

As you may know, we have been working on this functionality since
v17.11. To make it work, we split it into several parts:
(1) move vdev into drivers/bus, merged in v17.11;
(2) DPDK IPC, merged in v18.02;
(3) the secondary support for vdev (without datapath).

As it's an important feature that our users have asked for several times,
I would suggest we target this release.

In the meantime, I will ask for the community's help to review and test.

Thanks,
Jianfeng

* Re: [PATCH v2 0/5] allow procinfo and pdump on eth vdev
  2018-04-05 17:44 [PATCH v2 0/5] allow procinfo and pdump on eth vdev Jianfeng Tan
                   ` (5 preceding siblings ...)
  2018-04-12 23:30 ` [PATCH v2 0/5] allow procinfo and pdump on eth vdev Thomas Monjalon
@ 2018-04-17  2:24 ` Zhang, Qi Z
  6 siblings, 0 replies; 9+ messages in thread
From: Zhang, Qi Z @ 2018-04-17  2:24 UTC (permalink / raw)
  To: Tan, Jianfeng, dev; +Cc: thomas, Tan, Jianfeng



> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Jianfeng Tan
> Sent: Friday, April 6, 2018 1:45 AM
> To: dev@dpdk.org
> Cc: thomas@monjalon.net; Tan, Jianfeng <jianfeng.tan@intel.com>
> Subject: [dpdk-dev] [PATCH v2 0/5] allow procinfo and pdump on eth vdev
> 
> v2:
>   - Add spinlock for vdev device list as suggested by Anatoly.
>   - Add ring, cxgbe and remove the free in each PMDs as suggested by
> Matan.
>   - Rebase on master.
> 
> As we know, we have below limitations in vdev:
>   - dpdk-procinfo cannot get the stats of (most) vdev in primary process;
>   - dpdk-pdump cannot dump the packets for (most) vdev in primary process;
>   - secondary process cannot use (most) vdev in primary process.
> 
> The very first reason is that the secondary process actually does not know the
> existence of those vdevs as vdevs are chained on a linked list, and not
> shareable to secondary.
> 
> In this patch series, we would like to propose a vdev sharing model like this:
>   - As a secondary process boots, all devices (including vdev) in primary
>     will be automatically shared. After both primary and secondary process
>     booted,
>   - Device add/remove in primary will be translated to device hot plug/unplug
>     event in secondary processes. (TODO)
>   - Device add in secondary
>     * If that kind of device support multi-process, the secondary will
>       request the primary to probe the device and the primary to share
>       it to the secondary. It's not necessary to have secondary-private
>       device in this case. (TODO)
>     * If that kind of device does not support multi-process, the secondary
>       will probe the device by itself, and the port id is shared among
>       all primary/secondary processes.
> 
> This patch series don't:
>   - provide secondary data path (Rx/Tx) support for each specific vdev.
> 
> How to test:
> 
> Step 0: start testpmd with a vhost port; and a VM connected to the vhost
> port.
> 
> Step 1: try using dpdk-procinfo to get the stats.
>  $(dpdk-procinfo) --log-level=8 --no-pci -- --stats
> 
> Step 2: try using dpdk-pdump to dump the packets.
>  $(dpdk-pdump) -- --pdump 'port=0,queue=*,rx-dev=/tmp/rx.pcap'
> 
> Jianfeng Tan (5):
>   eal: bring forward multi-process channel init
>   bus/vdev: add lock on vdev device list
>   bus/vdev: bus scan by multi-process channel
>   drivers/net: not use private eth dev data
>   drivers/net: share vdev data to secondary process
> 
>  drivers/bus/vdev/Makefile                 |   1 +
>  drivers/bus/vdev/vdev.c                   | 187 ++++++++++++++++++++++++++----
>  drivers/net/af_packet/rte_eth_af_packet.c |  43 +++----
>  drivers/net/bonding/rte_eth_bond_pmd.c    |  13 +++
>  drivers/net/cxgbe/cxgbe_main.c            |   1 -
>  drivers/net/failsafe/failsafe.c           |  14 +++
>  drivers/net/kni/rte_eth_kni.c             |  26 +++--
>  drivers/net/null/rte_eth_null.c           |  32 ++---
>  drivers/net/octeontx/octeontx_ethdev.c    |  29 ++---
>  drivers/net/pcap/rte_eth_pcap.c           |  32 ++---
>  drivers/net/ring/rte_eth_ring.c           |  17 +--
>  drivers/net/softnic/rte_eth_softnic.c     |  19 ++-
>  drivers/net/tap/rte_eth_tap.c             |  24 ++--
>  drivers/net/vhost/rte_eth_vhost.c         |  36 +++---
>  lib/librte_eal/bsdapp/eal/eal.c           |  23 ++--
>  lib/librte_eal/linuxapp/eal/eal.c         |  23 ++--
>  16 files changed, 354 insertions(+), 166 deletions(-)
> 
> --
> 2.7.4

Reviewed-by: Qi Zhang <qi.z.zhang@intel.com>

Also tested with af_packet and dpdk-procinfo / dpdk-pdump; the patches work as expected.

Regards
Qi

Thread overview: 9+ messages
2018-04-05 17:44 [PATCH v2 0/5] allow procinfo and pdump on eth vdev Jianfeng Tan
2018-04-05 17:44 ` [PATCH v2 1/5] eal: bring forward multi-process channel init Jianfeng Tan
2018-04-05 17:45 ` [PATCH v2 2/5] bus/vdev: add lock on vdev device list Jianfeng Tan
2018-04-05 17:45 ` [PATCH v2 3/5] bus/vdev: bus scan by multi-process channel Jianfeng Tan
2018-04-05 17:45 ` [PATCH v2 4/5] drivers/net: not use private eth dev data Jianfeng Tan
2018-04-05 17:45 ` [PATCH v2 5/5] drivers/net: share vdev data to secondary process Jianfeng Tan
2018-04-12 23:30 ` [PATCH v2 0/5] allow procinfo and pdump on eth vdev Thomas Monjalon
2018-04-13 14:39   ` Tan, Jianfeng
2018-04-17  2:24 ` Zhang, Qi Z
